12474 crashes repeatedly

Moderators: Site Moderators, FAHC Science Team

bikeaddict
Posts: 215
Joined: Sun May 03, 2020 1:20 am

12474 crashes repeatedly

Post by bikeaddict »

On two machines, project 12474 crashes repeatedly with "FahCore returned: INTERRUPTED (102 = 0x66)". Short excerpts from the logs of both machines are below.


03:42:42:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:12474 run:24 clone:5 gen:104 core:0xa8 unit:0x680000000500000018000000ba300000
03:42:42:WU01:FS00:Starting
03:42:42:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 7133 -checkpoint 15 -np 31
03:42:42:WU01:FS00:Started FahCore on PID 997539
03:42:42:WU01:FS00:Core PID:997543
03:42:42:WU01:FS00:FahCore 0xa8 started
03:42:43:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
03:42:43:WU01:FS00:Starting
03:42:43:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 7133 -checkpoint 15 -np 31
03:42:43:WU01:FS00:Started FahCore on PID 997561
03:42:43:WU01:FS00:Core PID:997565
03:42:43:WU01:FS00:FahCore 0xa8 started
03:42:44:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
03:42:52:WU00:FS02:0x24:Completed 750000 out of 5000000 steps (15%)
03:42:52:WU00:FS02:0x24:Checkpoint completed at step 750000
03:43:43:WU01:FS00:Starting
03:43:43:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 7133 -checkpoint 15 -np 31
03:43:43:WU01:FS00:Started FahCore on PID 997582
03:43:43:WU01:FS00:Core PID:997586
03:43:43:WU01:FS00:FahCore 0xa8 started
03:43:44:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
03:44:43:WU01:FS00:Starting
03:44:43:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 7133 -checkpoint 15 -np 31
03:44:43:WU01:FS00:Started FahCore on PID 997604
03:44:43:WU01:FS00:Core PID:997608
03:44:43:WU01:FS00:FahCore 0xa8 started
03:44:44:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
03:44:59:WU00:FS02:0x24:Completed 800000 out of 5000000 steps (16%)


07:06:26:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:12474 run:12 clone:6 gen:80 core:0xa8 unit:0x50000000060000000c000000ba300000
07:06:26:WU02:FS01:Starting
07:06:26:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 02 -suffix 01 -version 706 -lifeline 1811 -checkpoint 15 -np 31
07:06:26:WU02:FS01:Started FahCore on PID 1366389
07:06:26:WU02:FS01:Core PID:1366393
07:06:26:WU02:FS01:FahCore 0xa8 started
07:06:26:WU02:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
07:06:27:WU02:FS01:Starting
07:06:27:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 02 -suffix 01 -version 706 -lifeline 1811 -checkpoint 15 -np 31
07:06:27:WU02:FS01:Started FahCore on PID 1366410
07:06:27:WU02:FS01:Core PID:1366414
07:06:27:WU02:FS01:FahCore 0xa8 started
07:06:27:WU02:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
07:06:52:WU01:FS00:0x23:Completed 87500 out of 1250000 steps (7%)
07:07:27:WU02:FS01:Starting
07:07:27:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 02 -suffix 01 -version 706 -lifeline 1811 -checkpoint 15 -np 31
07:07:27:WU02:FS01:Started FahCore on PID 1366432
07:07:27:WU02:FS01:Core PID:1366436
07:07:27:WU02:FS01:FahCore 0xa8 started
07:07:27:WU02:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
07:07:34:WU01:FS00:0x23:Completed 100000 out of 1250000 steps (8%)
07:07:35:WU01:FS00:0x23:Checkpoint completed at step 100000
07:08:17:WU01:FS00:0x23:Completed 112500 out of 1250000 steps (9%)
07:08:27:WU02:FS01:Starting
07:08:27:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 02 -suffix 01 -version 706 -lifeline 1811 -checkpoint 15 -np 31
07:08:27:WU02:FS01:Started FahCore on PID 1366454
07:08:27:WU02:FS01:Core PID:1366458
07:08:27:WU02:FS01:FahCore 0xa8 started
07:08:27:WU02:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
07:08:59:WU01:FS00:0x23:Completed 125000 out of 1250000 steps (10%)
07:08:59:WU01:FS00:0x23:Checkpoint completed at step 125000
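
The restart pattern in excerpts like these can be picked out mechanically. The following is a minimal sketch, assuming the v7-style line layout shown above (time, WU slot, folding slot, message); the function names and the threshold of three are illustrative choices, not part of the client.

```python
import re
from collections import defaultdict

# Matches v7-style client lines such as
# "03:42:43:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)"
RETURN_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):(?P<wu>WU\d+):(?P<fs>FS\d+):"
    r"FahCore returned: (?P<status>\w+) \((?P<code>\d+) = 0x[0-9a-fA-F]+\)"
)


def count_core_returns(log_text):
    """Count 'FahCore returned' statuses per (WU, slot) pair."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in log_text.splitlines():
        match = RETURN_RE.match(line.strip())
        if match:
            counts[(match["wu"], match["fs"])][match["status"]] += 1
    return {key: dict(value) for key, value in counts.items()}


def looping_units(log_text, threshold=3):
    """Return (WU, slot) pairs whose core returned INTERRUPTED at least `threshold` times."""
    return sorted(
        key
        for key, statuses in count_core_returns(log_text).items()
        if statuses.get("INTERRUPTED", 0) >= threshold
    )


if __name__ == "__main__":
    sample = """\
03:42:43:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
03:42:44:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
03:42:52:WU00:FS02:0x24:Completed 750000 out of 5000000 steps (15%)
03:43:44:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
"""
    print(looping_units(sample))
```

Progress lines from other slots (such as the WU00:FS02 line) are ignored, so only the slot stuck in the restart loop is reported.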
muziqaz
Posts: 1734
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX 550 640SP
Location: London

Re: 12474 crashes repeatedly

Post by muziqaz »

Permission issue?
Unstable machine?
Try deleting the /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.12/Core_a8.fah/FahCore_a8 directory, then restart fah-client and let it download the core again.
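
A cautious variant of this suggestion is to rename the cached core instead of deleting it, so it can be restored if the fresh download does not help. This is only a sketch: `set_aside` is a hypothetical helper, the ".bak-<timestamp>" suffix is an arbitrary choice, and it assumes the script runs with enough privileges to modify the cores directory while the client is stopped.

```python
import shutil
import time
from pathlib import Path


def set_aside(path):
    """Rename a file or directory to '<name>.bak-<timestamp>' and return the new path."""
    source = Path(path)
    if not source.exists():
        raise FileNotFoundError(source)
    backup = source.with_name(f"{source.name}.bak-{int(time.time())}")
    shutil.move(str(source), str(backup))
    return backup


if __name__ == "__main__":
    # Path taken from the log excerpts in this thread.
    core = (
        "/var/lib/fahclient/cores/cores.foldingathome.org/lin/"
        "64bit-avx2-256/a8-0.0.12/Core_a8.fah/FahCore_a8"
    )
    print(set_aside(core))
```

After the rename, restarting the client as described above should trigger a new download of the core.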
FAH Omega tester
bikeaddict
Posts: 215
Joined: Sun May 03, 2020 1:20 am

Re: 12474 crashes repeatedly

Post by bikeaddict »

Every other project on these machines has been fine for over a year.

It's also curious that one WU was assigned at almost the same time to another user, who completed it in 6.5 minutes for 1.89M points.

https://apps.foldingathome.org/wu#proje ... =5&gen=104
arisu
Posts: 488
Joined: Mon Feb 24, 2025 11:11 pm

Re: 12474 crashes repeatedly

Post by arisu »

bikeaddict wrote: Tue May 20, 2025 1:32 pm
Every other project on these machines has been fine for over a year.

It's also curious that one WU was assigned at almost the same time to another user, who completed it in 6.5 minutes for 1.89M points.

https://apps.foldingathome.org/wu#proje ... =5&gen=104

That has to be a server bug. That equates to over 421M PPD on a CPU, which is plainly impossible. The WU was re-sent to you after that user returned it, which means they must have failed it and the server improperly credited their failed/dumped WU as a success.
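
For reference, the PPD estimate is just the credit scaled to a 24-hour rate. The sketch below uses the rounded figures quoted above (1.89M points, 6.5 minutes), so it lands slightly under the number given in the post, which was presumably computed from the unrounded values; `points_per_day` is an illustrative name.

```python
def points_per_day(points, minutes):
    """Scale the credit for one WU to a 24-hour rate."""
    minutes_per_day = 24 * 60
    return points * minutes_per_day / minutes


if __name__ == "__main__":
    ppd = points_per_day(1.89e6, 6.5)
    print(f"{ppd / 1e6:.1f}M PPD")
```

Either way the result is on the order of 420M PPD, far beyond what a CPU slot could plausibly earn.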
bikeaddict
Posts: 215
Joined: Sun May 03, 2020 1:20 am

Re: 12474 crashes repeatedly

Post by bikeaddict »

Another WU had similarly odd behavior. The first client finished the task in two seconds for 65K points. The second client finished in eight minutes for 1.6M points. Both were supposedly successful, but the WU was still sent to me, and it went into the same crash loop.

https://apps.foldingathome.org/wu#proje ... e=2&gen=63
arisu
Posts: 488
Joined: Mon Feb 24, 2025 11:11 pm

Re: 12474 crashes repeatedly

Post by arisu »

So the crash loop that you're experiencing is only being caused by WUs that also have this odd credit behavior?
Nicolas_orleans
Posts: 131
Joined: Wed Aug 08, 2012 3:08 am

Re: 12474 crashes repeatedly

Post by Nicolas_orleans »

The vast.ai instance I am running also shows odd behavior that is unique to this one project: all 3 WUs it received failed at startup. It is running Ubuntu 24.04.


09:19:05:I1:WU323:Requesting WU assignment for user Nicolas_orleans team 33
09:19:06:I1:WU323:Received WU assignment gsN15JmmfSCaMjbRK9TxKKp1ni_rsyptPIJendXpJsU
09:19:06:I1:WU323:Downloading WU
09:19:08:I1:WU323:DOWNLOAD 3% 167.02KiB of 5.68MiB
09:19:09:I1:WU323:DOWNLOAD 38% 2.18MiB of 5.68MiB
09:19:10:I1:WU323:Received WU P12474 R8 C5 G79
09:19:10:I3:WU323:Started FahCore on PID 199222
09:19:11:E :WU323:Core was killed
09:19:11:E :WU323:Core returned FAILED_1 (0)
09:19:11:E :WU323:The folding core did not produce any log output. This indicates that the core is not functional on your system. Check for missing libraries or GPU drivers. Make a post about your issue on https://foldingforum.org/ to get more help.
09:19:11:E :WU323:Run did not produce any results. Dumping WU
09:19:11:I1:WU323:Sending dump report
09:19:12:I1:WU323:Dumped
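
The client's hint about missing libraries can be checked with `ldd`, which marks unresolved shared-library dependencies as "=> not found". The following is a rough sketch under that assumption; the function names are illustrative, and the path to the core binary on a v8 client install is not shown in this log, so it has to be supplied by the user.

```python
import subprocess


def parse_missing_libraries(ldd_output):
    """Return library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # ldd prints unresolved dependencies as "libname.so.N => not found"
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def missing_libraries(binary_path):
    """Run ldd on a binary and list unresolved shared libraries."""
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True, check=False
    )
    return parse_missing_libraries(result.stdout)
```

An empty list only rules out unresolved shared libraries; it says nothing about whether the WU data itself is valid.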
muziqaz
Posts: 1734
Joined: Sun Dec 16, 2007 6:22 pm
Hardware configuration: 9950x, 7950x3D, 5950x, 5800x3D
7900xtx, RX9070, Radeon 7, 5700xt, 6900xt, RX 550 640SP
Location: London

Re: 12474 crashes repeatedly

Post by muziqaz »

Reported to the researcher. This seems to be Linux-only for now. Something might have broken in WU generation on their server.
FAH Omega tester
arisu
Posts: 488
Joined: Mon Feb 24, 2025 11:11 pm

Re: 12474 crashes repeatedly

Post by arisu »

Can someone send me a copy of a wudata_01.dat for this so I can test it on my Linux system and try to diagnose the issue (if it's a Linux-only issue)?
toTOW
Site Moderator
Posts: 6442
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: 12474 crashes repeatedly

Post by toTOW »

I'm getting some bad WUs from this project too:
19:29:23:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:12474 run:1 clone:1 gen:58 core:0xa8 unit:0x3a0000000100000001000000ba300000
It is stuck in an endless loop:
19:48:27:WU01:FS00:Starting
19:48:27:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.12/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 1126 -checkpoint 15 -np 12
19:48:27:WU01:FS00:Started FahCore on PID 20871
19:48:27:WU01:FS00:Core PID:20875
19:48:27:WU01:FS00:FahCore 0xa8 started
19:48:28:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
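
When comparing failing WUs across reports, it can help to recover project, run, clone and gen from the unit field alone. In the three "Received Unit" lines posted in this thread, the unit value reads as four little-endian 32-bit integers in the order gen, clone, run, project. That layout is inferred only from those three lines, not from any documentation, so treat this sketch (and the name `decode_unit_id`) as an assumption.

```python
import struct


def decode_unit_id(unit_hex):
    """Split a 128-bit unit ID into four little-endian 32-bit fields.

    The field order (gen, clone, run, project) is inferred from the log
    lines in this thread and is an assumption, not a documented format.
    """
    digits = unit_hex[2:] if unit_hex.lower().startswith("0x") else unit_hex
    raw = bytes.fromhex(digits)
    if len(raw) != 16:
        raise ValueError("expected 32 hex digits")
    gen, clone, run, project = struct.unpack("<4I", raw)
    return {"project": project, "run": run, "clone": clone, "gen": gen}


if __name__ == "__main__":
    print(decode_unit_id("0x3a0000000100000001000000ba300000"))
```

For the unit value in the log line above, this yields project 12474, run 1, clone 1, gen 58, matching the fields the client printed.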

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.