FahCore returned: FAILED_3 (255 = 0xff)

Moderators: Site Moderators, FAHC Science Team

JimF
Posts: 652
Joined: Thu Jan 21, 2010 2:03 pm

FahCore returned: FAILED_3 (255 = 0xff)

Post by JimF »

EDIT0: This probably should be under some other topic. Move it if necessary.

I am getting a lot of work units with this error. I expect a reboot will fix it, but here is the log thus far:

Code: Select all

01:08:57:WU01:FS01:Starting
01:08:57:WU01:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:08:57:WU01:FS01:Started FahCore on PID 22439
01:08:57:WU01:FS01:FahCore 0x22 started
01:08:58:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:08:58:WU01:FS01:Starting
01:08:58:WU01:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:08:58:WU01:FS01:Started FahCore on PID 22441
01:08:58:WU01:FS01:FahCore 0x22 started
01:08:58:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:09:58:WU01:FS01:Starting
01:09:58:WU01:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:09:58:WU01:FS01:Started FahCore on PID 25112
01:09:58:WU01:FS01:FahCore 0x22 started
01:09:58:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:10:58:WU01:FS01:Starting
01:10:58:WU01:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:10:58:WU01:FS01:Started FahCore on PID 27785
01:10:58:WU01:FS01:FahCore 0x22 started
01:10:58:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:11:58:WU01:FS01:Starting
01:11:58:WU01:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:11:58:WU01:FS01:Started FahCore on PID 30457
01:11:58:WU01:FS01:FahCore 0x22 started
01:11:58:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:11:58:WARNING:WU01:FS01:Too many errors, failing
01:11:58:WU01:FS01:Sending unit results: id:01 state:SEND error:FAILED project:18103 run:0 clone:34 gen:194 core:0x22 unit:0x00000022000000c2000046b700000000
01:11:58:WU01:FS01:Connecting to 34.72.228.44:8080
01:11:58:WU01:FS01:Server responded WORK_ACK (400)
01:11:58:WU01:FS01:Cleaning up
01:11:58:WU00:FS01:Connecting to assign1.foldingathome.org:80
01:11:59:WU00:FS01:Assigned to work server 140.163.4.210
01:11:59:WU00:FS01:Requesting new work unit for slot 01: gpu:1:0 GP104 [GeForce GTX 1070] 6463 from 140.163.4.210
01:11:59:WU00:FS01:Connecting to 140.163.4.210:8080
01:11:59:WU00:FS01:Downloading 3.73MiB
01:12:00:WU00:FS01:Download complete
01:12:00:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:16469 run:0 clone:116 gen:140 core:0x22 unit:0x000000740000008c0000405500000000
01:12:00:WU00:FS01:Starting
01:12:00:WU00:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:12:00:WU00:FS01:Started FahCore on PID 30459
01:12:00:WU00:FS01:FahCore 0x22 started
01:12:00:WARNING:WU00:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:12:00:WU00:FS01:Starting
01:12:00:WU00:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:12:00:WU00:FS01:Started FahCore on PID 30577
01:12:00:WU00:FS01:FahCore 0x22 started
01:12:00:WARNING:WU00:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:13:00:WU00:FS01:Starting
01:13:00:WU00:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:13:00:WU00:FS01:Started FahCore on PID 33248
01:13:00:WU00:FS01:FahCore 0x22 started
01:13:01:WARNING:WU00:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:14:00:WU00:FS01:Starting
01:14:00:WU00:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:14:00:WU00:FS01:Started FahCore on PID 35804
01:14:00:WU00:FS01:FahCore 0x22 started
01:14:01:WARNING:WU00:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:15:00:WU00:FS01:Starting
01:15:00:WU00:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:15:00:WU00:FS01:Started FahCore on PID 38475
01:15:00:WU00:FS01:FahCore 0x22 started
01:15:01:WARNING:WU00:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:15:01:WARNING:WU00:FS01:Too many errors, failing
01:15:01:WU00:FS01:Sending unit results: id:00 state:SEND error:FAILED project:16469 run:0 clone:116 gen:140 core:0x22 unit:0x000000740000008c0000405500000000
01:15:01:WU00:FS01:Connecting to 140.163.4.210:8080
01:15:01:WU00:FS01:Server responded WORK_ACK (400)
01:15:01:WU00:FS01:Cleaning up
01:15:01:WU01:FS01:Connecting to assign1.foldingathome.org:80
01:15:01:WU01:FS01:Assigned to work server 207.53.233.146
01:15:01:WU01:FS01:Requesting new work unit for slot 01: gpu:1:0 GP104 [GeForce GTX 1070] 6463 from 207.53.233.146
01:15:01:WU01:FS01:Connecting to 207.53.233.146:8080
01:15:01:WU01:FS01:Downloading 2.47MiB
01:15:02:WU01:FS01:Download complete
01:15:02:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:17806 run:13 clone:52 gen:87 core:0x22 unit:0x00000034000000570000458e0000000d
01:15:02:WU01:FS01:Starting
01:15:02:WU01:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:15:02:WU01:FS01:Started FahCore on PID 38596
01:15:02:WU01:FS01:FahCore 0x22 started
01:15:02:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:15:02:WU01:FS01:Starting
01:15:02:WU01:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:15:02:WU01:FS01:Started FahCore on PID 38598
01:15:02:WU01:FS01:FahCore 0x22 started
01:15:02:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:16:02:WU01:FS01:Starting
01:16:02:WU01:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:16:02:WU01:FS01:Started FahCore on PID 41271
01:16:02:WU01:FS01:FahCore 0x22 started
01:16:03:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:17:02:WU01:FS01:Starting
01:17:02:WU01:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:17:02:WU01:FS01:Started FahCore on PID 43945
01:17:02:WU01:FS01:FahCore 0x22 started
01:17:03:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:18:02:WU01:FS01:Starting
01:18:02:WU01:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:18:02:WU01:FS01:Started FahCore on PID 46616
01:18:02:WU01:FS01:FahCore 0x22 started
01:18:03:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:18:03:WARNING:WU01:FS01:Too many errors, failing
01:18:03:WU01:FS01:Sending unit results: id:01 state:SEND error:FAILED project:17806 run:13 clone:52 gen:87 core:0x22 unit:0x00000034000000570000458e0000000d
01:18:03:WU01:FS01:Connecting to 207.53.233.146:8080
01:18:03:WU01:FS01:Server responded WORK_ACK (400)
01:18:03:WU01:FS01:Cleaning up
01:18:03:WU00:FS01:Connecting to assign1.foldingathome.org:80
01:18:03:WU00:FS01:Assigned to work server 128.252.203.11
01:18:03:WU00:FS01:Requesting new work unit for slot 01: gpu:1:0 GP104 [GeForce GTX 1070] 6463 from 128.252.203.11
01:18:03:WU00:FS01:Connecting to 128.252.203.11:8080
01:18:14:WU00:FS01:Downloading 26.54MiB
01:18:19:WU00:FS01:Download complete
01:18:19:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:18202 run:12465 clone:4 gen:22 core:0x22 unit:0x00000004000000160000471a000030b1
01:18:19:WU00:FS01:Starting
01:18:19:WU00:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:18:19:WU00:FS01:Started FahCore on PID 47314
01:18:19:WU00:FS01:FahCore 0x22 started
01:18:19:WARNING:WU00:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:18:19:WU00:FS01:Starting
01:18:19:WU00:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:18:19:WU00:FS01:Started FahCore on PID 47316
01:18:19:WU00:FS01:FahCore 0x22 started
01:18:20:WARNING:WU00:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:19:19:WU00:FS01:Starting
01:19:19:WU00:FS01:Running FahCore: /bin/FAHCoreWrapper /var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -dir 00 -suffix 01 -version 706 -lifeline 4146770 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
01:19:19:WU00:FS01:Started FahCore on PID 49988
01:19:19:WU00:FS01:FahCore 0x22 started
01:19:20:WARNING:WU00:FS01:FahCore returned: FAILED_3 (255 = 0xff)
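
A quick way to see whether the failures follow one work unit or the whole machine is to tally the log. The following is a minimal Python sketch, not part of the FAH client: `summarize`, `CORE_RETURN`, `FAILED_UNIT` and `SAMPLE` are names made up here, and the patterns assume the log line format shown above.

```python
import re
from collections import Counter

# Patterns assume the client log format shown in this post.
CORE_RETURN = re.compile(r"FahCore returned: (\w+) \((\d+) = 0x[0-9a-fA-F]+\)")
FAILED_UNIT = re.compile(r"Sending unit results:.*\berror:FAILED\b.*\bproject:(\d+)")


def summarize(log_text):
    """Count core return statuses and list projects returned as FAILED."""
    returns = Counter()
    failed_projects = []
    for line in log_text.splitlines():
        match = CORE_RETURN.search(line)
        if match:
            returns[match.group(1)] += 1
            continue
        match = FAILED_UNIT.search(line)
        if match and match.group(1) not in failed_projects:
            failed_projects.append(match.group(1))
    return returns, failed_projects


# A few lines copied from the log above, used only as sample input.
SAMPLE = """\
01:11:58:WARNING:WU01:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:11:58:WARNING:WU01:FS01:Too many errors, failing
01:11:58:WU01:FS01:Sending unit results: id:01 state:SEND error:FAILED project:18103 run:0 clone:34 gen:194 core:0x22 unit:0x00000022000000c2000046b700000000
01:15:01:WARNING:WU00:FS01:FahCore returned: FAILED_3 (255 = 0xff)
01:15:01:WARNING:WU00:FS01:Too many errors, failing
01:15:01:WU00:FS01:Sending unit results: id:00 state:SEND error:FAILED project:16469 run:0 clone:116 gen:140 core:0x22 unit:0x000000740000008c0000405500000000
"""

if __name__ == "__main__":
    returns, projects = summarize(SAMPLE)
    for status, count in returns.items():
        print(f"{status}: {count}")
    print("projects returned as FAILED:", ", ".join(projects))
```

If the same status shows up across several unrelated projects, as it does in the log above, the cause is more likely local to the machine than a single bad work unit, although the tally alone does not prove that.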

EDIT: A reboot DID NOT fix it.

Code: Select all

01:23:33:Successfully acquired database lock
01:23:33:WARNING:FS01:Disabling beta GPU slot 01: gpu:1:0.  Beta GPUs can be tested for no points by setting ``gpu-beta=true`` in the configuration.
01:23:33:ERROR:No valid folding configuration
01:23:33:WARNING:WU01:No longer matches Slot 1's configuration and there are no other matching slots, dumping
01:23:33:WU01:FS01:Sending unit results: id:01 state:SEND error:DUMPED project:16468 run:0 clone:19 gen:155 core:0x22 unit:0x000000130000009b0000405400000000
01:23:33:WU01:FS01:Connecting to 140.163.4.210:8080
01:23:33:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
01:23:33:WU01:FS01:Connecting to 140.163.4.210:80
01:23:34:WARNING:WU01:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.210:80: Network is unreachable
01:23:34:WU01:FS01:Trying to send results to collection server
01:23:34:WU01:FS01:Connecting to 140.163.4.200:8080
01:23:34:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
01:23:34:WU01:FS01:Connecting to 140.163.4.200:80
01:23:35:ERROR:WU01:FS01:Exception: Failed to connect to 140.163.4.200:80: Network is unreachable
01:23:35:WU01:FS01:Sending unit results: id:01 state:SEND error:DUMPED project:16468 run:0 clone:19 gen:155 core:0x22 unit:0x000000130000009b0000405400000000
01:23:35:WU01:FS01:Connecting to 140.163.4.210:8080
01:23:35:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
01:23:35:WU01:FS01:Connecting to 140.163.4.210:80
01:23:35:WARNING:WU01:FS01:Exception: Failed to send results to work server: Failed to connect to 140.163.4.210:80: Network is unreachable
01:23:35:WU01:FS01:Trying to send results to collection server
01:23:35:WU01:FS01:Connecting to 140.163.4.200:8080
01:23:35:WARNING:WU01:FS01:WorkServer connection failed on port 8080 trying 80
01:23:35:WU01:FS01:Connecting to 140.163.4.200:80
01:23:36:ERROR:WU01:FS01:Exception: Failed to connect to 140.163.4.200:80: Network is unreachable
01:24:35:WU01:FS01:Sending unit results: id:01 state:SEND error:DUMPED project:16468 run:0 clone:19 gen:155 core:0x22 unit:0x000000130000009b0000405400000000
01:24:35:WU01:FS01:Connecting to 140.163.4.210:8080
01:24:36:WU01:FS01:Server responded WORK_ACK (400)
01:24:36:WU01:FS01:Cleaning up
I have seen this type of thing before. I have to uninstall FAH and then reinstall it (Ubuntu 20.04.2).
It may have something to do with Python, but that is not clear. I had to downgrade to Python 2 to install it originally.

EDIT2: A reinstall fixed it. This happens every couple of months: it will be folding away and then fail mid-operation, for no apparent reason.
Curiously, I was able to reinstall it after upgrading Python to version 3. It is now at 3.8.10 after an Ubuntu update.
Last edited by Joe_H on Mon Jun 13, 2022 12:18 am, edited 1 time in total.
Reason: change Quote tags to Code
aetch
Posts: 447
Joined: Thu Jun 25, 2020 3:04 pm
Location: Between chair and keyboard

Re: FahCore returned: FAILED_3 (255 = 0xff)

Post by aetch »

With my Ubuntu system, I found that when I wanted to do maintenance I had to ensure it had finished and uploaded all of its work. After running the system 24/7 for a couple of months, I found that a simple reboot would break the system. I think the system would install certain updates while running, but their damage wouldn't become evident until the system rebooted. I suspect something similar happened to your system, but the damage became evident during operation.
Folding Rigs - None (25-Jun-2022)

JimF
Posts: 652
Joined: Thu Jan 21, 2010 2:03 pm

Re: FahCore returned: FAILED_3 (255 = 0xff)

Post by JimF »

aetch wrote: With my Ubuntu system, I found that when I wanted to do maintenance I had to ensure it had finished and uploaded all of its work. After running the system 24/7 for a couple of months, I found that a simple reboot would break the system. I think the system would install certain updates while running, but their damage wouldn't become evident until the system rebooted. I suspect something similar happened to your system, but the damage became evident during operation.
Yes, I have found that updates can cause problems. That is why I turn off all updates until I am ready to do a reboot. So that wasn't the cause for this, unless they do something behind my back.
It happens on all my Ubuntu machines running GPUs (usually six of them). I have not seen it on the CPUs yet, but they have not been running so long.
toTOW
Site Moderator
Posts: 6296
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: FahCore returned: FAILED_3 (255 = 0xff)

Post by toTOW »

I think you'll have to reinstall your GPU drivers ... I guess you got a kernel update and it didn't update the links to the GPU driver modules.

It happens on my cloud instance when they update the image with a new kernel ... I have to run these two commands and reboot:
sudo apt-get install gcc make linux-headers-$(uname -r)
sudo ./NVIDIA-Linux-x86_64-460.32.03.run
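
Before reinstalling the driver, it may be worth checking whether an NVIDIA kernel module is loaded at all for the running kernel. The following is a rough Python sketch under stated assumptions: it reads `/proc/modules` (standard on Linux) and assumes the driver's modules have names starting with `nvidia`; `loaded_nvidia_modules` and `report` are names invented for this example.

```python
import platform
from pathlib import Path


def loaded_nvidia_modules(proc_modules_text):
    """Return names of loaded kernel modules whose name starts with 'nvidia'."""
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first field of each /proc/modules line is the module name.
        if fields and fields[0].startswith("nvidia"):
            names.append(fields[0])
    return names


def report():
    """Print the running kernel release and any loaded nvidia modules."""
    modules_file = Path("/proc/modules")
    text = modules_file.read_text() if modules_file.exists() else ""
    modules = loaded_nvidia_modules(text)
    print(f"running kernel: {platform.release()}")
    if modules:
        print("loaded nvidia modules:", ", ".join(modules))
    else:
        print("no nvidia module loaded for this kernel")


if __name__ == "__main__":
    report()
```

An empty result after a kernel update would be consistent with the driver modules not having been built for the new kernel, but it does not by itself confirm that as the cause of the FAILED_3 errors.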

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
JimF
Posts: 652
Joined: Thu Jan 21, 2010 2:03 pm

Re: FahCore returned: FAILED_3 (255 = 0xff)

Post by JimF »

Thanks, but it is working fine now. I just had to re-install FAH, after doing an Ubuntu update.
If it needed to update the drivers, it did so without problem.

And, as I stated, the problem started before I did any updates.
JohnChodera
Pande Group Member
Posts: 470
Joined: Fri Feb 22, 2013 9:59 pm

Re: FahCore returned: FAILED_3 (255 = 0xff)

Post by JohnChodera »

@JimF: That's super weird. If this happens again, the best way to debug is to try to capture one of the core output logs, though that's hard to do because it exits so quickly. We can look up whatever gets returned to the server on our side, but you'd have to find the project manager corresponding to the returned faulty WU.

You can also run the core directly and see if it is segfaulting, which may be easier. For you, this would be running

Code: Select all

/var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22
/var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/lin/64bit/22-0.0.13/Core_22.fah/FahCore_22 -info
If this works, we can provide you with some test WUs to run locally and report back.
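
To make the result of a direct run easier to read, the core can be launched from a small wrapper that reports how it ended. This is only a sketch: `describe_exit` and `run_core` are hypothetical helper names, and the path and `-info` argument are the ones quoted above, so adjust them to your install.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal as a negative number.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {-returncode} ({name})"
    if returncode == 0:
        return "exited normally (0)"
    return f"exited with status {returncode} (0x{returncode:02x})"


def run_core(argv, timeout=60):
    """Run a command and return (returncode, stdout, stderr)."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return proc.returncode, proc.stdout, proc.stderr


# Path taken from the post above; it will differ on other installs.
CORE = (
    "/var/snap/folding-at-home-fcole90/common/cores/cores.foldingathome.org/"
    "lin/64bit/22-0.0.13/Core_22.fah/FahCore_22"
)

# Example use (uncomment on the affected machine):
# code, out, err = run_core([CORE, "-info"])
# print(describe_exit(code))
# print(out or err)
```

A segfault would show up here as "killed by signal 11 (SIGSEGV)", while a plain non-zero status with text on stderr usually points to an error the core reported itself.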
toTOW
Site Moderator
Posts: 6296
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: FahCore returned: FAILED_3 (255 = 0xff)

Post by toTOW »

You could also try to start the core manually from a Terminal to see the real error ... the one captured by the client is definitely useless ...
