Losing a WU unit

It seems that a lot of GPU problems revolve around specific versions of drivers. Though AMD has their own support structure, you can often learn from information reported by others who fold.

Moderators: Site Moderators, FAHC Science Team

LPH
Posts: 3
Joined: Mon May 03, 2021 5:37 pm

Losing a WU unit

Post by LPH »

Hi!
I'm having a problem, and I don't know whether it is specific to certain WUs, such as those from project 16607, or a problem with my driver. Sometimes I think my driver crashes and I lose all progress on a WU. I thought the checkpoint system saved my progress so that I could continue from there, but the WU is then discarded as a BAD_WORK_UNIT and I lose all progress and can't fold it anymore. Can someone help me? This is not the first time it has happened.

*********************** Log Started 2021-05-03T16:28:27Z ***********************
16:28:27:******************************* libFAH ********************************
16:28:27:           Date: Oct 20 2020
16:28:27:           Time: 13:36:55
16:28:27:       Revision: 5ca109d295a6245e2a2f590b3d0085ad5e567aeb
16:28:27:         Branch: master
16:28:27:       Compiler: Visual C++ 2015
16:28:27:        Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
16:28:27:       Platform: win32 10
16:28:27:           Bits: 32
16:28:27:           Mode: Release
16:28:27:****************************** FAHClient ******************************
16:28:27:        Version: 7.6.21
16:28:27:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
16:28:27:      Copyright: 2020 foldingathome.org
16:28:27:       Homepage: https://foldingathome.org/
16:28:27:           Date: Oct 20 2020
16:28:27:           Time: 13:41:04
16:28:27:       Revision: 6efbf0e138e22d3963e6a291f78dcb9c6422a278
16:28:27:         Branch: master
16:28:27:       Compiler: Visual C++ 2015
16:28:27:        Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
16:28:27:       Platform: win32 10
16:28:27:           Bits: 32
16:28:27:           Mode: Release
16:28:27:         Config: C:\ProgramData\FAHClient\config.xml
16:28:27:******************************** CBang ********************************
16:28:27:           Date: Oct 20 2020
16:28:27:           Time: 11:36:18
16:28:27:       Revision: 7e4ce85225d7eaeb775e87c31740181ca603de60
16:28:27:         Branch: master
16:28:27:       Compiler: Visual C++ 2015
16:28:27:        Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
16:28:27:       Platform: win32 10
16:28:27:           Bits: 32
16:28:27:           Mode: Release
16:28:27:******************************* System ********************************
16:28:27:            CPU: AMD Ryzen 5 3600 6-Core Processor
16:28:27:         CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
16:28:27:           CPUs: 12
16:28:27:         Memory: 15.93GiB
16:28:27:    Free Memory: 13.77GiB
16:28:27:        Threads: WINDOWS_THREADS
16:28:27:     OS Version: 6.2
16:28:27:    Has Battery: false
16:28:27:     On Battery: false
16:28:27:     UTC Offset: 2
16:28:27:            PID: 9584
16:28:27:            CWD: C:\ProgramData\FAHClient
16:28:27:  Win32 Service: false
16:28:27:             OS: Windows 10 Enterprise
16:28:27:        OS Arch: AMD64
16:28:27:           GPUs: 1
16:28:27:          GPU 0: Bus:40 Slot:0 Func:0 AMD:6 Navi 10 [Radeon RX 5600 OEM/5600
16:28:27:                 XT/5700/5700 XT]
16:28:27:           CUDA: Not detected: Failed to open dynamic library 'nvcuda.dll': The
16:28:27:                 specified module could not be found.
16:28:27:
16:28:27:OpenCL Device 0: Platform:0 Device:0 Bus:40 Slot:0 Compute:1.2 Driver:3240.6
16:28:27:***********************************************************************
16:28:27:<config>
16:28:27:  <!-- Network -->
16:28:27:  <proxy v=':8080'/>
16:28:27:
16:28:27:  <!-- Slot Control -->
16:28:27:  <pause-on-battery v='false'/>
16:28:27:  <power v='full'/>
16:28:27:
16:28:27:  <!-- User Information -->
16:28:27:  <passkey v='*****'/>
16:28:27:  <team v='234980'/>
16:28:27:  <user v='uc28jql1jx9q'/>
16:28:27:
16:28:27:  <!-- Folding Slots -->
16:28:27:  <slot id='0' type='CPU'>
16:28:27:    <idle v='true'/>
16:28:27:  </slot>
16:28:27:  <slot id='1' type='GPU'>
16:28:27:    <pci-bus v='40'/>
16:28:27:    <pci-slot v='0'/>
16:28:27:  </slot>
16:28:27:</config>
16:28:27:Trying to access database...
16:28:27:Successfully acquired database lock
16:28:27:FS00:Initialized folding slot 00: cpu:11
16:28:27:FS01:Initialized folding slot 01: gpu:40:0 Navi 10 [Radeon RX 5600 OEM/5600 XT/5700/5700 XT]
16:28:27:WU01:FS01:Starting
16:28:27:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.13/Core_22.fah/FahCore_22.exe -dir 01 -suffix 01 -version 706 -lifeline 9584 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -gpu-vendor amd -gpu 0 -gpu-usage 100
16:28:27:WU01:FS01:Started FahCore on PID 9856
16:28:27:WU01:FS01:Core PID:9880
16:28:27:WU01:FS01:FahCore 0x22 started
16:28:28:WU01:FS01:0x22:*********************** Log Started 2021-05-03T16:28:27Z ***********************
16:28:28:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
16:28:28:WU01:FS01:0x22:       Core: Core22
16:28:28:WU01:FS01:0x22:       Type: 0x22
16:28:28:WU01:FS01:0x22:    Version: 0.0.13
16:28:28:WU01:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
16:28:28:WU01:FS01:0x22:  Copyright: 2020 foldingathome.org
16:28:28:WU01:FS01:0x22:   Homepage: https://foldingathome.org/
16:28:28:WU01:FS01:0x22:       Date: Sep 19 2020
16:28:28:WU01:FS01:0x22:       Time: 02:35:58
16:28:28:WU01:FS01:0x22:   Revision: 571cf95de6de2c592c7c3ed48fcfb2e33e9ea7d3
16:28:28:WU01:FS01:0x22:     Branch: core22-0.0.13
16:28:28:WU01:FS01:0x22:   Compiler: Visual C++ 2015
16:28:28:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
16:28:28:WU01:FS01:0x22:             -DOPENMM_GIT_HASH="\"189320d0\""
16:28:28:WU01:FS01:0x22:   Platform: win32 10
16:28:28:WU01:FS01:0x22:       Bits: 64
16:28:28:WU01:FS01:0x22:       Mode: Release
16:28:28:WU01:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
16:28:28:WU01:FS01:0x22:             <peastman@stanford.edu>
16:28:28:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 9856 -checkpoint 15
16:28:28:WU01:FS01:0x22:             -opencl-platform 0 -opencl-device 0 -gpu-vendor amd -gpu 0
16:28:28:WU01:FS01:0x22:             -gpu-usage 100
16:28:28:WU01:FS01:0x22:************************************ libFAH ************************************
16:28:28:WU01:FS01:0x22:       Date: Sep 7 2020
16:28:28:WU01:FS01:0x22:       Time: 19:09:56
16:28:28:WU01:FS01:0x22:   Revision: 44301ed97b996b63fe736bb8073f22209cb2b603
16:28:28:WU01:FS01:0x22:     Branch: HEAD
16:28:28:WU01:FS01:0x22:   Compiler: Visual C++ 2015
16:28:28:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
16:28:28:WU01:FS01:0x22:   Platform: win32 10
16:28:28:WU01:FS01:0x22:       Bits: 64
16:28:28:WU01:FS01:0x22:       Mode: Release
16:28:28:WU01:FS01:0x22:************************************ CBang *************************************
16:28:28:WU01:FS01:0x22:       Date: Sep 7 2020
16:28:28:WU01:FS01:0x22:       Time: 19:08:30
16:28:28:WU01:FS01:0x22:   Revision: 33fcfc2b3ed2195a423606a264718e31e6b3903f
16:28:28:WU01:FS01:0x22:     Branch: HEAD
16:28:28:WU01:FS01:0x22:   Compiler: Visual C++ 2015
16:28:28:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
16:28:28:WU01:FS01:0x22:   Platform: win32 10
16:28:28:WU01:FS01:0x22:       Bits: 64
16:28:28:WU01:FS01:0x22:       Mode: Release
16:28:28:WU01:FS01:0x22:************************************ System ************************************
16:28:28:WU01:FS01:0x22:        CPU: AMD Ryzen 5 3600 6-Core Processor
16:28:28:WU01:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
16:28:28:WU01:FS01:0x22:       CPUs: 12
16:28:28:WU01:FS01:0x22:     Memory: 15.93GiB
16:28:28:WU01:FS01:0x22:Free Memory: 13.66GiB
16:28:28:WU01:FS01:0x22:    Threads: WINDOWS_THREADS
16:28:28:WU01:FS01:0x22: OS Version: 6.2
16:28:28:WU01:FS01:0x22:Has Battery: false
16:28:28:WU01:FS01:0x22: On Battery: false
16:28:28:WU01:FS01:0x22: UTC Offset: 2
16:28:28:WU01:FS01:0x22:        PID: 9880
16:28:28:WU01:FS01:0x22:        CWD: C:\ProgramData\FAHClient\work
16:28:28:WU01:FS01:0x22:************************************ OpenMM ************************************
16:28:28:WU01:FS01:0x22:   Revision: 189320d0
16:28:28:WU01:FS01:0x22:********************************************************************************
16:28:28:WU01:FS01:0x22:Project: 16607 (Run 80, Clone 3, Gen 26)
16:28:28:WU01:FS01:0x22:Unit: 0x0000002a8f59f36f606d29b1dd6f79f3
16:28:28:WU01:FS01:0x22:Digital signatures verified
16:28:28:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
16:28:28:WU01:FS01:0x22:Version 0.0.13
16:28:28:WU01:FS01:0x22:  Checkpoint write interval: 50000 steps (2%) [50 total]
16:28:28:WU01:FS01:0x22:  JSON viewer frame write interval: 25000 steps (1%) [100 total]
16:28:28:WU01:FS01:0x22:  XTC frame write interval: 10000 steps (0.4%) [250 total]
16:28:28:WU01:FS01:0x22:  Global context and integrator variables write interval: disabled
16:28:28:WU01:FS01:0x22:There are 3 platforms available.
16:28:28:WU01:FS01:0x22:Platform 0: Reference
16:28:28:WU01:FS01:0x22:Platform 1: CPU
16:28:28:WU01:FS01:0x22:Platform 2: OpenCL
16:28:28:WU01:FS01:0x22:  opencl-device 0 specified
16:28:39:WU01:FS01:0x22:Attempting to create OpenCL context:
16:28:39:WU01:FS01:0x22:  Configuring platform OpenCL
16:29:02:WU01:FS01:0x22:  Using OpenCL on platformId 0 and gpu 0
16:29:02:WU01:FS01:0x22:Completed 1350000 out of 2500000 steps (54%)
16:29:17:WU01:FS01:0x22:An exception occurred at step 1350128: Particle coordinate is nan
16:29:17:WU01:FS01:0x22:ERROR:98: Attempting to restart from last good checkpoint by restarting core.
16:29:17:WU01:FS01:0x22:Folding@home Core Shutdown: CORE_RESTART
16:29:18:WARNING:WU01:FS01:FahCore returned: CORE_RESTART (98 = 0x62)
16:29:18:WU01:FS01:Starting
16:29:18:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.13/Core_22.fah/FahCore_22.exe -dir 01 -suffix 01 -version 706 -lifeline 9584 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -gpu-vendor amd -gpu 0 -gpu-usage 100
16:29:18:WU01:FS01:Started FahCore on PID 10016
16:29:18:WU01:FS01:Core PID:8168
16:29:18:WU01:FS01:FahCore 0x22 started
16:29:19:WU01:FS01:0x22:*********************** Log Started 2021-05-03T16:29:18Z ***********************
16:29:19:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
16:29:19:WU01:FS01:0x22:       Core: Core22
16:29:19:WU01:FS01:0x22:       Type: 0x22
16:29:19:WU01:FS01:0x22:    Version: 0.0.13
16:29:19:WU01:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
16:29:19:WU01:FS01:0x22:  Copyright: 2020 foldingathome.org
16:29:19:WU01:FS01:0x22:   Homepage: https://foldingathome.org/
16:29:19:WU01:FS01:0x22:       Date: Sep 19 2020
16:29:19:WU01:FS01:0x22:       Time: 02:35:58
16:29:19:WU01:FS01:0x22:   Revision: 571cf95de6de2c592c7c3ed48fcfb2e33e9ea7d3
16:29:19:WU01:FS01:0x22:     Branch: core22-0.0.13
16:29:19:WU01:FS01:0x22:   Compiler: Visual C++ 2015
16:29:19:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
16:29:19:WU01:FS01:0x22:             -DOPENMM_GIT_HASH="\"189320d0\""
16:29:19:WU01:FS01:0x22:   Platform: win32 10
16:29:19:WU01:FS01:0x22:       Bits: 64
16:29:19:WU01:FS01:0x22:       Mode: Release
16:29:19:WU01:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
16:29:19:WU01:FS01:0x22:             <peastman@stanford.edu>
16:29:19:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 10016 -checkpoint 15
16:29:19:WU01:FS01:0x22:             -opencl-platform 0 -opencl-device 0 -gpu-vendor amd -gpu 0
16:29:19:WU01:FS01:0x22:             -gpu-usage 100
16:29:19:WU01:FS01:0x22:************************************ libFAH ************************************
16:29:19:WU01:FS01:0x22:       Date: Sep 7 2020
16:29:19:WU01:FS01:0x22:       Time: 19:09:56
16:29:19:WU01:FS01:0x22:   Revision: 44301ed97b996b63fe736bb8073f22209cb2b603
16:29:19:WU01:FS01:0x22:     Branch: HEAD
16:29:19:WU01:FS01:0x22:   Compiler: Visual C++ 2015
16:29:19:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
16:29:19:WU01:FS01:0x22:   Platform: win32 10
16:29:19:WU01:FS01:0x22:       Bits: 64
16:29:19:WU01:FS01:0x22:       Mode: Release
16:29:19:WU01:FS01:0x22:************************************ CBang *************************************
16:29:19:WU01:FS01:0x22:       Date: Sep 7 2020
16:29:19:WU01:FS01:0x22:       Time: 19:08:30
16:29:19:WU01:FS01:0x22:   Revision: 33fcfc2b3ed2195a423606a264718e31e6b3903f
16:29:19:WU01:FS01:0x22:     Branch: HEAD
16:29:19:WU01:FS01:0x22:   Compiler: Visual C++ 2015
16:29:19:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
16:29:19:WU01:FS01:0x22:   Platform: win32 10
16:29:19:WU01:FS01:0x22:       Bits: 64
16:29:19:WU01:FS01:0x22:       Mode: Release
16:29:19:WU01:FS01:0x22:************************************ System ************************************
16:29:19:WU01:FS01:0x22:        CPU: AMD Ryzen 5 3600 6-Core Processor
16:29:19:WU01:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
16:29:19:WU01:FS01:0x22:       CPUs: 12
16:29:19:WU01:FS01:0x22:     Memory: 15.93GiB
16:29:19:WU01:FS01:0x22:Free Memory: 13.19GiB
16:29:19:WU01:FS01:0x22:    Threads: WINDOWS_THREADS
16:29:19:WU01:FS01:0x22: OS Version: 6.2
16:29:19:WU01:FS01:0x22:Has Battery: false
16:29:19:WU01:FS01:0x22: On Battery: false
16:29:19:WU01:FS01:0x22: UTC Offset: 2
16:29:19:WU01:FS01:0x22:        PID: 8168
16:29:19:WU01:FS01:0x22:        CWD: C:\ProgramData\FAHClient\work
16:29:19:WU01:FS01:0x22:************************************ OpenMM ************************************
16:29:19:WU01:FS01:0x22:   Revision: 189320d0
16:29:19:WU01:FS01:0x22:********************************************************************************
16:29:19:WU01:FS01:0x22:Project: 16607 (Run 80, Clone 3, Gen 26)
16:29:19:WU01:FS01:0x22:Unit: 0x0000002a8f59f36f606d29b1dd6f79f3
16:29:19:WU01:FS01:0x22:Digital signatures verified
16:29:19:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
16:29:19:WU01:FS01:0x22:Version 0.0.13
16:29:19:WU01:FS01:0x22:  Checkpoint write interval: 50000 steps (2%) [50 total]
16:29:19:WU01:FS01:0x22:  JSON viewer frame write interval: 25000 steps (1%) [100 total]
16:29:19:WU01:FS01:0x22:  XTC frame write interval: 10000 steps (0.4%) [250 total]
16:29:19:WU01:FS01:0x22:  Global context and integrator variables write interval: disabled
16:29:19:WU01:FS01:0x22:There are 3 platforms available.
16:29:19:WU01:FS01:0x22:Platform 0: Reference
16:29:19:WU01:FS01:0x22:Platform 1: CPU
16:29:19:WU01:FS01:0x22:Platform 2: OpenCL
16:29:19:WU01:FS01:0x22:  opencl-device 0 specified
16:29:29:WU01:FS01:0x22:Attempting to create OpenCL context:
16:29:29:WU01:FS01:0x22:  Configuring platform OpenCL
16:29:30:ERROR:Receive error: 10053: An established connection was aborted by the
16:29:30:ERROR:software in your host machine.
16:29:51:WU01:FS01:0x22:  Using OpenCL on platformId 0 and gpu 0
16:29:51:WU01:FS01:0x22:Completed 1350000 out of 2500000 steps (54%)
16:34:31:WU01:FS01:0x22:Completed 1375000 out of 2500000 steps (55%)
16:39:06:WU01:FS01:0x22:Completed 1400000 out of 2500000 steps (56%)
16:39:08:WU01:FS01:0x22:Checkpoint completed at step 1400000
16:43:44:WU01:FS01:0x22:Completed 1425000 out of 2500000 steps (57%)
16:48:21:WU01:FS01:0x22:Completed 1450000 out of 2500000 steps (58%)
16:48:22:WU01:FS01:0x22:Checkpoint completed at step 1450000
16:52:59:WU01:FS01:0x22:Completed 1475000 out of 2500000 steps (59%)
16:57:36:WU01:FS01:0x22:Completed 1500000 out of 2500000 steps (60%)
16:57:37:WU01:FS01:0x22:Checkpoint completed at step 1500000
17:00:22:WU01:FS01:0x22:Completed 1525000 out of 2500000 steps (61%)
17:00:27:WU01:FS01:0x22:Completed 1550000 out of 2500000 steps (62%)
17:00:27:WU01:FS01:0x22:An exception occurred at step 1550000: Discrepancy: Velocities are blowing up! 0 0 443381
17:00:27:WU01:FS01:0x22:ERROR:98: Attempting to restart from last good checkpoint by restarting core.
17:00:27:WU01:FS01:0x22:Folding@home Core Shutdown: CORE_RESTART
17:00:27:WARNING:WU01:FS01:FahCore returned: CORE_RESTART (98 = 0x62)
17:00:27:WU01:FS01:Starting
17:00:27:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.13/Core_22.fah/FahCore_22.exe -dir 01 -suffix 01 -version 706 -lifeline 9584 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -gpu-vendor amd -gpu 0 -gpu-usage 100
17:00:27:WU01:FS01:Started FahCore on PID 5840
17:00:27:WU01:FS01:Core PID:6080
17:00:27:WU01:FS01:FahCore 0x22 started
17:00:28:WU01:FS01:0x22:*********************** Log Started 2021-05-03T17:00:27Z ***********************
17:00:28:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
17:00:28:WU01:FS01:0x22:       Core: Core22
17:00:28:WU01:FS01:0x22:       Type: 0x22
17:00:28:WU01:FS01:0x22:    Version: 0.0.13
17:00:28:WU01:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
17:00:28:WU01:FS01:0x22:  Copyright: 2020 foldingathome.org
17:00:28:WU01:FS01:0x22:   Homepage: https://foldingathome.org/
17:00:28:WU01:FS01:0x22:       Date: Sep 19 2020
17:00:28:WU01:FS01:0x22:       Time: 02:35:58
17:00:28:WU01:FS01:0x22:   Revision: 571cf95de6de2c592c7c3ed48fcfb2e33e9ea7d3
17:00:28:WU01:FS01:0x22:     Branch: core22-0.0.13
17:00:28:WU01:FS01:0x22:   Compiler: Visual C++ 2015
17:00:28:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
17:00:28:WU01:FS01:0x22:             -DOPENMM_GIT_HASH="\"189320d0\""
17:00:28:WU01:FS01:0x22:   Platform: win32 10
17:00:28:WU01:FS01:0x22:       Bits: 64
17:00:28:WU01:FS01:0x22:       Mode: Release
17:00:28:WU01:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
17:00:28:WU01:FS01:0x22:             <peastman@stanford.edu>
17:00:28:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 5840 -checkpoint 15
17:00:28:WU01:FS01:0x22:             -opencl-platform 0 -opencl-device 0 -gpu-vendor amd -gpu 0
17:00:28:WU01:FS01:0x22:             -gpu-usage 100
17:00:28:WU01:FS01:0x22:************************************ libFAH ************************************
17:00:28:WU01:FS01:0x22:       Date: Sep 7 2020
17:00:28:WU01:FS01:0x22:       Time: 19:09:56
17:00:28:WU01:FS01:0x22:   Revision: 44301ed97b996b63fe736bb8073f22209cb2b603
17:00:28:WU01:FS01:0x22:     Branch: HEAD
17:00:28:WU01:FS01:0x22:   Compiler: Visual C++ 2015
17:00:28:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
17:00:28:WU01:FS01:0x22:   Platform: win32 10
17:00:28:WU01:FS01:0x22:       Bits: 64
17:00:28:WU01:FS01:0x22:       Mode: Release
17:00:28:WU01:FS01:0x22:************************************ CBang *************************************
17:00:28:WU01:FS01:0x22:       Date: Sep 7 2020
17:00:28:WU01:FS01:0x22:       Time: 19:08:30
17:00:28:WU01:FS01:0x22:   Revision: 33fcfc2b3ed2195a423606a264718e31e6b3903f
17:00:28:WU01:FS01:0x22:     Branch: HEAD
17:00:28:WU01:FS01:0x22:   Compiler: Visual C++ 2015
17:00:28:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
17:00:28:WU01:FS01:0x22:   Platform: win32 10
17:00:28:WU01:FS01:0x22:       Bits: 64
17:00:28:WU01:FS01:0x22:       Mode: Release
17:00:28:WU01:FS01:0x22:************************************ System ************************************
17:00:28:WU01:FS01:0x22:        CPU: AMD Ryzen 5 3600 6-Core Processor
17:00:28:WU01:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
17:00:28:WU01:FS01:0x22:       CPUs: 12
17:00:28:WU01:FS01:0x22:     Memory: 15.93GiB
17:00:28:WU01:FS01:0x22:Free Memory: 12.66GiB
17:00:28:WU01:FS01:0x22:    Threads: WINDOWS_THREADS
17:00:28:WU01:FS01:0x22: OS Version: 6.2
17:00:28:WU01:FS01:0x22:Has Battery: false
17:00:28:WU01:FS01:0x22: On Battery: false
17:00:28:WU01:FS01:0x22: UTC Offset: 2
17:00:28:WU01:FS01:0x22:        PID: 6080
17:00:28:WU01:FS01:0x22:        CWD: C:\ProgramData\FAHClient\work
17:00:28:WU01:FS01:0x22:************************************ OpenMM ************************************
17:00:28:WU01:FS01:0x22:   Revision: 189320d0
17:00:28:WU01:FS01:0x22:********************************************************************************
17:00:28:WU01:FS01:0x22:Project: 16607 (Run 80, Clone 3, Gen 26)
17:00:28:WU01:FS01:0x22:Unit: 0x0000002a8f59f36f606d29b1dd6f79f3
17:00:28:WU01:FS01:0x22:Digital signatures verified
17:00:28:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
17:00:28:WU01:FS01:0x22:Version 0.0.13
17:00:28:WU01:FS01:0x22:  Checkpoint write interval: 50000 steps (2%) [50 total]
17:00:28:WU01:FS01:0x22:  JSON viewer frame write interval: 25000 steps (1%) [100 total]
17:00:28:WU01:FS01:0x22:  XTC frame write interval: 10000 steps (0.4%) [250 total]
17:00:28:WU01:FS01:0x22:  Global context and integrator variables write interval: disabled
17:00:28:WU01:FS01:0x22:There are 3 platforms available.
17:00:28:WU01:FS01:0x22:Platform 0: Reference
17:00:28:WU01:FS01:0x22:Platform 1: CPU
17:00:28:WU01:FS01:0x22:Platform 2: OpenCL
17:00:28:WU01:FS01:0x22:  opencl-device 0 specified
17:00:38:WU01:FS01:0x22:Attempting to create OpenCL context:
17:00:38:WU01:FS01:0x22:  Configuring platform OpenCL
17:01:00:WU01:FS01:0x22:  Using OpenCL on platformId 0 and gpu 0
17:01:00:WU01:FS01:0x22:Completed 1500000 out of 2500000 steps (60%)
17:01:16:WU01:FS01:0x22:An exception occurred at step 1500226: Particle coordinate is nan
17:01:16:WU01:FS01:0x22:Max number of attempts to resume from last checkpoint (2) reached. Aborting.
17:01:16:WU01:FS01:0x22:ERROR:114: Max number of attempts to resume from last checkpoint reached.
17:01:16:WU01:FS01:0x22:Saving result file ..\logfile_01.txt
17:01:16:WU01:FS01:0x22:Saving result file science.log
17:01:16:WU01:FS01:0x22:Saving result file state.xml
17:01:19:WU01:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
17:01:19:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
17:01:19:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:16607 run:80 clone:3 gen:26 core:0x22 unit:0x0000002a8f59f36f606d29b1dd6f79f3
17:01:19:WU01:FS01:Uploading 18.15MiB to 143.89.243.111
17:01:19:WU01:FS01:Connecting to 143.89.243.111:8080
17:01:20:WU00:FS01:Connecting to assign1.foldingathome.org:80
17:01:20:WU00:FS01:Assigned to work server 140.163.4.200
17:01:20:WU00:FS01:Requesting new work unit for slot 01: gpu:40:0 Navi 10 [Radeon RX 5600 OEM/5600 XT/5700/5700 XT] from 140.163.4.200
17:01:20:WU00:FS01:Connecting to 140.163.4.200:8080
17:01:21:WU00:FS01:Downloading 21.69MiB
17:01:25:WU01:FS01:Upload 4.13%
17:01:27:WU00:FS01:Download 15.27%
17:01:31:WU01:FS01:Upload 10.33%
17:01:33:WU00:FS01:Download 31.12%
17:01:37:WU01:FS01:Upload 16.53%
17:01:39:WU00:FS01:Download 46.96%
17:01:43:WU01:FS01:Upload 24.45%
17:01:45:WU00:FS01:Download 62.23%
17:01:49:WU01:FS01:Upload 32.71%
17:01:51:WU00:FS01:Download 78.08%
17:01:55:WU01:FS01:Upload 40.29%
17:01:57:WU00:FS01:Download 93.64%
17:01:59:WU00:FS01:Download complete
17:02:00:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:17346 run:1 clone:547 gen:19 core:0x22 unit:0x0000022300000013000043c200000001
17:02:00:WU00:FS01:Starting
17:02:00:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.13/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 706 -lifeline 9584 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -gpu-vendor amd -gpu 0 -gpu-usage 100
17:02:00:WU00:FS01:Started FahCore on PID 3792
17:02:00:WU00:FS01:Core PID:5748
17:02:00:WU00:FS01:FahCore 0x22 started
17:02:00:WU00:FS01:0x22:*********************** Log Started 2021-05-03T17:02:00Z ***********************
17:02:00:WU00:FS01:0x22:*************************** Core22 Folding@home Core ***************************
17:02:00:WU00:FS01:0x22:       Core: Core22
17:02:00:WU00:FS01:0x22:       Type: 0x22
17:02:00:WU00:FS01:0x22:    Version: 0.0.13
17:02:00:WU00:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
17:02:00:WU00:FS01:0x22:  Copyright: 2020 foldingathome.org
17:02:00:WU00:FS01:0x22:   Homepage: https://foldingathome.org/
17:02:00:WU00:FS01:0x22:       Date: Sep 19 2020
17:02:00:WU00:FS01:0x22:       Time: 02:35:58
17:02:00:WU00:FS01:0x22:   Revision: 571cf95de6de2c592c7c3ed48fcfb2e33e9ea7d3
17:02:00:WU00:FS01:0x22:     Branch: core22-0.0.13
17:02:00:WU00:FS01:0x22:   Compiler: Visual C++ 2015
17:02:00:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
17:02:00:WU00:FS01:0x22:             -DOPENMM_GIT_HASH="\"189320d0\""
17:02:00:WU00:FS01:0x22:   Platform: win32 10
17:02:00:WU00:FS01:0x22:       Bits: 64
17:02:00:WU00:FS01:0x22:       Mode: Release
17:02:00:WU00:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
17:02:00:WU00:FS01:0x22:             <peastman@stanford.edu>
17:02:00:WU00:FS01:0x22:       Args: -dir 00 -suffix 01 -version 706 -lifeline 3792 -checkpoint 15
17:02:00:WU00:FS01:0x22:             -opencl-platform 0 -opencl-device 0 -gpu-vendor amd -gpu 0
17:02:00:WU00:FS01:0x22:             -gpu-usage 100
17:02:00:WU00:FS01:0x22:************************************ libFAH ************************************
17:02:00:WU00:FS01:0x22:       Date: Sep 7 2020
17:02:00:WU00:FS01:0x22:       Time: 19:09:56
17:02:00:WU00:FS01:0x22:   Revision: 44301ed97b996b63fe736bb8073f22209cb2b603
17:02:00:WU00:FS01:0x22:     Branch: HEAD
17:02:00:WU00:FS01:0x22:   Compiler: Visual C++ 2015
17:02:00:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
17:02:00:WU00:FS01:0x22:   Platform: win32 10
17:02:00:WU00:FS01:0x22:       Bits: 64
17:02:00:WU00:FS01:0x22:       Mode: Release
17:02:00:WU00:FS01:0x22:************************************ CBang *************************************
17:02:00:WU00:FS01:0x22:       Date: Sep 7 2020
17:02:00:WU00:FS01:0x22:       Time: 19:08:30
17:02:00:WU00:FS01:0x22:   Revision: 33fcfc2b3ed2195a423606a264718e31e6b3903f
17:02:00:WU00:FS01:0x22:     Branch: HEAD
17:02:00:WU00:FS01:0x22:   Compiler: Visual C++ 2015
17:02:00:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
17:02:00:WU00:FS01:0x22:   Platform: win32 10
17:02:00:WU00:FS01:0x22:       Bits: 64
17:02:00:WU00:FS01:0x22:       Mode: Release
17:02:00:WU00:FS01:0x22:************************************ System ************************************
17:02:00:WU00:FS01:0x22:        CPU: AMD Ryzen 5 3600 6-Core Processor
17:02:00:WU00:FS01:0x22:     CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
17:02:00:WU00:FS01:0x22:       CPUs: 12
17:02:00:WU00:FS01:0x22:     Memory: 15.93GiB
17:02:00:WU00:FS01:0x22:Free Memory: 12.25GiB
17:02:00:WU00:FS01:0x22:    Threads: WINDOWS_THREADS
17:02:00:WU00:FS01:0x22: OS Version: 6.2
17:02:00:WU00:FS01:0x22:Has Battery: false
17:02:00:WU00:FS01:0x22: On Battery: false
17:02:00:WU00:FS01:0x22: UTC Offset: 2
17:02:00:WU00:FS01:0x22:        PID: 5748
17:02:00:WU00:FS01:0x22:        CWD: C:\ProgramData\FAHClient\work
17:02:00:WU00:FS01:0x22:************************************ OpenMM ************************************
17:02:00:WU00:FS01:0x22:   Revision: 189320d0
17:02:00:WU00:FS01:0x22:********************************************************************************
17:02:00:WU00:FS01:0x22:Project: 17346 (Run 1, Clone 547, Gen 19)
17:02:00:WU00:FS01:0x22:Unit: 0x00000000000000000000000000000000
17:02:00:WU00:FS01:0x22:Reading tar file core.xml
17:02:00:WU00:FS01:0x22:Reading tar file integrator.xml.bz2
17:02:00:WU00:FS01:0x22:Reading tar file state.xml.bz2
17:02:00:WU00:FS01:0x22:Reading tar file system.xml.bz2
17:02:00:WU00:FS01:0x22:Digital signatures verified
17:02:00:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
17:02:00:WU00:FS01:0x22:Version 0.0.13
17:02:00:WU00:FS01:0x22:  Checkpoint write interval: 15000 steps (2%) [50 total]
17:02:00:WU00:FS01:0x22:  JSON viewer frame write interval: 7500 steps (1%) [100 total]
17:02:00:WU00:FS01:0x22:  XTC frame write interval: 250000 steps (33%) [3 total]
17:02:00:WU00:FS01:0x22:  Global context and integrator variables write interval: disabled
17:02:00:WU00:FS01:0x22:There are 3 platforms available.
17:02:00:WU00:FS01:0x22:Platform 0: Reference
17:02:00:WU00:FS01:0x22:Platform 1: CPU
17:02:00:WU00:FS01:0x22:Platform 2: OpenCL
17:02:00:WU00:FS01:0x22:  opencl-device 0 specified
17:02:01:WU01:FS01:Upload 48.89%
17:02:07:WU01:FS01:Upload 56.81%
17:02:13:WU01:FS01:Upload 64.39%
17:02:15:WU00:FS01:0x22:Attempting to create OpenCL context:
17:02:15:WU00:FS01:0x22:  Configuring platform OpenCL
17:02:19:WU01:FS01:Upload 71.62%
17:02:25:WU01:FS01:Upload 77.82%
17:02:31:WU01:FS01:Upload 86.77%
17:02:37:WU01:FS01:Upload 96.41%
17:02:41:WU00:FS01:0x22:  Using OpenCL on platformId 0 and gpu 0
17:02:41:WU00:FS01:0x22:Completed 0 out of 750000 steps (0%)
17:02:41:WU01:FS01:Upload complete
17:02:41:WU01:FS01:Server responded WORK_ACK (400)
17:02:41:WU01:FS01:Cleaning up
17:02:42:WU00:FS01:0x22:Checkpoint completed at step 0
17:05:09:WU00:FS01:0x22:Completed 7500 out of 750000 steps (1%)
17:07:34:WU00:FS01:0x22:Completed 15000 out of 750000 steps (2%)
17:07:37:WU00:FS01:0x22:Checkpoint completed at step 15000
17:10:03:WU00:FS01:0x22:Completed 22500 out of 750000 steps (3%)
17:12:29:WU00:FS01:0x22:Completed 30000 out of 750000 steps (4%)
17:12:31:WU00:FS01:0x22:Checkpoint completed at step 30000
17:14:58:WU00:FS01:0x22:Completed 37500 out of 750000 steps (5%)
17:17:24:WU00:FS01:0x22:Completed 45000 out of 750000 steps (6%)
17:17:26:WU00:FS01:0x22:Checkpoint completed at step 45000
17:19:53:WU00:FS01:0x22:Completed 52500 out of 750000 steps (7%)
17:22:18:WU00:FS01:0x22:Completed 60000 out of 750000 steps (8%)
17:22:21:WU00:FS01:0x22:Checkpoint completed at step 60000
17:24:47:WU00:FS01:0x22:Completed 67500 out of 750000 steps (9%)
17:27:13:WU00:FS01:0x22:Completed 75000 out of 750000 steps (10%)
17:27:15:WU00:FS01:0x22:Checkpoint completed at step 75000
17:29:41:WU00:FS01:0x22:Completed 82500 out of 750000 steps (11%)
17:32:08:WU00:FS01:0x22:Completed 90000 out of 750000 steps (12%)
17:32:11:WU00:FS01:0x22:Checkpoint completed at step 90000
17:34:38:WU00:FS01:0x22:Completed 97500 out of 750000 steps (13%)
17:37:05:WU00:FS01:0x22:Completed 105000 out of 750000 steps (14%)
17:37:07:WU00:FS01:0x22:Checkpoint completed at step 105000
17:39:34:WU00:FS01:0x22:Completed 112500 out of 750000 steps (15%)
17:42:01:WU00:FS01:0x22:Completed 120000 out of 750000 steps (16%)
17:42:03:WU00:FS01:0x22:Checkpoint completed at step 120000
gunnarre
Posts: 567
Joined: Sun May 24, 2020 7:23 pm
Location: Norway

Re: Losing a WU unit

Post by gunnarre »

The WU is faulty. Your hardware is likely OK.

In this log, the core first hits a NaN at step 1350128, then fails at step 1550000 (velocities blowing up), and after restarting from the checkpoint at step 1500000 fails again at step 1500226 (particle coordinate is NaN). If a WU fails twice at exactly the same step, it is likely faulty, but from the log alone I wasn't sure, since it doesn't fail at the exact same step with the same type of failure. After looking up the WU at https://apps.foldingathome.org/wu#proje ... e=3&gen=26 and seeing that nobody was able to fold it successfully, we can be pretty sure that the WU is to blame. It has been tried on both AMD and Nvidia GPUs and came back faulty each time.
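If you want to run the same check on your own log, here is a rough Python sketch. It is not part of FAHClient. The function names (`summarize_failures`, `same_step_failures`) are made up for illustration, and the patterns only assume the line formats visible in the log posted above. It lists each exception with the last checkpoint seen before it, and tells you whether two failures landed on exactly the same step.

```python
import re

# Patterns assume the Core22 log line formats quoted in this thread.
EXCEPTION_RE = re.compile(r"An exception occurred at step (\d+): (.+)$")
CHECKPOINT_RE = re.compile(r"Checkpoint completed at step (\d+)")


def summarize_failures(log_lines):
    """Return (step, message, last_checkpoint_seen) for every exception line."""
    failures = []
    last_checkpoint = None  # None until a checkpoint line has been seen
    for line in log_lines:
        checkpoint = CHECKPOINT_RE.search(line)
        if checkpoint:
            last_checkpoint = int(checkpoint.group(1))
            continue
        exception = EXCEPTION_RE.search(line)
        if exception:
            step = int(exception.group(1))
            message = exception.group(2).strip()
            failures.append((step, message, last_checkpoint))
    return failures


def same_step_failures(failures):
    """True if at least two exceptions happened at exactly the same step."""
    steps = [step for step, _, _ in failures]
    return len(steps) != len(set(steps))


# Lines copied from the log in the first post.
sample = [
    "16:29:17:WU01:FS01:0x22:An exception occurred at step 1350128: Particle coordinate is nan",
    "16:57:37:WU01:FS01:0x22:Checkpoint completed at step 1500000",
    "17:00:27:WU01:FS01:0x22:An exception occurred at step 1550000: Discrepancy: Velocities are blowing up! 0 0 443381",
    "17:01:16:WU01:FS01:0x22:An exception occurred at step 1500226: Particle coordinate is nan",
]

failures = summarize_failures(sample)
for step, message, checkpoint in failures:
    print(f"step {step}: {message} (last checkpoint seen: {checkpoint})")
print("Repeated failure at the same step:", same_step_failures(failures))
```

For this log the steps all differ, so the script alone cannot tell a bad WU from flaky hardware. That is why looking up the WU history is still the deciding step.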
Online: GTX 1660 Super, GTX 1080, GTX 1050 Ti 4G OC, RX580 + occasional CPU folding in the cold.
Offline: Radeon HD 7770, GTX 960, GTX 950