Project: 18201 (Run 24975, Clone 0, Gen 21)

Moderators: Site Moderators, FAHC Science Team

v00d00
Posts: 396
Joined: Sun Dec 02, 2007 4:53 am
Hardware configuration: FX8320e (6 cores enabled) @ stock,
- 16GB DDR3,
- Zotac GTX 1050Ti @ Stock.
- Gigabyte GTX 970 @ Stock
Debian 9.

Running GPU since it came out, CPU since client version 3.
Folding since Folding began (~2000) and ran Genome@Home for a while too.
Ran Seti@Home prior to that.
Location: UK

Project: 18201 (Run 24975, Clone 0, Gen 21)

Post by v00d00 »

An error on an 18201. Seems to be CUDA related.

Code:

19:06:02:WU01:FS00:0x22:Project: 18201 (Run 24975, Clone 0, Gen 21)
19:06:02:WU01:FS00:0x22:Unit: 0x00000000000000000000000000000000
19:06:02:WU01:FS00:0x22:Reading tar file core.xml
19:06:02:WU01:FS00:0x22:Reading tar file integrator.xml
19:06:02:WU01:FS00:0x22:Reading tar file state.xml
19:06:02:WU01:FS00:0x22:Reading tar file system.xml
19:06:03:WU01:FS00:0x22:Digital signatures verified
19:06:03:WU01:FS00:0x22:Folding@home GPU Core22 Folding@home Core
19:06:03:WU01:FS00:0x22:Version 0.0.18
19:06:03:WU01:FS00:0x22:  Checkpoint write interval: 25000 steps (2%) [50 total]
19:06:03:WU01:FS00:0x22:  JSON viewer frame write interval: 12500 steps (1%) [100 total]
19:06:03:WU01:FS00:0x22:  XTC frame write interval: 20000 steps (1.6%) [62 total]
19:06:03:WU01:FS00:0x22:  Global context and integrator variables write interval: disabled
19:06:03:WU01:FS00:0x22:There are 4 platforms available.
19:06:03:WU01:FS00:0x22:Platform 0: Reference
19:06:03:WU01:FS00:0x22:Platform 1: CPU
19:06:03:WU01:FS00:0x22:Platform 2: OpenCL
19:06:03:WU01:FS00:0x22:  opencl-device 0 specified
19:06:03:WU01:FS00:0x22:Platform 3: CUDA
19:06:03:WU01:FS00:0x22:  cuda-device 0 specified
19:06:07:WU00:FS00:Upload 17.18%
19:06:13:WU00:FS00:Upload 40.85%
19:06:18:WU01:FS00:0x22:Attempting to create CUDA context:
19:06:18:WU01:FS00:0x22:  Configuring platform CUDA
19:06:19:WU00:FS00:Upload 64.06%
19:06:25:WU00:FS00:Upload 86.34%
19:06:26:WU01:FS00:0x22:  Using CUDA and gpu 0
19:06:26:WU01:FS00:0x22:Completed 0 out of 1250000 steps (0%)
19:06:28:WU01:FS00:0x22:Checkpoint completed at step 0
19:06:29:WU00:FS00:Upload complete
19:06:29:WU00:FS00:Server responded WORK_ACK (400)
19:06:29:WU00:FS00:Final credit estimate, 402277.00 points
19:06:29:WU00:FS00:Cleaning up
19:07:39:WU01:FS00:0x22:Completed 12500 out of 1250000 steps (1%)
~
19:26:52:WU01:FS00:0x22:Completed 212500 out of 1250000 steps (17%)
19:27:39:WU01:FS00:0x22:An exception occurred at step 220887: Error invoking kernel: CUDA_ERROR_INVALID_PC (718)
19:27:39:WU01:FS00:0x22:ERROR:98: Attempting to restart from last good checkpoint by restarting core.
19:27:39:WU01:FS00:0x22:Folding@home Core Shutdown: CORE_RESTART
******************************* Date: 2021-11-29 *******************************
06:43:21:WARNING:WU01:FS00:FahCore returned an unknown error code which probably indicates that it crashed
06:43:21:WARNING:WU01:FS00:FahCore returned: UNKNOWN_ENUM (-1073740791 = 0xc0000409)
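
The signed number and the hex number in that last line are the same 32-bit value written two ways. A small Python sketch of the conversion, in case it helps when searching for the code (the helper names are made up for illustration and are not part of the client):

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit exit code."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


exit_code = -1073740791  # signed value printed in the log above
print(hex(to_unsigned32(exit_code)))  # 0xc0000409
```

0xC0000409 is the Windows NTSTATUS value STATUS_STACK_BUFFER_OVERRUN. As far as I know, the same code is reported when a process deliberately aborts itself (fail-fast), so on its own it does not say what actually went wrong inside the core.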
toTOW
Site Moderator
Posts: 6296
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Project: 18201 (Run 24975, Clone 0, Gen 21)

Post by toTOW »

I see the WU has been completed, but not by you ... so I guess it never finished on your machine?

This is the first time I have seen this error, so it might be a bit tricky to understand ...

Does the error coincide with any event in your system logs?

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.

Re: Project: 18201 (Run 24975, Clone 0, Gen 21)

Post by v00d00 »

It's more than likely related to the driver. I was using 441.x and noticed in the logs that CUDA wasn't initialising, so I had a look around the board and found that I needed to upgrade the driver to get CUDA working again, due to some update on the core side. So I put the latest driver on the system. It probably doesn't like it, although I've done many more work units since without seeing this error again. If it comes up again, I will try downgrading the driver to see if that improves things.

Re: Project: 18201 (Run 24975, Clone 0, Gen 21)

Post by toTOW »

The common factor in your two failures is that both occurred on big WUs ... and the latest version of core 22 pushes GPUs a little bit harder than usual.

I had to reduce the overclock on my 980 by 25 MHz to avoid instabilities (mostly NaNs).

Also, see this global announcement about the software requirements of core 22 v0.0.18: viewtopic.php?f=24&t=37391

Re: Project: 18201 (Run 24975, Clone 0, Gen 21)

Post by v00d00 »

That's the thread I read when I was troubleshooting the CUDA issue. Going up to 471.22 fixed it for now. The rest of the system is fine so far.

If it happens again, I will try downclocking the card a bit to see if that helps. It's at stock speeds at present and doesn't run above about 65C.

I just had a quick look through the logs, and it appears I have done several more of these work units without issue, so no idea.
Lazvon
Posts: 105
Joined: Wed Jan 05, 2022 1:06 am
Hardware configuration: 4080 / 12700F, 3090Ti/12900KS, 3090/12900K, 3090/10940X, 3080Ti/12700K, 3080Ti/9900X, 3080Ti/9900X

Re: Project: 18201 (Run 24975, Clone 0, Gen 21)

Post by Lazvon »

I am new here, but had to come and see if someone had some advice. Project 18201 keeps locking up my three machines. I have a 3090 and two 3080Ti machines, and over the last week or so every one of them has essentially been stuck at 0.0% on this project for hours (or, for the first one, 2+ days as I waited for it to progress).

I have to delete the slot, then run --dump ALL from the command line, and then hope I don’t get assigned another 18201 WU.

What am I doing wrong, and why does it happen only with this one project on all three machines?

Jay
Folding since Feb 2021. 1) 4090/12900KS, 2) 4080/12700F, 3) 4070Ti/9900X, 4) 3090/12900K, 5) 3090/10940X, 6) 3080Ti/12700K, 7) 3080Ti/9900X

aetch
Posts: 447
Joined: Thu Jun 25, 2020 3:04 pm
Location: Between chair and keyboard

Re: Project: 18201 (Run 24975, Clone 0, Gen 21)

Post by aetch »

Project 18201 is quite a busy project just now. In the 3-4 weeks since I last rebooted my folding machines, I have had 150+ work units for that project.
It's not really a project that can be avoided.
We'd like to help you configure/repair your machines, but we need information from you, namely a dump of the logs produced by the client.
This will give us the current configuration of your machines (operating system, CPU, graphics cards, GPU driver version, folding slot configuration, etc.) as well as, hopefully, any error messages thrown up by the client.

There are guides on how to locate your logs and post them here:
viewtopic.php?f=24&t=26036#p327412

We don't need to do all the machines at once; we can do one at a time.
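
Before posting, it can also help to pull out just the warning and error lines so they are easy to spot. A rough Python sketch, assuming the log line layout seen earlier in this thread; the function name and the keyword list are illustrative choices and not something the client provides:

```python
import re
from typing import List

# Keywords taken from the log excerpts in this thread; extend as needed.
# The lookbehind skips the harmless "error:NO_ERROR" field on unit lines.
PROBLEM = re.compile(r"WARNING|(?<!NO_)ERROR|exception occurred|CORE_RESTART")


def find_problem_lines(log_text: str) -> List[str]:
    """Return only the log lines that look like warnings or errors."""
    return [line for line in log_text.splitlines() if PROBLEM.search(line)]


sample = """\
19:26:52:WU01:FS00:0x22:Completed 212500 out of 1250000 steps (17%)
19:27:39:WU01:FS00:0x22:An exception occurred at step 220887: Error invoking kernel: CUDA_ERROR_INVALID_PC (718)
19:27:39:WU01:FS00:0x22:ERROR:98: Attempting to restart from last good checkpoint by restarting core.
19:27:39:WU01:FS00:0x22:Folding@home Core Shutdown: CORE_RESTART
"""

for line in find_problem_lines(sample):
    print(line)
```

This is only a filter for your own reading. Please still post the full log section, since the lines around an error usually matter too.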
Folding Rigs - None (25-Jun-2022)

Re: Project: 18201 (Run 24975, Clone 0, Gen 21)

Post by Lazvon »

So, things had been fine for a few weeks and I forgot about this. But now I come to one of the three machines and it has been sitting at 0.0% on 18201 for a couple of days, yet again. This project does NOT like my computers or network or whatever. I'll delete the slot, shut it down, DROP ALL, reboot, and assume that it will be okay for a while.

I don't know how much you need to see, so I have included what I think is 2-3 work units' worth of logs.

Code:

.......
11:39:32:WU01:FS01:0x22:Checkpoint completed at step 490000
11:40:26:WU01:FS01:0x22:Completed 495000 out of 500000 steps (99%)
11:40:27:WU00:FS01:Connecting to assign1.foldingathome.org:80
11:40:27:WU00:FS01:Assigned to work server 128.252.203.11
11:40:27:WU00:FS01:Requesting new work unit for slot 01: gpu:1:0 GA102 [GeForce RTX 3080 Ti] from 128.252.203.11
11:40:27:WU00:FS01:Connecting to 128.252.203.11:8080
11:40:37:WU00:FS01:Downloading 26.50MiB
11:40:40:WU00:FS01:Download complete
11:40:40:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:18201 run:20158 clone:1 gen:5 core:0x22 unit:0x00000001000000050000471900004ebe
11:41:20:WU01:FS01:0x22:Completed 500000 out of 500000 steps (100%)
11:41:20:WU01:FS01:0x22:Average performance: 80 ns/day
11:41:22:WU01:FS01:0x22:Checkpoint completed at step 500000
11:41:42:WU01:FS01:0x22:Saving result file ..\logfile_01.txt
11:41:42:WU01:FS01:0x22:Saving result file checkpointIntegrator.xml
11:41:42:WU01:FS01:0x22:Saving result file checkpointState.xml.bz2
11:41:42:WU01:FS01:0x22:Saving result file positions.xtc
11:41:42:WU01:FS01:0x22:Saving result file science.log
11:41:42:WU01:FS01:0x22:Folding@home Core Shutdown: FINISHED_UNIT
11:41:44:WU01:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
11:41:44:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:17257 run:2465 clone:2 gen:12 core:0x22 unit:0x000000020000000c00004369000009a1
11:41:44:WU01:FS01:Uploading 51.40MiB to 128.252.203.10
11:41:44:WU00:FS01:Starting
11:41:44:WU01:FS01:Connecting to 128.252.203.10:8080
11:41:44:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.18/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 706 -lifeline 5368 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
11:41:44:WU00:FS01:Started FahCore on PID 4152
11:41:44:WU00:FS01:Core PID:7172
11:41:44:WU00:FS01:FahCore 0x22 started
11:41:44:WU00:FS01:0x22:*********************** Log Started 2022-01-25T11:41:44Z ***********************
11:41:44:WU00:FS01:0x22:*************************** Core22 Folding@home Core ***************************
11:41:44:WU00:FS01:0x22:       Core: Core22
11:41:44:WU00:FS01:0x22:       Type: 0x22
11:41:44:WU00:FS01:0x22:    Version: 0.0.18
11:41:44:WU00:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
11:41:44:WU00:FS01:0x22:  Copyright: 2020 foldingathome.org
11:41:44:WU00:FS01:0x22:   Homepage: https://foldingathome.org/
11:41:44:WU00:FS01:0x22:       Date: Sep 28 2021
11:41:44:WU00:FS01:0x22:       Time: 05:55:05
11:41:44:WU00:FS01:0x22:   Revision: cfe3d7d990e8f456e371f8ce63b5fcc6daab2103
11:41:44:WU00:FS01:0x22:     Branch: HEAD
11:41:44:WU00:FS01:0x22:   Compiler: Visual C++
11:41:44:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
11:41:44:WU00:FS01:0x22:             -DOPENMM_VERSION="\"7.6.0\""
11:41:44:WU00:FS01:0x22:   Platform: win32 10
11:41:44:WU00:FS01:0x22:       Bits: 64
11:41:44:WU00:FS01:0x22:       Mode: Release
11:41:44:WU00:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
11:41:44:WU00:FS01:0x22:             <peastman@stanford.edu>
11:41:44:WU00:FS01:0x22:       Args: -dir 00 -suffix 01 -version 706 -lifeline 4152 -checkpoint 15
11:41:44:WU00:FS01:0x22:             -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor
11:41:44:WU00:FS01:0x22:             nvidia -gpu 0 -gpu-usage 100
11:41:44:WU00:FS01:0x22:************************************ libFAH ************************************
11:41:44:WU00:FS01:0x22:       Date: Sep 28 2021
11:41:44:WU00:FS01:0x22:       Time: 05:53:43
11:41:44:WU00:FS01:0x22:   Revision: 44301ed97b996b63fe736bb8073f22209cb2b603
11:41:44:WU00:FS01:0x22:     Branch: HEAD
11:41:44:WU00:FS01:0x22:   Compiler: Visual C++
11:41:44:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
11:41:44:WU00:FS01:0x22:   Platform: win32 10
11:41:44:WU00:FS01:0x22:       Bits: 64
11:41:44:WU00:FS01:0x22:       Mode: Release
11:41:44:WU00:FS01:0x22:************************************ CBang *************************************
11:41:44:WU00:FS01:0x22:       Date: Sep 28 2021
11:41:44:WU00:FS01:0x22:       Time: 05:52:38
11:41:44:WU00:FS01:0x22:   Revision: 33fcfc2b3ed2195a423606a264718e31e6b3903f
11:41:44:WU00:FS01:0x22:     Branch: HEAD
11:41:44:WU00:FS01:0x22:   Compiler: Visual C++
11:41:44:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
11:41:44:WU00:FS01:0x22:   Platform: win32 10
11:41:44:WU00:FS01:0x22:       Bits: 64
11:41:44:WU00:FS01:0x22:       Mode: Release
11:41:44:WU00:FS01:0x22:************************************ System ************************************
11:41:44:WU00:FS01:0x22:        CPU: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz
11:41:44:WU00:FS01:0x22:     CPU ID: GenuineIntel Family 6 Model 167 Stepping 1
11:41:44:WU00:FS01:0x22:       CPUs: 16
11:41:44:WU00:FS01:0x22:     Memory: 63.87GiB
11:41:44:WU00:FS01:0x22:Free Memory: 59.49GiB
11:41:44:WU00:FS01:0x22:    Threads: WINDOWS_THREADS
11:41:44:WU00:FS01:0x22: OS Version: 6.2
11:41:44:WU00:FS01:0x22:Has Battery: true
11:41:44:WU00:FS01:0x22: On Battery: false
11:41:44:WU00:FS01:0x22: UTC Offset: -5
11:41:44:WU00:FS01:0x22:        PID: 7172
11:41:44:WU00:FS01:0x22:        CWD: C:\ProgramData\FAHClient\work
11:41:44:WU00:FS01:0x22:************************************ OpenMM ************************************
11:41:44:WU00:FS01:0x22:    Version: 7.6.0
11:41:44:WU00:FS01:0x22:********************************************************************************
11:41:44:WU00:FS01:0x22:Project: 18201 (Run 20158, Clone 1, Gen 5)
11:41:44:WU00:FS01:0x22:Unit: 0x00000000000000000000000000000000
11:41:44:WU00:FS01:0x22:Reading tar file core.xml
11:41:44:WU00:FS01:0x22:Reading tar file integrator.xml
11:41:44:WU00:FS01:0x22:Reading tar file state.xml
11:41:44:WU00:FS01:0x22:Reading tar file system.xml
11:41:45:WU00:FS01:0x22:Digital signatures verified
11:41:45:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
11:41:45:WU00:FS01:0x22:Version 0.0.18
11:41:45:WU00:FS01:0x22:  Checkpoint write interval: 25000 steps (2%) [50 total]
11:41:45:WU00:FS01:0x22:  JSON viewer frame write interval: 12500 steps (1%) [100 total]
11:41:45:WU00:FS01:0x22:  XTC frame write interval: 20000 steps (1.6%) [62 total]
11:41:45:WU00:FS01:0x22:  Global context and integrator variables write interval: disabled
11:41:45:WU00:FS01:0x22:There are 4 platforms available.
11:41:45:WU00:FS01:0x22:Platform 0: Reference
11:41:45:WU00:FS01:0x22:Platform 1: CPU
11:41:45:WU00:FS01:0x22:Platform 2: OpenCL
11:41:45:WU00:FS01:0x22:  opencl-device 0 specified
11:41:45:WU00:FS01:0x22:Platform 3: CUDA
11:41:45:WU00:FS01:0x22:  cuda-device 0 specified
11:41:50:WU01:FS01:Upload 17.99%
11:41:55:WU00:FS01:0x22:Attempting to create CUDA context:
11:41:55:WU00:FS01:0x22:  Configuring platform CUDA
11:41:56:WU01:FS01:Upload 37.21%
11:41:58:WU00:FS01:0x22:  Using CUDA and gpu 0
11:41:58:WU00:FS01:0x22:Completed 0 out of 1250000 steps (0%)
11:41:59:WU00:FS01:0x22:Checkpoint completed at step 0
11:42:02:WU01:FS01:Upload 56.17%
11:42:08:WU01:FS01:Upload 75.14%
11:42:14:WU01:FS01:Upload 94.60%
11:42:16:WU01:FS01:Upload complete
11:42:16:WU01:FS01:Server responded WORK_ACK (400)
11:42:16:WU01:FS01:Final credit estimate, 596757.00 points
11:42:16:WU01:FS01:Cleaning up
11:42:53:WU00:FS01:0x22:Completed 12500 out of 1250000 steps (1%)
11:43:47:WU00:FS01:0x22:Completed 25000 out of 1250000 steps (2%)
11:43:48:WU00:FS01:0x22:Checkpoint completed at step 25000
.............
13:11:24:WU00:FS01:0x22:Completed 1225000 out of 1250000 steps (98%)
13:11:25:WU00:FS01:0x22:Checkpoint completed at step 1225000
13:12:20:WU00:FS01:0x22:Completed 1237500 out of 1250000 steps (99%)
13:12:20:WU01:FS01:Connecting to assign1.foldingathome.org:80
13:12:20:WU01:FS01:Assigned to work server 128.252.203.11
13:12:20:WU01:FS01:Requesting new work unit for slot 01: gpu:1:0 GA102 [GeForce RTX 3080 Ti] from 128.252.203.11
13:12:20:WU01:FS01:Connecting to 128.252.203.11:8080
13:12:30:WU01:FS01:Downloading 26.53MiB
13:12:36:WU01:FS01:Download 46.17%
13:12:39:WU01:FS01:Download complete
13:12:39:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:18201 run:20971 clone:1 gen:5 core:0x22 unit:0x000000010000000500004719000051eb
13:13:14:WU00:FS01:0x22:Completed 1250000 out of 1250000 steps (100%)
13:13:14:WU00:FS01:0x22:Average performance: 31.7064 ns/day
13:13:15:WU00:FS01:0x22:Checkpoint completed at step 1250000
13:13:18:WU00:FS01:0x22:Saving result file ..\logfile_01.txt
13:13:18:WU00:FS01:0x22:Saving result file checkpointIntegrator.xml
13:13:18:WU00:FS01:0x22:Saving result file checkpointState.xml
13:13:21:WU00:FS01:0x22:Saving result file positions.xtc
13:13:22:WU00:FS01:0x22:Saving result file science.log
13:13:22:WU00:FS01:0x22:Folding@home Core Shutdown: FINISHED_UNIT
13:13:22:WU00:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
13:13:22:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:18201 run:20158 clone:1 gen:5 core:0x22 unit:0x00000001000000050000471900004ebe
13:13:22:WU00:FS01:Uploading 27.50MiB to 128.252.203.11
13:13:22:WU00:FS01:Connecting to 128.252.203.11:8080
13:13:22:WU01:FS01:Starting
13:13:22:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.18/Core_22.fah/FahCore_22.exe -dir 01 -suffix 01 -version 706 -lifeline 5368 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
13:13:22:WU01:FS01:Started FahCore on PID 6844
13:13:22:WU01:FS01:Core PID:13576
13:13:22:WU01:FS01:FahCore 0x22 started
13:13:23:WU01:FS01:0x22:*********************** Log Started 2022-01-25T13:13:22Z ***********************
13:13:23:WU01:FS01:0x22:*************************** Core22 Folding@home Core ***************************
13:13:23:WU01:FS01:0x22:       Core: Core22
13:13:23:WU01:FS01:0x22:       Type: 0x22
13:13:23:WU01:FS01:0x22:    Version: 0.0.18
13:13:23:WU01:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
13:13:23:WU01:FS01:0x22:  Copyright: 2020 foldingathome.org
13:13:23:WU01:FS01:0x22:   Homepage: https://foldingathome.org/
13:13:23:WU01:FS01:0x22:       Date: Sep 28 2021
13:13:23:WU01:FS01:0x22:       Time: 05:55:05
13:13:23:WU01:FS01:0x22:   Revision: cfe3d7d990e8f456e371f8ce63b5fcc6daab2103
13:13:23:WU01:FS01:0x22:     Branch: HEAD
13:13:23:WU01:FS01:0x22:   Compiler: Visual C++
13:13:23:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
13:13:23:WU01:FS01:0x22:             -DOPENMM_VERSION="\"7.6.0\""
13:13:23:WU01:FS01:0x22:   Platform: win32 10
13:13:23:WU01:FS01:0x22:       Bits: 64
13:13:23:WU01:FS01:0x22:       Mode: Release
13:13:23:WU01:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
13:13:23:WU01:FS01:0x22:             <peastman@stanford.edu>
13:13:23:WU01:FS01:0x22:       Args: -dir 01 -suffix 01 -version 706 -lifeline 6844 -checkpoint 15
13:13:23:WU01:FS01:0x22:             -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor
13:13:23:WU01:FS01:0x22:             nvidia -gpu 0 -gpu-usage 100
13:13:23:WU01:FS01:0x22:************************************ libFAH ************************************
13:13:23:WU01:FS01:0x22:       Date: Sep 28 2021
13:13:23:WU01:FS01:0x22:       Time: 05:53:43
13:13:23:WU01:FS01:0x22:   Revision: 44301ed97b996b63fe736bb8073f22209cb2b603
13:13:23:WU01:FS01:0x22:     Branch: HEAD
13:13:23:WU01:FS01:0x22:   Compiler: Visual C++
13:13:23:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
13:13:23:WU01:FS01:0x22:   Platform: win32 10
13:13:23:WU01:FS01:0x22:       Bits: 64
13:13:23:WU01:FS01:0x22:       Mode: Release
13:13:23:WU01:FS01:0x22:************************************ CBang *************************************
13:13:23:WU01:FS01:0x22:       Date: Sep 28 2021
13:13:23:WU01:FS01:0x22:       Time: 05:52:38
13:13:23:WU01:FS01:0x22:   Revision: 33fcfc2b3ed2195a423606a264718e31e6b3903f
13:13:23:WU01:FS01:0x22:     Branch: HEAD
13:13:23:WU01:FS01:0x22:   Compiler: Visual C++
13:13:23:WU01:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
13:13:23:WU01:FS01:0x22:   Platform: win32 10
13:13:23:WU01:FS01:0x22:       Bits: 64
13:13:23:WU01:FS01:0x22:       Mode: Release
13:13:23:WU01:FS01:0x22:************************************ System ************************************
13:13:23:WU01:FS01:0x22:        CPU: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz
13:13:23:WU01:FS01:0x22:     CPU ID: GenuineIntel Family 6 Model 167 Stepping 1
13:13:23:WU01:FS01:0x22:       CPUs: 16
13:13:23:WU01:FS01:0x22:     Memory: 63.87GiB
13:13:23:WU01:FS01:0x22:Free Memory: 59.65GiB
13:13:23:WU01:FS01:0x22:    Threads: WINDOWS_THREADS
13:13:23:WU01:FS01:0x22: OS Version: 6.2
13:13:23:WU01:FS01:0x22:Has Battery: true
13:13:23:WU01:FS01:0x22: On Battery: false
13:13:23:WU01:FS01:0x22: UTC Offset: -5
13:13:23:WU01:FS01:0x22:        PID: 13576
13:13:23:WU01:FS01:0x22:        CWD: C:\ProgramData\FAHClient\work
13:13:23:WU01:FS01:0x22:************************************ OpenMM ************************************
13:13:23:WU01:FS01:0x22:    Version: 7.6.0
13:13:23:WU01:FS01:0x22:********************************************************************************
13:13:23:WU01:FS01:0x22:Project: 18201 (Run 20971, Clone 1, Gen 5)
13:13:23:WU01:FS01:0x22:Unit: 0x00000000000000000000000000000000
13:13:23:WU01:FS01:0x22:Reading tar file core.xml
13:13:23:WU01:FS01:0x22:Reading tar file integrator.xml
13:13:23:WU01:FS01:0x22:Reading tar file state.xml
13:13:28:WU00:FS01:Upload 30.91%
13:13:34:WU00:FS01:Upload 66.59%
13:13:41:WU00:FS01:Upload complete
13:13:41:WU00:FS01:Server responded WORK_ACK (400)
13:13:41:WU00:FS01:Final credit estimate, 515807.00 points
13:13:41:WU00:FS01:Cleaning up
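
For a sense of what normal progress looks like on this card, the time between two "Completed" lines gives a rough pace. A small Python sketch, assuming both lines come from the same work unit on the same day (it does not handle a log that crosses midnight); the function name is made up for illustration:

```python
import re
from datetime import datetime

PROGRESS = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):(WU\d+):FS\d+:0x22:Completed (\d+) out of (\d+) steps"
)


def seconds_per_percent(line_a: str, line_b: str) -> float:
    """Seconds needed per 1% of progress between two 'Completed' log lines."""
    a, b = PROGRESS.match(line_a), PROGRESS.match(line_b)
    if not a or not b or a.group(2) != b.group(2):
        raise ValueError("expected two progress lines from the same work unit")
    elapsed = (
        datetime.strptime(b.group(1), "%H:%M:%S")
        - datetime.strptime(a.group(1), "%H:%M:%S")
    ).total_seconds()
    percent = 100.0 * (int(b.group(3)) - int(a.group(3))) / int(b.group(4))
    if percent <= 0:
        raise ValueError("second line must show more progress than the first")
    return elapsed / percent


first = "11:42:53:WU00:FS01:0x22:Completed 12500 out of 1250000 steps (1%)"
second = "11:43:47:WU00:FS01:0x22:Completed 25000 out of 1250000 steps (2%)"
rate = seconds_per_percent(first, second)
print(f"{rate:.0f} s per 1%, about {rate * 100 / 60:.0f} min for the whole unit")
```

That estimate lines up with the excerpt: this unit started at 11:41:58 and reached 100% at 13:13:14, about 91 minutes. So a unit sitting at 0% for hours on the same card is stalled rather than slow.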
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: Project: 18201 (Run 24975, Clone 0, Gen 21)

Post by Neil-B »

Can you post the bit from the top of the log that includes the System Info and Configuration sections?
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070

(Green/Bold = Active)

Re: Project: 18201 (Run 24975, Clone 0, Gen 21)

Post by toTOW »

There's nothing wrong in the log you posted ...

Re: Project: 18201 (Run 24975, Clone 0, Gen 21)

Post by Lazvon »

toTOW wrote: Fri Jan 28, 2022 9:30 pm
There's nothing wrong in the log you posted ...

And yet, the machines just sit there at 0% when this happens... like others have complained about on this work unit. I will say, I've not had a lock-up in the last couple of weeks (knock on wood), but I've had other periods where I don't get lock-ups, and then multiple days where all the various machines get stuck at 0% on 18201 and need the slot deleted and the WUs dumped before trying again and hoping not to get 18201 WUs.

I'm on my Intel Mac Mini at the moment (with an idle 6900XT eGPU, since I can't use eGPUs without running Windows - hopefully that can come someday), so I can't grab the top part of the log as requested. I really don't think it'll make a difference. The drivers are also newer by now, since I keep them up to date, so perhaps it was all down to earlier CUDA version issues.

I now have two 3080Tis, two 3090s, and a 3090Ti in the mix... and donated $1000 (a month or so ago) to the charity supporting F@H directly for all the work they're doing. Why do I state this? Because this is important work to me and my family for lots of reasons. I just wish I could block Project 18201 so I never have to see it on my machines again.
Joe_H
Site Admin
Posts: 7856
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Project: 18201 (Run 24975, Clone 0, Gen 21)

Post by Joe_H »

Wish I could say there is a permanent fix, but the problems uploading to the server for this project appear to be intermittent and vary by location of the machine doing the upload. For example, I have never had a problem uploading WUs to that server. That is both on my current ISP and the previous ISP I used.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3

Re: Project: 18201 (Run 24975, Clone 0, Gen 21)

Post by Lazvon »

Thanks, Joe. I have a fair amount of advertising and anti-malware blocking set up too, so perhaps that is the issue. AT&T Gig Fiber for Internet, Pi-hole DNS with blocks, pointing to 8.8.8.8 and 1.1.1.1, and Ubiquiti malware/etc. intrusion prevention turned on.

Will try disabling those if it happens again. If that doesn’t help, I will try flipping to our backup Comcast cable modem to see if that fixes it without having to delete the slot and dump. I wish dump alone would work, but if I try that it just comes right back.

Do wish I could block a specific project number though.
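
For comparing the two connections, a plain TCP connect test may be enough as a first check. A minimal Python sketch, assuming the assignment and work server addresses and ports that appear in the log posted earlier in this thread; `can_connect` is a made-up helper, and a successful connect only shows the port is reachable, not that a download or upload would complete:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


if __name__ == "__main__":
    # Addresses copied from the log earlier in the thread.
    targets = [("assign1.foldingathome.org", 80), ("128.252.203.11", 8080)]
    for host, port in targets:
        state = "reachable" if can_connect(host, port) else "NOT reachable"
        print(f"{host}:{port} {state}")
```

Running it once with the blocking enabled and once with it disabled (or once on each ISP) should show whether the filtering changes anything at the connection level.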
jjmiller
Scientist
Posts: 81
Joined: Fri Apr 09, 2021 4:43 pm

Re: Project: 18201 (Run 24975, Clone 0, Gen 21)

Post by jjmiller »

Hi Lazvon-

Sorry to hear you're having troubles with 18201. Have you got a WU stuck now?

As Joe mentions, the issue with 18201 has been frustratingly difficult to pin down. Some WUs seem to hang for no apparent reason, even while the same system is folding and returning other 18201 WUs. I haven't been able to affect any of these issues from the server side, despite lots of engagement from our IT and SysAdmin folks.