Project 17118 on 4090

PaulTV
Posts: 187
Joined: Mon Jan 25, 2021 4:53 pm
Location: Netherlands

Project 17118 on 4090

Post by PaulTV »

Hello,

Last night my 4090 was assigned a work unit for project 17118, probably by mistake. It finished in only about 6 minutes and earned me just 18k points. This project is probably meant for smaller GPUs.

https://apps.foldingathome.org/wu#proje ... =338&gen=0

Cheers,
Paul

Ryzen 5800X / RTX 4090 / Windows 11
Ryzen 5600X / RTX 3070 Ti / Ubuntu 20.04
Ryzen 5600 / RTX 3060 Ti / Windows 11
pyrocyborg
Posts: 36
Joined: Wed Sep 28, 2022 1:45 am
Hardware configuration: 3060 12GB, 3060 Ti, 3070, 3080, 3090, 6700 XT, 6800 XT, 6900 XT

Re: Project 17118 on 4090

Post by pyrocyborg »

This is a very fast benchmark, I think. Happens sometimes.
Joe_H
Site Admin
Posts: 7870
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Project 17118 on 4090

Post by Joe_H »

pyrocyborg wrote: Mon Oct 30, 2023 2:43 pm This is a very fast benchmark, I think. Happens sometimes.
Exactly. There are a few benchmarking projects that check performance on various GPUs. In this case the description was copied and pasted from a previous Core_22 project, but this project is testing Core_23.

Project description - https://stats.foldingathome.org/project/17118

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
PaulTV
Posts: 187
Joined: Mon Jan 25, 2021 4:53 pm
Location: Netherlands

Re: Project 17118 on 4090

Post by PaulTV »

Thanks! I should have realized it's a performance project. I guess I was still half asleep this morning when posting...
wdanwatts
Posts: 65
Joined: Wed Oct 22, 2008 4:46 pm

Re: Project 17118 on 4090

Post by wdanwatts »

How do I get my machine to stop jamming up on this project?
wdanwatts
Posts: 65
Joined: Wed Oct 22, 2008 4:46 pm

Re: Project 17118 on 4090

Post by wdanwatts »

This system keeps cycling

Code: Select all

23:51:31:WU00:FS00:Starting
23:51:31:WU00:FS00:Removing old file 'work/00/logfile_01-20231031-231156.txt'
23:51:31:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah/FahCore_23 -dir 00 -suffix 01 -version 706 -lifeline 1446 -checkpoint 30 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
23:51:31:WU00:FS00:Started FahCore on PID 4787
23:51:31:WU00:FS00:Core PID:4791
23:51:31:WU00:FS00:FahCore 0x23 started
23:51:32:WU00:FS00:0x23:*********************** Log Started 2023-10-31T23:51:31Z ***********************
23:51:32:WU00:FS00:0x23:*************************** Core23 Folding@home Core ***************************
23:51:32:WU00:FS00:0x23:       Core: Core23
23:51:32:WU00:FS00:0x23:       Type: 0x23
23:51:32:WU00:FS00:0x23:    Version: 8.0.3
23:51:32:WU00:FS00:0x23:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
23:51:32:WU00:FS00:0x23:  Copyright: 2022 foldingathome.org
23:51:32:WU00:FS00:0x23:   Homepage: https://foldingathome.org/
23:51:32:WU00:FS00:0x23:       Date: Aug 3 2023
23:51:32:WU00:FS00:0x23:       Time: 08:28:22
23:51:32:WU00:FS00:0x23:   Revision: 199cb870317d05441d0a301287d9ef61254fa32b
23:51:32:WU00:FS00:0x23:     Branch: HEAD
23:51:32:WU00:FS00:0x23:   Compiler: GNU 7.5.0
23:51:32:WU00:FS00:0x23:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
23:51:32:WU00:FS00:0x23:             -fdata-sections -O3 -funroll-loops -fno-pie
23:51:32:WU00:FS00:0x23:             -DOPENMM_VERSION="\"8.0.0\""
23:51:32:WU00:FS00:0x23:   Platform: linux 5.15.0-1041-azure
23:51:32:WU00:FS00:0x23:       Bits: 64
23:51:32:WU00:FS00:0x23:       Mode: Release
23:51:32:WU00:FS00:0x23:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
23:51:32:WU00:FS00:0x23:             <peastman@stanford.edu>
23:51:32:WU00:FS00:0x23:       Args: -dir 00 -suffix 01 -version 706 -lifeline 4787 -checkpoint 30
23:51:32:WU00:FS00:0x23:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
23:51:32:WU00:FS00:0x23:             0 -gpu 0
23:51:32:WU00:FS00:0x23:************************************ libFAH ************************************
23:51:32:WU00:FS00:0x23:       Date: Aug 3 2023
23:51:32:WU00:FS00:0x23:       Time: 08:27:48
23:51:32:WU00:FS00:0x23:   Revision: 112c2234abe20611a05652defc3c7f854cbf927f
23:51:32:WU00:FS00:0x23:     Branch: HEAD
23:51:32:WU00:FS00:0x23:   Compiler: GNU 7.5.0
23:51:32:WU00:FS00:0x23:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
23:51:32:WU00:FS00:0x23:             -fdata-sections -O3 -funroll-loops -fno-pie
23:51:32:WU00:FS00:0x23:   Platform: linux 5.15.0-1041-azure
23:51:32:WU00:FS00:0x23:       Bits: 64
23:51:32:WU00:FS00:0x23:       Mode: Release
23:51:32:WU00:FS00:0x23:************************************ CBang *************************************
23:51:32:WU00:FS00:0x23:    Version: 1.7.2
23:51:32:WU00:FS00:0x23:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
23:51:32:WU00:FS00:0x23:        Org: Cauldron Development LLC
23:51:32:WU00:FS00:0x23:  Copyright: Cauldron Development LLC, 2003-2023
23:51:32:WU00:FS00:0x23:   Homepage: https://cauldrondevelopment.com/
23:51:32:WU00:FS00:0x23:    License: GPL 2+
23:51:32:WU00:FS00:0x23:       Date: Aug 3 2023
23:51:32:WU00:FS00:0x23:       Time: 08:27:30
23:51:32:WU00:FS00:0x23:   Revision: eae4b58965bdd4d54ea9eb77972674352b37a547
23:51:32:WU00:FS00:0x23:     Branch: HEAD
23:51:32:WU00:FS00:0x23:   Compiler: GNU 7.5.0
23:51:32:WU00:FS00:0x23:    Options: -faligned-new -std=c++11 -fsigned-char -ffunction-sections
23:51:32:WU00:FS00:0x23:             -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
23:51:32:WU00:FS00:0x23:   Platform: linux 5.15.0-1041-azure
23:51:32:WU00:FS00:0x23:       Bits: 64
23:51:32:WU00:FS00:0x23:       Mode: Release
23:51:32:WU00:FS00:0x23:************************************ System ************************************
23:51:32:WU00:FS00:0x23:        CPU: AMD Phenom(tm) II X2 545 Processor
23:51:32:WU00:FS00:0x23:     CPU ID: AuthenticAMD Family 16 Model 4 Stepping 2
23:51:32:WU00:FS00:0x23:       CPUs: 2
23:51:32:WU00:FS00:0x23:     Memory: 3.81GiB
23:51:32:WU00:FS00:0x23:Free Memory: 824.84MiB
23:51:32:WU00:FS00:0x23:    Threads: POSIX_THREADS
23:51:32:WU00:FS00:0x23: OS Version: 6.5
23:51:32:WU00:FS00:0x23:Has Battery: false
23:51:32:WU00:FS00:0x23: On Battery: false
23:51:32:WU00:FS00:0x23: UTC Offset: -5
23:51:32:WU00:FS00:0x23:        PID: 4791
23:51:32:WU00:FS00:0x23:        CWD: /var/lib/fahclient/work
23:51:32:WU00:FS00:0x23:       Exec: /var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah/FahCore_23
23:51:32:WU00:FS00:0x23:************************************ OpenMM ************************************
23:51:32:WU00:FS00:0x23:    Version: 8.0.0
23:51:32:WU00:FS00:0x23:********************************************************************************
23:51:32:WU00:FS00:0x23:Project: 17118 (Run 2, Clone 406, Gen 0)
23:51:32:WU00:FS00:0x23:Digital signatures verified
23:51:32:WU00:FS00:0x23:Folding@home GPU Core23 Folding@home Core
23:51:32:WU00:FS00:0x23:Version 8.0.3
23:51:32:WU00:FS00:0x23:  Checkpoint write interval: 433117 steps (50%) [2 total]
23:51:32:WU00:FS00:0x23:  JSON viewer frame write interval: 8662 steps (1%) [100 total]
23:51:32:WU00:FS00:0x23:  XTC frame write interval: 866234 steps (1e+02%) [1 total]
23:51:32:WU00:FS00:0x23:  Global context and integrator variables write interval: disabled
23:51:32:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
BobWilliams757
Posts: 497
Joined: Fri Apr 03, 2020 2:22 pm
Hardware configuration: ASRock X370M PRO4
Ryzen 2400G APU
16 GB DDR4-3200
MSI GTX 1660 Super Gaming X

Re: Project 17118 on 4090

Post by BobWilliams757 »

wdanwatts wrote: Tue Oct 31, 2023 11:55 pm How do I get my machine to stop jamming up on this project?
It might be worth showing more of your log. I can see the last line you posted and assume it crashed after that, but the reason and the specifics might be further down in the log. If it doesn't go beyond that point and just restarts, at least we will know that.


Also....

Does it reboot the machine, or just kill the work unit?

What OS, and basic (at least) system configuration?

Is the machine usually stable for folding?

Has it been multiple work units, every unit from that project, etc?


And anything else that might be of importance. Some of these benchmark work units are experimental to some extent, and knowing what machines they cause issues on might help them straighten things out.
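If it helps when gathering that information, here is a minimal Python sketch that counts how often the core returned each status for a given work unit and slot. The line pattern is an assumption based only on the log excerpts posted in this thread, and `count_core_returns` is a hypothetical helper name, not part of the FAH client.

```python
import re
from collections import Counter

# Pattern assumed from the log excerpts in this thread, e.g.
# "23:51:32:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)"
RETURN_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):(WU\d+):(FS\d+):"
    r"FahCore returned: (\w+) \((\d+) = 0x[0-9a-fA-F]+\)"
)

def count_core_returns(lines):
    """Count FahCore return statuses per (work unit, slot, status)."""
    counts = Counter()
    for line in lines:
        match = RETURN_RE.match(line.strip())
        if match:
            _time, wu, slot, status, _code = match.groups()
            counts[(wu, slot, status)] += 1
    return counts

# Sample lines copied from the logs posted in this thread.
sample = [
    "23:51:31:WU00:FS00:Starting",
    "23:51:32:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)",
    "05:53:02:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)",
    "05:53:04:WARNING:WU00:FS00:Past final deadline 2023-11-01T05:53:03Z, dumping",
]

counts = count_core_returns(sample)
for (wu, slot, status), n in sorted(counts.items()):
    print(f"{wu} {slot} {status}: {n}")
```

To use it on a real log, pass an open file instead of `sample`. A high count for one status on the same work unit would suggest the unit is cycling rather than failing once.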
Fold them if you get them!
wdanwatts
Posts: 65
Joined: Wed Oct 22, 2008 4:46 pm

Re: Project 17118 on 4090

Post by wdanwatts »

It got better (after a day or two). I'm running Fedora Linux 38 (Workstation Edition) on an AMD Phenom™ II X2 545 × 2 with an NVIDIA GeForce GTX 1660 SUPER GPU. It had been running ~ 1 million 'points' per day.
This is what the end of the problem looked like:

Code: Select all

... 05:53:01:WU00:FS00:0x23:************************************ OpenMM ************************************
05:53:01:WU00:FS00:0x23:    Version: 8.0.0
05:53:01:WU00:FS00:0x23:********************************************************************************
05:53:01:WU00:FS00:0x23:Project: 17118 (Run 2, Clone 406, Gen 0)
05:53:01:WU00:FS00:0x23:Digital signatures verified
05:53:01:WU00:FS00:0x23:Folding@home GPU Core23 Folding@home Core
05:53:01:WU00:FS00:0x23:Version 8.0.3
05:53:01:WU00:FS00:0x23:  Checkpoint write interval: 433117 steps (50%) [2 total]
05:53:01:WU00:FS00:0x23:  JSON viewer frame write interval: 8662 steps (1%) [100 total]
05:53:01:WU00:FS00:0x23:  XTC frame write interval: 866234 steps (1e+02%) [1 total]
05:53:01:WU00:FS00:0x23:  Global context and integrator variables write interval: disabled
05:53:02:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
05:53:04:WARNING:WU00:FS00:Past final deadline 2023-11-01T05:53:03Z, dumping
05:53:04:WU00:FS00:Cleaning up
05:53:04:WU00:FS00:Connecting to assign1.foldingathome.org:80
05:53:04:WU00:FS00:Assigned to work server 129.32.209.200
05:53:04:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:TU116 [GeForce GTX 1660 SUPER] from 129.32.209.200
05:53:04:WU00:FS00:Connecting to 129.32.209.200:8080
Now it is back to running jobs again. It would be useful to know how to short-circuit this problem (if it happens again) so I don't have to wait until the job times out.
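For a sense of how long that wait can be, here is a small Python sketch that computes the time between two UTC stamps as they appear in the logs above: the "Log Started" stamp from the first excerpt and the final deadline from the dump warning. The helper names are hypothetical, and the stamp format is assumed from those two log lines only.

```python
from datetime import datetime, timezone

# Stamp format assumed from "Log Started 2023-10-31T23:51:31Z" and
# "Past final deadline 2023-11-01T05:53:03Z" in the posted logs.
STAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

def parse_utc(stamp):
    """Parse a 'YYYY-MM-DDTHH:MM:SSZ' stamp into an aware UTC datetime."""
    return datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)

def time_until_deadline(now_stamp, deadline_stamp):
    """Return the timedelta from now_stamp to deadline_stamp."""
    return parse_utc(deadline_stamp) - parse_utc(now_stamp)

log_started = "2023-10-31T23:51:31Z"      # from the first log excerpt
final_deadline = "2023-11-01T05:53:03Z"   # from the dump warning

remaining = time_until_deadline(log_started, final_deadline)
print(remaining)  # 6:01:32
```

So at the point of the first excerpt, the unit still had about six hours to go before the client dumped it on its own.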
BobWilliams757
Posts: 497
Joined: Fri Apr 03, 2020 2:22 pm
Hardware configuration: ASRock X370M PRO4
Ryzen 2400G APU
16 GB DDR4-3200
MSI GTX 1660 Super Gaming X

Re: Project 17118 on 4090

Post by BobWilliams757 »

Interesting that it just hung like that until it timed out. In any case, if you get continued problems you should report them, since it's a benchmarking project and the researchers are looking for any possible issues.
toTOW
Site Moderator
Posts: 6309
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Project 17118 on 4090

Post by toTOW »

wdanwatts wrote: Tue Oct 31, 2023 11:56 pm This system keeps cycling

23:51:32:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
Try running this command in a terminal; it should print the actual issue that is causing the INTERRUPTED error:

Code: Select all

/var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah/FahCore_23 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
wdanwatts
Posts: 65
Joined: Wed Oct 22, 2008 4:46 pm

Re: Project 17118 on 4090

Post by wdanwatts »

On my Fedora 39 system I get:

Code: Select all

./FahCore_23: error while loading shared libraries: libOpenMM.so.8.0: cannot open shared object file: No such file or directory
How do I get libOpenMM.so.8.0 ?
bikeaddict
Posts: 193
Joined: Sun May 03, 2020 1:20 am

Re: Project 17118 on 4090

Post by bikeaddict »

wdanwatts wrote: Wed Nov 29, 2023 12:51 pm How do I get libOpenMM.so.8.0 ?
This library and many more libOpenMM files should be downloaded by the client as part of the core files. It's here on my Fedora system:

Code: Select all

/var/lib/fahclient/cores/cores.foldingathome.org/openmm-core-23/centos-7.9.2009-64bit/release/0x23-8.0.3/Core_23.fah/libOpenMM.so.8.0
You can try removing that directory and then restarting the client to make it download the core again. Also make sure there are no directory permission problems preventing the files from being written, and check for errors in /var/lib/fahclient/log.txt.
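As a rough way to check both possibilities, here is a Python sketch. It tests whether the library file exists in a core directory and builds an environment with that directory added to `LD_LIBRARY_PATH`, the standard Linux loader search variable. One guess, not confirmed anywhere in this thread, is that running FahCore_23 by hand fails because the loader does not search the core's own directory even when the file is there. The function names are hypothetical, and the demo uses a temporary directory so it runs anywhere.

```python
import os
import tempfile
from pathlib import Path

# Library name taken from the error message quoted in this thread.
LIB_NAME = "libOpenMM.so.8.0"

def library_present(core_dir, name=LIB_NAME):
    """Return True if the named library file exists in the core directory."""
    return (Path(core_dir) / name).is_file()

def env_with_core_libs(core_dir, base_env=None):
    """Copy an environment and prepend core_dir to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    parts = [str(core_dir)] + ([existing] if existing else [])
    env["LD_LIBRARY_PATH"] = os.pathsep.join(parts)
    return env

# Demonstration on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as demo_dir:
    print(library_present(demo_dir))      # library not there yet
    (Path(demo_dir) / LIB_NAME).touch()
    print(library_present(demo_dir))      # library now in place
    demo_env = env_with_core_libs(demo_dir, base_env={})
    print(demo_env["LD_LIBRARY_PATH"] == demo_dir)
```

On a real system you would point `core_dir` at the Core_23.fah directory shown above. If the file is present, passing the returned environment as `env=` to `subprocess.run` along with the FahCore_23 command from earlier in the thread may let the core start far enough to print its real error.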
wdanwatts
Posts: 65
Joined: Wed Oct 22, 2008 4:46 pm

Re: Project 17118 on 4090

Post by wdanwatts »

A system reboot started a string of Core 22 jobs so my lack of a Core 23 library is not critical at this time.