Project: 5756 (Run 4, Clone 247, Gen 431)

geokilla
Posts: 64
Joined: Sun Mar 08, 2009 4:36 am
Hardware configuration: Intel Core i5-10600KF @ 4.9GHz @ 1.25V
MSI Z490 Gaming Edge Wi-Fi BIOS v17
XPG D50 32GB DDR4-3200 16-19-9-36 2T (Samsung M-Die)
XPG S11 Pro 1TB and Western Digital WD140EDFZ 14TB
EVGA RTX 3060 XC
Corsair RM650x
Phanteks P360A with Noctua Exhaust Fans
Location: Toronto, Canada

Post by geokilla »

Got the same WU twice in a row. Both attempts errored out at 7% with "NANs detected on GPU" and an UNSTABLE_MACHINE core shutdown.

[23:25:09] 
[23:25:09] *------------------------------*
[23:25:09] Folding@Home GPU Core - Beta
[23:25:09] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[23:25:09] 
[23:25:09] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[23:25:09] Build host: amoeba
[23:25:09] Board Type: Nvidia
[23:25:09] Core      : 
[23:25:09] Preparing to commence simulation
[23:25:09] - Looking at optimizations...
[23:25:09] - Created dyn
[23:25:09] - Files status OK
[23:25:09] - Expanded 98733 -> 492276 (decompressed 498.5 percent)
[23:25:09] Called DecompressByteArray: compressed_data_size=98733 data_size=492276, decompressed_data_size=492276 diff=0
[23:25:09] - Digital signature verified
[23:25:09] 
[23:25:09] Project: 5756 (Run 4, Clone 247, Gen 431)
[23:25:09] 
[23:25:09] Assembly optimizations on if available.
[23:25:09] Entering M.D.
[23:25:16] Working on Protein
[23:25:19] Client config found, loading data.
[23:25:19] Starting GUI Server
[23:28:36] Completed 1%
[23:32:03] Completed 2%
[23:35:37] Completed 3%
[23:39:13] Completed 4%
[23:42:47] Completed 5%
[23:46:24] Completed 6%
[23:49:27] Completed 7%
[23:49:29] mdrun_gpu returned 
[23:49:29] NANs detected on GPU
[23:49:29] 
[23:49:29] Folding@home Core Shutdown: UNSTABLE_MACHINE
[23:49:32] CoreStatus = 7A (122)
[23:49:32] Sending work to server
[23:49:32] Project: 5756 (Run 4, Clone 247, Gen 431)
[23:49:32] - Error: Could not get length of results file work/wuresults_09.dat
[23:49:32] - Error: Could not read unit 09 file. Removing from queue.
[23:49:32] Trying to send all finished work units
[23:49:32] + No unsent completed units remaining.
[23:49:32] - Preparing to get new work unit...
[23:49:32] + Attempting to get work packet
[23:49:32] - Will indicate memory of 2046 MB
[23:49:32] - Connecting to assignment server
[23:49:32] Connecting to http://assign-GPU.stanford.edu:8080/
[23:49:32] Posted data.
[23:49:32] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[23:49:32] + News From Folding@Home: Welcome to Folding@Home
[23:49:33] Loaded queue successfully.
[23:49:33] Connecting to http://171.67.108.11:8080/
[23:49:34] Posted data.
[23:49:34] Initial: 0000; - Receiving payload (expected size: 99245)
[23:49:34] Conversation time very short, giving reduced weight in bandwidth avg
[23:49:34] - Downloaded at ~193 kB/s
[23:49:34] - Averaged speed for that direction ~137 kB/s
[23:49:34] + Received work.
[23:49:34] Trying to send all finished work units
[23:49:34] + No unsent completed units remaining.
[23:49:34] + Closed connections
[23:49:39] 
[23:49:39] + Processing work unit
[23:49:39] Core required: FahCore_11.exe
[23:49:39] Core found.
[23:49:39] Working on queue slot 00 [July 23 23:49:39 UTC]
[23:49:39] + Working ...
[23:49:39] - Calling '.\FahCore_11.exe -dir work/ -suffix 00 -checkpoint 15 -verbose -lifeline 2592 -version 623'

[23:49:39] 
[23:49:39] *------------------------------*
[23:49:39] Folding@Home GPU Core - Beta
[23:49:39] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[23:49:39] 
[23:49:39] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[23:49:39] Build host: amoeba
[23:49:39] Board Type: Nvidia
[23:49:39] Core      : 
[23:49:39] Preparing to commence simulation
[23:49:39] - Looking at optimizations...
[23:49:39] - Created dyn
[23:49:39] - Files status OK
[23:49:39] - Expanded 98733 -> 492276 (decompressed 498.5 percent)
[23:49:39] Called DecompressByteArray: compressed_data_size=98733 data_size=492276, decompressed_data_size=492276 diff=0
[23:49:39] - Digital signature verified
[23:49:39] 
[23:49:39] Project: 5756 (Run 4, Clone 247, Gen 431)
[23:49:39] 
[23:49:39] Assembly optimizations on if available.
[23:49:39] Entering M.D.
[23:49:46] Working on Protein
[23:49:49] Client config found, loading data.
[23:49:49] Starting GUI Server
[23:53:24] Completed 1%
[23:57:00] Completed 2%
[00:00:36] Completed 3%
[00:04:05] Completed 4%
[00:07:41] Completed 5%
[00:11:18] Completed 6%
[00:14:51] Completed 7%
[00:14:53] mdrun_gpu returned 
[00:14:53] NANs detected on GPU
[00:14:53] 
[00:14:53] Folding@home Core Shutdown: UNSTABLE_MACHINE
[00:14:56] CoreStatus = 7A (122)
[00:14:56] Sending work to server
[00:14:56] Project: 5756 (Run 4, Clone 247, Gen 431)
[00:14:56] - Error: Could not get length of results file work/wuresults_00.dat
[00:14:56] - Error: Could not read unit 00 file. Removing from queue.
[00:14:56] Trying to send all finished work units
[00:14:56] + No unsent completed units remaining.
[00:14:56] - Preparing to get new work unit...
[00:14:56] + Attempting to get work packet
[00:14:56] - Will indicate memory of 2046 MB
[00:14:56] - Connecting to assignment server
[00:14:56] Connecting to http://assign-GPU.stanford.edu:8080/
[00:14:56] Posted data.
[00:14:56] Initial: 40AB; - Successful: assigned to (171.64.65.20).
[00:14:56] + News From Folding@Home: Welcome to Folding@Home
[00:14:57] Loaded queue successfully.
[00:14:57] Connecting to http://171.64.65.20:8080/
[00:14:58] Posted data.
[00:14:58] Initial: 0000; - Receiving payload (expected size: 69153)
[00:14:58] Conversation time very short, giving reduced weight in bandwidth avg
[00:14:58] - Downloaded at ~135 kB/s
[00:14:58] - Averaged speed for that direction ~137 kB/s
[00:14:58] + Received work.
[00:14:58] Trying to send all finished work units
[00:14:58] + No unsent completed units remaining.
[00:14:58] + Closed connections
[00:15:03] 
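
A repeat failure like the one in the log above can be picked out mechanically. What follows is only a minimal sketch in Python, assuming the log line shapes shown above (`Project: ... (Run, Clone, Gen)`, `Completed N%`, `Core Shutdown: STATUS`). The function names `unstable_units` and `repeated_failures` are hypothetical and not part of any Folding@home tooling.

```python
import re
from collections import Counter

# Patterns mirror the line shapes in the log above (assumption: other
# client or core versions may word these lines differently).
WU_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
PCT_RE = re.compile(r"Completed (\d+)%")
SHUTDOWN_RE = re.compile(r"Core Shutdown: (\w+)")


def unstable_units(log_text, status="UNSTABLE_MACHINE"):
    """Return (wu, last_percent) for every core shutdown with the given status.

    wu is a (project, run, clone, gen) tuple of ints.
    """
    failures = []
    current_wu = None
    last_pct = 0
    for line in log_text.splitlines():
        wu_match = WU_RE.search(line)
        if wu_match:
            wu = tuple(int(g) for g in wu_match.groups())
            if wu != current_wu:
                # A different work unit: forget the old progress figure.
                current_wu, last_pct = wu, 0
            continue
        pct_match = PCT_RE.search(line)
        if pct_match:
            last_pct = int(pct_match.group(1))
            continue
        shutdown_match = SHUTDOWN_RE.search(line)
        if shutdown_match:
            if shutdown_match.group(1) == status and current_wu is not None:
                failures.append((current_wu, last_pct))
            # Reset so a stale percentage is not carried into the next attempt.
            last_pct = 0
    return failures


def repeated_failures(failures, threshold=2):
    """Return {wu: count} for work units that failed at least `threshold` times."""
    counts = Counter(wu for wu, _ in failures)
    return {wu: n for wu, n in counts.items() if n >= threshold}
```

Given the text of the log above, `unstable_units` should report Project 5756 (Run 4, Clone 247, Gen 431) twice, each time at 7%, and `repeated_failures` should flag that work unit. The same unit failing at the same point on consecutive attempts is the pattern worth including in a report.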
noprob
Posts: 31
Joined: Sun Mar 09, 2008 2:48 am
Hardware configuration: borgs
Location: mountains of West Virginia U.S.of A.

Re: Project: 5756 (Run 4, Clone 247, Gen 431)

Post by noprob »

[18:10:41] Project: 5756 (Run 6, Clone 297, Gen 420)
[18:10:41] 
[18:10:41] Assembly optimizations on if available.
[18:10:41] Entering M.D.
[18:10:47] Tpr hash work/wudata_00.tpr:  81026764 978733818 153880425 2905441614 2920082832
[18:10:47] 
[18:10:47] Calling fah_main args: 14 usage=100
[18:10:47] 
[18:10:47] Working on Protein
[18:10:53] Client config found, loading data.
[18:10:53] Starting GUI Server
[18:10:53] mdrun_gpu returned 
[18:10:53] SHAKE violations on GPU
[18:10:53] 
[18:10:53] Folding@home Core Shutdown: UNSTABLE_MACHINE

Are there bad WUs?

This also happened on a few other WUs in this class using the same experimental core, with different error messages, and ended in "EUE limit exceeded. Pausing 24 hours."

Anyway, I had forgotten to adjust this experimental core as suggested. I rebooted with the suggested settings and have had no more issues with this type or class of WU (crossing fingers).

Specs are located in this post, near the bottom.