
Discovered the client "sleeping" this morning. After a restart it tried a sixth time and failed before the first frame, then finally moved on to a different WU.
Folding without issue since.
Code:
[21:41:10] Project: 5506 (Run 6, Clone 324, Gen 171)
[21:41:10]
[21:41:10] Assembly optimizations on if available.
[21:41:10] Entering M.D.
[21:41:17] Working on p5506_supervillin_e1
[21:41:17] Client config found, loading data.
[21:41:17] mdrun_gpu returned
[21:41:17] NANs detected on GPU
[21:41:17]
[21:41:17] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:41:20] CoreStatus = 7A (122)
I had the EXACT same experience on one of my other clients. Also a P5800. I had the 1.19 core forced onto one of my GPUs when it downloaded a p5800; after that, the following p5506 WUs ran 10-15% slower.....
Now what's going on? My other question would be: how come a single "faulty" WU can be downloaded 5 times and shut down an otherwise stable rig for 24 hrs (lucky I was home over the weekend)?
Code:
[19:37:53] + Attempting to send results [November 10 19:37:53 UTC]
[19:38:10] + Results successfully sent
[19:38:10] Thank you for your contribution to Folding@Home.
[19:38:10] + Number of Units Completed: 412
[19:38:14] - Preparing to get new work unit...
[19:38:14] + Attempting to get work packet
[19:38:14] - Connecting to assignment server
[19:38:14] - Successful: assigned to (171.64.65.106).
[19:38:14] + News From Folding@Home: GPU folding beta
[19:38:14] Loaded queue successfully.
[19:38:16] + Closed connections
[19:38:16]
[19:38:16] + Processing work unit
[19:38:16] Core required: FahCore_11.exe
[19:38:16] Core found.
[19:38:16] Working on queue slot 05 [November 10 19:38:16 UTC]
[19:38:16] + Working ...
[19:38:16]
[19:38:16] *------------------------------*
[19:38:16] Folding@Home GPU Core - Beta
[19:38:16] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[19:38:16]
[19:38:16] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[19:38:16] Build host: amoeba
[19:38:16] Board Type: Nvidia
[19:38:16] Core :
[19:38:16] Preparing to commence simulation
[19:38:16] - Looking at optimizations...
[19:38:16] - Created dyn
[19:38:16] - Files status OK
[19:38:16] - Expanded 45479 -> 246249 (decompressed 541.4 percent)
[19:38:16] Called DecompressByteArray: compressed_data_size=45479 data_size=246249, decompressed_data_size=246249 diff=0
[19:38:16] - Digital signature verified
[19:38:16]
[19:38:16] Project: 5506 (Run 6, Clone 324, Gen 171)
[19:38:16]
[19:38:16] Assembly optimizations on if available.
[19:38:16] Entering M.D.
[19:38:22] Working on p5506_supervillin_e1
[19:38:23] Client config found, loading data.
[19:38:23] mdrun_gpu returned
[19:38:23] NANs detected on GPU
[19:38:23]
[19:38:23] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:38:26] CoreStatus = 7A (122)
[19:38:26] Sending work to server
[19:38:26] Project: 5506 (Run 6, Clone 324, Gen 171)
[19:38:26] - Read packet limit of 540015616... Set to 524286976.
[19:38:26] - Error: Could not get length of results file work/wuresults_05.dat
[19:38:26] - Error: Could not read unit 05 file. Removing from queue.
[19:38:26] - Preparing to get new work unit...
[19:38:26] + Attempting to get work packet
[19:38:26] - Connecting to assignment server
[19:38:27] - Successful: assigned to (171.64.65.106).
[19:38:27] + News From Folding@Home: GPU folding beta
[19:38:27] Loaded queue successfully.
[19:38:29] + Closed connections
[19:38:34]
[19:38:34] + Processing work unit
[19:38:34] Core required: FahCore_11.exe
[19:38:34] Core found.
[19:38:34] Working on queue slot 06 [November 10 19:38:34 UTC]
[19:38:34] + Working ...
[19:38:34]
[19:38:34] *------------------------------*
[19:38:34] Folding@Home GPU Core - Beta
[19:38:34] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[19:38:34]
[19:38:34] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[19:38:34] Build host: amoeba
[19:38:34] Board Type: Nvidia
[19:38:34] Core :
[19:38:34] Preparing to commence simulation
[19:38:34] - Looking at optimizations...
[19:38:34] - Created dyn
[19:38:34] - Files status OK
[19:38:34] - Expanded 45479 -> 246249 (decompressed 541.4 percent)
[19:38:34] Called DecompressByteArray: compressed_data_size=45479 data_size=246249, decompressed_data_size=246249 diff=0
[19:38:34] - Digital signature verified
[19:38:34]
[19:38:34] Project: 5506 (Run 6, Clone 324, Gen 171)
[19:38:34]
[19:38:34] Assembly optimizations on if available.
[19:38:34] Entering M.D.
[19:38:40] Working on p5506_supervillin_e1
[19:38:41] Client config found, loading data.
[19:38:41] mdrun_gpu returned
[19:38:41] NANs detected on GPU
[19:38:41]
[19:38:41] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:38:44] CoreStatus = 7A (122)
[19:38:44] Sending work to server
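For anyone who wants to check their own FAHlog for the same pattern (one WU being reassigned and failing over and over), here is a rough Python sketch. It assumes the log lines look like the excerpts in this thread, i.e. a "Project: N (Run R, Clone C, Gen G)" line followed later by "Core Shutdown: UNSTABLE_MACHINE". The function name `count_unstable_shutdowns` is made up for illustration and is not part of the client.

```python
import re
from collections import Counter

# Matches lines such as: "[19:38:16] Project: 5506 (Run 6, Clone 324, Gen 171)"
PRCG = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")


def count_unstable_shutdowns(log_text):
    """Count UNSTABLE_MACHINE shutdowns per (project, run, clone, gen).

    Assumption: the format is inferred from the log excerpts in this thread,
    not from any official description of the client log.
    """
    failures = Counter()
    current = None  # most recent work unit seen in the log
    for line in log_text.splitlines():
        match = PRCG.search(line)
        if match:
            current = tuple(int(g) for g in match.groups())
        elif "Core Shutdown: UNSTABLE_MACHINE" in line and current is not None:
            failures[current] += 1
            current = None  # do not count the same attempt twice
    return failures
```

Read your log file into a string, pass it in, and any work unit with a count above 1 was handed back to the same machine after failing. This only counts; it does not tell you whether the WU or the hardware is at fault.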
Code:
[04:48:32] Project: 5506 (Run 6, Clone 324, Gen 171)
[04:48:32]
[04:48:32] Assembly optimizations on if available.
[04:48:32] Entering M.D.
[04:48:38] Working on p5506_supervillin_e1
[04:48:39] Client config found, loading data.
[04:48:39] mdrun_gpu returned
[04:48:39] NANs detected on GPU
[04:48:39]
[04:48:39] Folding@home Core Shutdown: UNSTABLE_MACHINE
[04:48:42] CoreStatus = 7A (122)
Drugless wrote:
As a matter of interest (probably the wrong thread for this), how does this type of 'bad' WU affect the science of 5506? I have searched through the threads for an answer but it's just too much to dig through. I assume this 'bad' WU is recalled, examined, fixed and put out to be processed again. If not, surely it makes the entire 5506 a dud / incomplete result? If someone can guide me to a thread with the answer, please do.

I'm not sure if there is a single answer to your question.