
EDIT: Too late. The next attempt failed while I was writing this and put me into a 24-hour sleep, and on restart I got it AGAIN. Absolutely ridiculous.
EDIT 2: Finally got something else after winning my battle with the server, even though it sent me the same one a further 3 times. I noticed that after the 3rd delete there was no work available on the first attempt, so is it likely that this one's been hanging around for a while? Anyway, I think it's very unfair to put a machine to sleep if it fails the same RCG over and over again. I understand that the Pande Group doesn't want an errant machine burning hundreds of WUs, but wouldn't it be better if the sleep mode only applied if 2 different WUs EUE'd?
Code:
[22:04:16] Project: 5733 (Run 4, Clone 21, Gen 521)
[22:04:16]
[22:04:16] Assembly optimizations on if available.
[22:04:16] Entering M.D.
[22:04:22] Will resume from checkpoint file
[22:04:23] Working on Protein
[22:04:23] Client config found, loading data.
[22:04:23] Starting GUI Server
[22:04:30] Resuming from checkpoint
[22:04:30] Verified work/wudata_00.log
[22:04:30] Verified work/wudata_00.edr
[22:04:30] Verified work/wudata_00.xtc
[22:08:49] Completed 1%
[22:13:08] Completed 2%
[22:17:27] Completed 3%
[22:17:27] mdrun_gpu returned
[22:17:27] NANs detected on GPU
[22:17:27]
[22:17:27] Folding@home Core Shutdown: UNSTABLE_MACHINE
[22:17:31] CoreStatus = 7A (122)
[22:17:31] Sending work to server
[22:17:31] Project: 5733 (Run 4, Clone 21, Gen 521)
[22:17:31] - Error: Could not get length of results file work/wuresults_00.dat
[22:17:31] - Error: Could not read unit 00 file. Removing from queue.
[22:17:31] - Preparing to get new work unit...
[22:17:31] + Attempting to get work packet
[22:17:31] - Connecting to assignment server
[22:17:32] - Successful: assigned to (171.64.65.102).
[22:17:32] + News From Folding@Home: Welcome to Folding@Home
[22:17:32] Loaded queue successfully.
[22:17:35] + Closed connections
[22:17:40]
[22:17:40] + Processing work unit
[22:17:40] Core required: FahCore_11.exe
[22:17:40] Core found.
[22:17:40] Working on queue slot 01 [December 12 22:17:40 UTC]
[22:17:40] + Working ...
[22:17:40]
[22:17:40] *------------------------------*
[22:17:40] Folding@Home GPU Core - Beta
[22:17:40] Version 1.18 (Mon Oct 13 11:11:30 PDT 2008)
[22:17:40]
[22:17:40] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[22:17:40] Build host: amoeba
[22:17:40] Board Type: AMD
[22:17:40] Core :
[22:17:40] Preparing to commence simulation
[22:17:40] - Looking at optimizations...
[22:17:40] - Created dyn
[22:17:40] - Files status OK
[22:17:40] - Expanded 98537 -> 492188 (decompressed 499.4 percent)
[22:17:40] Called DecompressByteArray: compressed_data_size=98537 data_size=492188, decompressed_data_size=492188 diff=0
[22:17:40] - Digital signature verified
[22:17:40]
[22:17:40] Project: 5733 (Run 4, Clone 21, Gen 521)
[22:17:40]
[22:17:40] Assembly optimizations on if available.
[22:17:40] Entering M.D.
[22:17:47] Working on Protein
[22:17:47] Client config found, loading data.
[22:17:47] Starting GUI Server
[22:22:18] Completed 1%
[22:26:43] Completed 2%
[22:31:06] Completed 3%
[22:31:06] mdrun_gpu returned
[22:31:06] NANs detected on GPU
[22:31:06]
[22:31:06] Folding@home Core Shutdown: UNSTABLE_MACHINE
[22:31:09] CoreStatus = 7A (122)
[22:31:09] Sending work to server
[22:31:09] Project: 5733 (Run 4, Clone 21, Gen 521)
[22:31:09] - Error: Could not get length of results file work/wuresults_01.dat
[22:31:09] - Error: Could not read unit 01 file. Removing from queue.
[22:31:09] - Preparing to get new work unit...
[22:31:09] + Attempting to get work packet
[22:31:09] - Connecting to assignment server
[22:31:09] - Successful: assigned to (171.64.65.102).
[22:31:09] + News From Folding@Home: Welcome to Folding@Home
[22:31:09] Loaded queue successfully.
[22:31:11] + Closed connections
[22:31:16]
[22:31:16] + Processing work unit
[22:31:16] Core required: FahCore_11.exe
[22:31:16] Core found.
[22:31:16] Working on queue slot 02 [December 12 22:31:16 UTC]
[22:31:16] + Working ...
[22:31:17]
[22:31:17] *------------------------------*
[22:31:17] Folding@Home GPU Core - Beta
[22:31:17] Version 1.18 (Mon Oct 13 11:11:30 PDT 2008)
[22:31:17]
[22:31:17] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[22:31:17] Build host: amoeba
[22:31:17] Board Type: AMD
[22:31:17] Core :
[22:31:17] Preparing to commence simulation
[22:31:17] - Looking at optimizations...
[22:31:17] - Created dyn
[22:31:17] - Files status OK
[22:31:17] - Expanded 98537 -> 492188 (decompressed 499.4 percent)
[22:31:17] Called DecompressByteArray: compressed_data_size=98537 data_size=492188, decompressed_data_size=492188 diff=0
[22:31:17] - Digital signature verified
[22:31:17]
[22:31:17] Project: 5733 (Run 4, Clone 21, Gen 521)
[22:31:17]
[22:31:17] Assembly optimizations on if available.
[22:31:17] Entering M.D.
[22:31:23] Working on Protein
[22:31:23] Client config found, loading data.
[22:31:23] Starting GUI Server
[22:35:52] Completed 1%
[22:40:16] Completed 2%
[22:44:33] Completed 3%
[22:44:33] mdrun_gpu returned
[22:44:33] NANs detected on GPU
[22:44:33]
[22:44:33] Folding@home Core Shutdown: UNSTABLE_MACHINE
[22:44:37] CoreStatus = 7A (122)
[22:44:37] Sending work to server
[22:44:37] Project: 5733 (Run 4, Clone 21, Gen 521)
[22:44:37] - Error: Could not get length of results file work/wuresults_02.dat
[22:44:37] - Error: Could not read unit 02 file. Removing from queue.
[22:44:37] - Preparing to get new work unit...
[22:44:37] + Attempting to get work packet
[22:44:37] - Connecting to assignment server
[22:44:38] - Successful: assigned to (171.64.65.102).
[22:44:38] + News From Folding@Home: Welcome to Folding@Home
[22:44:38] Loaded queue successfully.
[22:44:40] + Closed connections
[22:44:45]
[22:44:45] + Processing work unit
[22:44:45] Core required: FahCore_11.exe
[22:44:45] Core found.
[22:44:45] Working on queue slot 03 [December 12 22:44:45 UTC]
[22:44:45] + Working ...
[22:44:45]
[22:44:45] *------------------------------*
[22:44:45] Folding@Home GPU Core - Beta
[22:44:45] Version 1.18 (Mon Oct 13 11:11:30 PDT 2008)
[22:44:45]
[22:44:45] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[22:44:45] Build host: amoeba
[22:44:45] Board Type: AMD
[22:44:45] Core :
[22:44:45] Preparing to commence simulation
[22:44:45] - Looking at optimizations...
[22:44:45] - Created dyn
[22:44:45] - Files status OK
[22:44:45] - Expanded 98537 -> 492188 (decompressed 499.4 percent)
[22:44:45] Called DecompressByteArray: compressed_data_size=98537 data_size=492188, decompressed_data_size=492188 diff=0
[22:44:45] - Digital signature verified
[22:44:45]
[22:44:45] Project: 5733 (Run 4, Clone 21, Gen 521)
[22:44:45]
[22:44:45] Assembly optimizations on if available.
[22:44:45] Entering M.D.
[22:44:52] Working on Protein
[22:44:52] Client config found, loading data.
[22:44:52] Starting GUI Server
[22:49:17] Completed 1%
[22:53:38] Completed 2%
[22:58:01] Completed 3%
[22:58:01] mdrun_gpu returned
[22:58:01] NANs detected on GPU
[22:58:01]
[22:58:01] Folding@home Core Shutdown: UNSTABLE_MACHINE
[22:58:06] CoreStatus = 7A (122)
[22:58:06] Sending work to server
[22:58:06] Project: 5733 (Run 4, Clone 21, Gen 521)
[22:58:06] - Error: Could not get length of results file work/wuresults_03.dat
[22:58:06] - Error: Could not read unit 03 file. Removing from queue.
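
To make the suggestion concrete, here is a rough sketch of the rule I have in mind: tally UNSTABLE_MACHINE shutdowns per Project/Run/Clone/Gen from a log like the one above, and only sleep once at least 2 different WUs have failed. This is not how the real client decides anything. The names `failed_units`, `should_sleep` and `min_distinct` are made up for illustration, and the parsing assumes the log lines look exactly like the ones in this post.

```python
import re
from collections import Counter

# Matches lines such as "Project: 5733 (Run 4, Clone 21, Gen 521)".
PRCG = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")


def failed_units(log_lines):
    """Count UNSTABLE_MACHINE shutdowns per (project, run, clone, gen)."""
    failures = Counter()
    current = None  # most recent WU named in the log
    for line in log_lines:
        match = PRCG.search(line)
        if match:
            current = tuple(int(x) for x in match.groups())
        elif "Core Shutdown: UNSTABLE_MACHINE" in line and current is not None:
            failures[current] += 1
    return failures


def should_sleep(failures, min_distinct=2):
    """Hypothetical rule: sleep only if at least min_distinct different WUs failed."""
    return len(failures) >= min_distinct


if __name__ == "__main__":
    sample = [
        "[22:04:16] Project: 5733 (Run 4, Clone 21, Gen 521)",
        "[22:17:27] Folding@home Core Shutdown: UNSTABLE_MACHINE",
        "[22:17:40] Project: 5733 (Run 4, Clone 21, Gen 521)",
        "[22:31:06] Folding@home Core Shutdown: UNSTABLE_MACHINE",
    ]
    tally = failed_units(sample)
    print(dict(tally))
    print("sleep?", should_sleep(tally))
```

With a rule like this, the same WU failing over and over would not count against the machine, while two different WUs failing would still trigger the sleep.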