Project: 4743 (Run 9, Clone 139, Gen 10)

Posted: Tue Apr 21, 2009 7:43 pm
by shiryunaga
NANs detected on GPU :x

[19:09:28] + Starting local stats count at 1
[19:09:32] Trying to send all finished work units
[19:09:32] + No unsent completed units remaining.
[19:09:32] - Preparing to get new work unit...
[19:09:32] + Attempting to get work packet
[19:09:32] - Will indicate memory of 510 MB
[19:09:32] - Detect CPU. Vendor: GenuineIntel, Family: 15, Model: 4, Stepping: 7
[19:09:32] - Connecting to assignment server
[19:09:32] Connecting to http://assign-GPU.stanford.edu:8080/
[19:09:38] Posted data.
[19:09:38] Initial: 40AB; - Successful: assigned to (171.64.65.103).
[19:09:38] + News From Folding@Home: GPU folding beta
[19:09:39] Loaded queue successfully.
[19:09:39] Connecting to http://171.64.65.103:8080/
[19:09:40] Posted data.
[19:09:40] Initial: 0000; - Receiving payload (expected size: 58026)
[19:09:45] - Downloaded at ~11 kB/s
[19:09:45] - Averaged speed for that direction ~6 kB/s
[19:09:45] + Received work.
[19:09:45] Trying to send all finished work units
[19:09:45] + No unsent completed units remaining.
[19:09:45] + Closed connections
[19:09:45] 
[19:09:45] + Processing work unit
[19:09:45] Core required: FahCore_11.exe
[19:09:45] Core found.
[19:09:45] Working on queue slot 03 [April 21 19:09:45 UTC]
[19:09:45] + Working ...
[19:09:45] - Calling '.\FahCore_11.exe -dir work/ -suffix 03 -priority 96 -nocpulock -checkpoint 30 -verbose -lifeline 2032 -version 623'

[19:09:45] 
[19:09:45] *------------------------------*
[19:09:45] Folding@Home GPU Core - Beta
[19:09:45] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[19:09:45] 
[19:09:45] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[19:09:45] Build host: amoeba
[19:09:45] Board Type: AMD
[19:09:45] Core      : 
[19:09:45] Preparing to commence simulation
[19:09:45] - Looking at optimizations...
[19:09:45] - Created dyn
[19:09:45] - Files status OK
[19:09:45] - Expanded 57514 -> 447304 (decompressed 777.7 percent)
[19:09:45] Called DecompressByteArray: compressed_data_size=57514 data_size=447304, decompressed_data_size=447304 diff=0
[19:09:45] - Digital signature verified
[19:09:45] 
[19:09:45] Project: 4743 (Run 9, Clone 139, Gen 10)
[19:09:45] 
[19:09:45] Assembly optimizations on if available.
[19:09:45] Entering M.D.
[19:09:51] Tpr hash work/wudata_03.tpr:  2860864008 3149223251 1509629602 2141700241 551907839
[19:09:52] Working on p4743_lam5w_300K
[19:09:53] Client config found, loading data.
[19:09:54] Starting GUI Server
[19:10:03] mdrun_gpu returned 
[19:10:03] NANs detected on GPU
[19:10:03] 
[19:10:03] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:10:05] CoreStatus = 7A (122)
[19:10:05] Sending work to server
[19:10:05] Project: 4743 (Run 9, Clone 139, Gen 10)
[19:10:05] - Read packet limit of 540015616... Set to 524286976.
[19:10:05] - Error: Could not get length of results file work/wuresults_03.dat
[19:10:05] - Error: Could not read unit 03 file. Removing from queue.
[19:10:05] Trying to send all finished work units
[19:10:05] + No unsent completed units remaining.
[19:10:05] - Preparing to get new work unit...
[19:10:05] + Attempting to get work packet
[19:10:05] - Will indicate memory of 510 MB
[19:10:05] - Connecting to assignment server
[19:10:05] Connecting to http://assign-GPU.stanford.edu:8080/
[19:10:18] Posted data.
[19:10:18] Initial: 40AB; - Successful: assigned to (171.64.65.103).
[19:10:18] + News From Folding@Home: GPU folding beta
[19:10:18] Loaded queue successfully.
[19:10:18] Connecting to http://171.64.65.103:8080/
[19:10:19] Posted data.
[19:10:19] Initial: 0000; - Receiving payload (expected size: 58026)
[19:10:22] - Downloaded at ~18 kB/s
[19:10:22] - Averaged speed for that direction ~9 kB/s
[19:10:22] + Received work.
[19:10:22] Trying to send all finished work units
[19:10:22] + No unsent completed units remaining.
[19:10:22] + Closed connections
[19:10:27] 
[19:10:27] + Processing work unit
[19:10:27] Core required: FahCore_11.exe
[19:10:27] Core found.
[19:10:27] Working on queue slot 04 [April 21 19:10:27 UTC]
[19:10:27] + Working ...
[19:10:27] - Calling '.\FahCore_11.exe -dir work/ -suffix 04 -priority 96 -nocpulock -checkpoint 30 -verbose -lifeline 2032 -version 623'

[19:10:27] 
[19:10:27] *------------------------------*
[19:10:27] Folding@Home GPU Core - Beta
[19:10:27] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[19:10:27] 
[19:10:27] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[19:10:27] Build host: amoeba
[19:10:27] Board Type: AMD
[19:10:27] Core      : 
[19:10:27] Preparing to commence simulation
[19:10:27] - Looking at optimizations...
[19:10:27] - Created dyn
[19:10:27] - Files status OK
[19:10:27] - Expanded 57514 -> 447304 (decompressed 777.7 percent)
[19:10:27] Called DecompressByteArray: compressed_data_size=57514 data_size=447304, decompressed_data_size=447304 diff=0
[19:10:27] - Digital signature verified
[19:10:27] 
[19:10:27] Project: 4743 (Run 9, Clone 139, Gen 10)
[19:10:27] 
[19:10:27] Assembly optimizations on if available.
[19:10:27] Entering M.D.
[19:10:33] Tpr hash work/wudata_04.tpr:  2860864008 3149223251 1509629602 2141700241 551907839
[19:10:33] Working on p4743_lam5w_300K
[19:10:35] Client config found, loading data.
[19:10:35] Starting GUI Server
[19:10:45] mdrun_gpu returned 
[19:10:45] NANs detected on GPU
[19:10:45] 
[19:10:45] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:10:47] CoreStatus = 7A (122)
[19:10:47] Sending work to server
[19:10:47] Project: 4743 (Run 9, Clone 139, Gen 10)
[19:10:47] - Read packet limit of 540015616... Set to 524286976.
[19:10:47] - Error: Could not get length of results file work/wuresults_04.dat
[19:10:47] - Error: Could not read unit 04 file. Removing from queue.
[19:10:47] Trying to send all finished work units
[19:10:47] + No unsent completed units remaining.
[19:10:47] - Preparing to get new work unit...
[19:10:47] + Attempting to get work packet
[19:10:47] - Will indicate memory of 510 MB
[19:10:47] - Connecting to assignment server
[19:10:47] Connecting to http://assign-GPU.stanford.edu:8080/
[19:10:48] Posted data.
[19:10:48] Initial: 40AB; - Successful: assigned to (171.64.65.103).
[19:10:48] + News From Folding@Home: GPU folding beta
[19:10:48] Loaded queue successfully.
[19:10:48] Connecting to http://171.64.65.103:8080/
[19:10:49] Posted data.
[19:10:49] Initial: 0000; - Receiving payload (expected size: 58026)
[19:10:53] - Downloaded at ~14 kB/s
[19:10:53] - Averaged speed for that direction ~10 kB/s
[19:10:53] + Received work.
[19:10:53] Trying to send all finished work units
[19:10:53] + No unsent completed units remaining.
[19:10:53] + Closed connections
[19:10:58] 
[19:10:58] + Processing work unit
[19:10:58] Core required: FahCore_11.exe
[19:10:58] Core found.
[19:10:58] Working on queue slot 05 [April 21 19:10:58 UTC]
[19:10:58] + Working ...
[19:10:58] - Calling '.\FahCore_11.exe -dir work/ -suffix 05 -priority 96 -nocpulock -checkpoint 30 -verbose -lifeline 2032 -version 623'

[19:10:58] 
[19:10:58] *------------------------------*
[19:10:58] Folding@Home GPU Core - Beta
[19:10:58] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[19:10:58] 
[19:10:58] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[19:10:58] Build host: amoeba
[19:10:58] Board Type: AMD
[19:10:58] Core      : 
[19:10:58] Preparing to commence simulation
[19:10:58] - Looking at optimizations...
[19:10:58] - Created dyn
[19:10:58] - Files status OK
[19:10:58] - Expanded 57514 -> 447304 (decompressed 777.7 percent)
[19:10:58] Called DecompressByteArray: compressed_data_size=57514 data_size=447304, decompressed_data_size=447304 diff=0
[19:10:58] - Digital signature verified
[19:10:58] 
[19:10:58] Project: 4743 (Run 9, Clone 139, Gen 10)
[19:10:58] 
[19:10:58] Assembly optimizations on if available.
[19:10:58] Entering M.D.
[19:11:04] Tpr hash work/wudata_05.tpr:  2860864008 3149223251 1509629602 2141700241 551907839
[19:11:04] Working on p4743_lam5w_300K
[19:11:06] Client config found, loading data.
[19:11:07] Starting GUI Server
[19:11:16] mdrun_gpu returned 
[19:11:16] NANs detected on GPU
[19:11:16] 
[19:11:16] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:11:18] CoreStatus = 7A (122)
[19:11:18] Sending work to server
[19:11:18] Project: 4743 (Run 9, Clone 139, Gen 10)
[19:11:18] - Read packet limit of 540015616... Set to 524286976.
[19:11:18] - Error: Could not get length of results file work/wuresults_05.dat
[19:11:18] - Error: Could not read unit 05 file. Removing from queue.
[19:11:18] Trying to send all finished work units
[19:11:18] + No unsent completed units remaining.
[19:11:18] - Preparing to get new work unit...
[19:11:18] + Attempting to get work packet
[19:11:18] - Will indicate memory of 510 MB
[19:11:18] - Connecting to assignment server
[19:11:18] Connecting to http://assign-GPU.stanford.edu:8080/
[19:11:20] Posted data.
[19:11:20] Initial: 40AB; - Successful: assigned to (171.64.65.103).
[19:11:20] + News From Folding@Home: GPU folding beta
[19:11:20] Loaded queue successfully.
[19:11:20] Connecting to http://171.64.65.103:8080/
[19:11:21] Posted data.
[19:11:21] Initial: 0000; - Receiving payload (expected size: 58026)
[19:11:23] - Downloaded at ~28 kB/s
[19:11:23] - Averaged speed for that direction ~13 kB/s
[19:11:23] + Received work.
[19:11:23] Trying to send all finished work units
[19:11:23] + No unsent completed units remaining.
[19:11:23] + Closed connections
[19:11:28] 
[19:11:28] + Processing work unit
[19:11:28] Core required: FahCore_11.exe
[19:11:28] Core found.
[19:11:28] Working on queue slot 06 [April 21 19:11:28 UTC]
[19:11:28] + Working ...
[19:11:28] - Calling '.\FahCore_11.exe -dir work/ -suffix 06 -priority 96 -nocpulock -checkpoint 30 -verbose -lifeline 2032 -version 623'

[19:11:28] 
[19:11:28] *------------------------------*
[19:11:28] Folding@Home GPU Core - Beta
[19:11:28] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[19:11:28] 
[19:11:28] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[19:11:28] Build host: amoeba
[19:11:28] Board Type: AMD
[19:11:28] Core      : 
[19:11:28] Preparing to commence simulation
[19:11:28] - Looking at optimizations...
[19:11:28] - Created dyn
[19:11:28] - Files status OK
[19:11:28] - Expanded 57514 -> 447304 (decompressed 777.7 percent)
[19:11:28] Called DecompressByteArray: compressed_data_size=57514 data_size=447304, decompressed_data_size=447304 diff=0
[19:11:28] - Digital signature verified
[19:11:28] 
[19:11:28] Project: 4743 (Run 9, Clone 139, Gen 10)
[19:11:28] 
[19:11:28] Assembly optimizations on if available.
[19:11:28] Entering M.D.
[19:11:34] Tpr hash work/wudata_06.tpr:  2860864008 3149223251 1509629602 2141700241 551907839
[19:11:35] Working on p4743_lam5w_300K
[19:11:36] Client config found, loading data.
[19:11:37] Starting GUI Server
[19:11:46] mdrun_gpu returned 
[19:11:46] NANs detected on GPU
[19:11:46] 
[19:11:46] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:11:49] CoreStatus = 7A (122)
[19:11:49] Sending work to server
[19:11:49] Project: 4743 (Run 9, Clone 139, Gen 10)
[19:11:49] - Read packet limit of 540015616... Set to 524286976.
[19:11:49] - Error: Could not get length of results file work/wuresults_06.dat
[19:11:49] - Error: Could not read unit 06 file. Removing from queue.
[19:11:49] Trying to send all finished work units
[19:11:49] + No unsent completed units remaining.
[19:11:49] - Preparing to get new work unit...
[19:11:49] + Attempting to get work packet
[19:11:49] - Will indicate memory of 510 MB
[19:11:49] - Connecting to assignment server
[19:11:49] Connecting to http://assign-GPU.stanford.edu:8080/
[19:11:51] Posted data.
[19:11:51] Initial: 40AB; - Successful: assigned to (171.64.65.103).
[19:11:51] + News From Folding@Home: GPU folding beta
[19:11:51] Loaded queue successfully.
[19:11:51] Connecting to http://171.64.65.103:8080/
[19:11:52] Posted data.
[19:11:52] Initial: 0000; - Receiving payload (expected size: 58026)
[19:11:55] - Downloaded at ~18 kB/s
[19:11:55] - Averaged speed for that direction ~14 kB/s
[19:11:55] + Received work.
[19:11:55] Trying to send all finished work units
[19:11:55] + No unsent completed units remaining.
[19:11:55] + Closed connections
[19:12:00] 
[19:12:00] + Processing work unit
[19:12:00] Core required: FahCore_11.exe
[19:12:00] Core found.
[19:12:00] Working on queue slot 07 [April 21 19:12:00 UTC]
[19:12:00] + Working ...
[19:12:00] - Calling '.\FahCore_11.exe -dir work/ -suffix 07 -priority 96 -nocpulock -checkpoint 30 -verbose -lifeline 2032 -version 623'

[19:12:00] 
[19:12:00] *------------------------------*
[19:12:00] Folding@Home GPU Core - Beta
[19:12:00] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[19:12:00] 
[19:12:00] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[19:12:00] Build host: amoeba
[19:12:00] Board Type: AMD
[19:12:00] Core      : 
[19:12:00] Preparing to commence simulation
[19:12:00] - Looking at optimizations...
[19:12:00] - Created dyn
[19:12:00] - Files status OK
[19:12:00] - Expanded 57514 -> 447304 (decompressed 777.7 percent)
[19:12:00] Called DecompressByteArray: compressed_data_size=57514 data_size=447304, decompressed_data_size=447304 diff=0
[19:12:00] - Digital signature verified
[19:12:00] 
[19:12:00] Project: 4743 (Run 9, Clone 139, Gen 10)
[19:12:00] 
[19:12:00] Assembly optimizations on if available.
[19:12:00] Entering M.D.
[19:12:06] Tpr hash work/wudata_07.tpr:  2860864008 3149223251 1509629602 2141700241 551907839
[19:12:06] Working on p4743_lam5w_300K
[19:12:08] Client config found, loading data.
[19:12:08] Starting GUI Server
[19:12:18] mdrun_gpu returned 
[19:12:18] NANs detected on GPU
[19:12:18] 
[19:12:18] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:12:22] CoreStatus = 7A (122)
[19:12:22] Sending work to server
[19:12:22] Project: 4743 (Run 9, Clone 139, Gen 10)
[19:12:22] - Read packet limit of 540015616... Set to 524286976.
[19:12:22] - Error: Could not get length of results file work/wuresults_07.dat
[19:12:22] - Error: Could not read unit 07 file. Removing from queue.
[19:12:22] EUE limit exceeded. Pausing 24 hours.

Re: Project: 4743 (Run 9, Clone 139, Gen 10)

Posted: Tue Apr 21, 2009 11:49 pm
by bruce
Other people have reported an immediate EUE on this WU and I've reported it as a bad WU.

Re: Project: 4743 (Run 9, Clone 139, Gen 10)

Posted: Tue Apr 28, 2009 12:56 am
by planetclown
Today I got EUE limit exceeded again with this PRCG.

[15:56:43] + Processing work unit
[15:56:43] Core required: FahCore_11.exe
[15:56:43] Core found.
[15:56:43] Working on queue slot 01 [April 27 15:56:43 UTC]
[15:56:43] + Working ...
[15:56:43] 
[15:56:43] *------------------------------*
[15:56:43] Folding@Home GPU Core - Beta
[15:56:43] Version 1.22 (Mon Dec 8 12:57:56 PST 2008)
[15:56:43] 
[15:56:43] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[15:56:43] Build host: amoeba
[15:56:43] Board Type: AMD
[15:56:43] Core      : 
[15:56:43] Preparing to commence simulation
[15:56:43] - Looking at optimizations...
[15:56:43] - Created dyn
[15:56:43] - Files status OK
[15:56:43] - Expanded 57514 -> 447304 (decompressed 777.7 percent)
[15:56:43] Called DecompressByteArray: compressed_data_size=57514 data_size=447304, decompressed_data_size=447304 diff=0
[15:56:43] - Digital signature verified
[15:56:43] 
[15:56:43] Project: 4743 (Run 9, Clone 139, Gen 10)
[15:56:43] 
[15:56:43] Assembly optimizations on if available.
[15:56:43] Entering M.D.
[15:56:49] Working on p4743_lam5w_300K
[15:56:50] Client config found, loading data.
[15:56:50] Starting GUI Server
[15:56:56] mdrun_gpu returned 
[15:56:56] NANs detected on GPU
[15:56:56] 
[15:56:56] Folding@home Core Shutdown: UNSTABLE_MACHINE
[15:56:59] CoreStatus = 7A (122)
[15:56:59] Sending work to server
[15:56:59] Project: 4743 (Run 9, Clone 139, Gen 10)
[15:56:59] - Read packet limit of 540015616... Set to 524286976.
[15:56:59] - Error: Could not get length of results file work/wuresults_01.dat
[15:56:59] - Error: Could not read unit 01 file. Removing from queue.
[15:56:59] EUE limit exceeded. Pausing 24 hours.
[00:44:32] Service stop request received.

This WU was reported bad on April 5.
http://foldingforum.org/viewtopic.php?f=19&t=9348

It's frustrating to keep getting the same bad WUs weeks after they've been reported.
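
For anyone who wants to check whether the same PRCG keeps failing before reporting it, here is a rough Python sketch that tallies UNSTABLE_MACHINE shutdowns per Project/Run/Clone/Gen. The two patterns are copied from the log lines quoted in this thread; the function name `count_failures` and the sample lines are only illustrative, and other client or core versions may word these lines differently.

```python
import re
from collections import Counter

# Patterns based on the log lines quoted in this thread (assumed stable
# across the client/core versions shown here, not verified for others).
PRCG_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SHUTDOWN_RE = re.compile(r"Folding@home Core Shutdown: (\w+)")


def count_failures(log_lines):
    """Count UNSTABLE_MACHINE shutdowns per (project, run, clone, gen)."""
    failures = Counter()
    current = None  # most recent PRCG seen so far in the log
    for line in log_lines:
        match = PRCG_RE.search(line)
        if match:
            current = tuple(int(g) for g in match.groups())
            continue
        match = SHUTDOWN_RE.search(line)
        if match and match.group(1) == "UNSTABLE_MACHINE" and current is not None:
            failures[current] += 1
    return failures


# Hypothetical excerpt in the same shape as the logs above.
sample = [
    "[19:09:45] Project: 4743 (Run 9, Clone 139, Gen 10)",
    "[19:10:03] NANs detected on GPU",
    "[19:10:03] Folding@home Core Shutdown: UNSTABLE_MACHINE",
    "[19:10:05] Project: 4743 (Run 9, Clone 139, Gen 10)",
    "[19:10:45] Folding@home Core Shutdown: UNSTABLE_MACHINE",
]

for prcg, count in count_failures(sample).items():
    print(prcg, count)
```

To run it on a real log, pass the lines of the client log file instead of `sample`. A count above 1 for a single PRCG suggests the WU is being reassigned after failing, as happened in the logs above.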