p5519 (4, 0, 28) Consistently NaNs at 12%

Moderators: Site Moderators, FAHC Science Team

P5-133XL
Posts: 2948
Joined: Sun Dec 02, 2007 4:36 am
Hardware configuration: Machine #1:

Intel Q9450; 2x2GB=8GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460; Windows Server 2008 X64 (SP1).

Machine #2:

Intel Q6600; 2x2GB=4GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460 video card; Windows 7 X64.

Machine 3:

Dell Dimension 8400, 3.2GHz P4, 4x512MB Ram, Video card GTX 460, Windows 7 X32

I am currently folding only on the 5x GTX 460s for approx. 70K PPD
Location: Salem, OR, USA

p5519 (4, 0, 28) Consistently NaNs at 12%

Post by P5-133XL »

Q9450@3.36GHz, 8GB RAM; 2x Nvidia 9600GSO(512)
Windows Server 2008


[14:36:28] 
[14:36:28] *------------------------------*
[14:36:28] Folding@Home GPU Core - Beta
[14:36:28] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[14:36:28] 
[14:36:28] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[14:36:28] Build host: amoeba
[14:36:28] Board Type: Nvidia
[14:36:28] Core      : 
[14:36:28] Preparing to commence simulation
[14:36:28] - Looking at optimizations...
[14:36:31] - Created dyn
[14:36:31] - Files status OK
[14:36:31] - Expanded 88347 -> 447224 (decompressed 506.2 percent)
[14:36:31] Called DecompressByteArray: compressed_data_size=88347 data_size=447224, decompressed_data_size=447224 diff=0
[14:36:31] - Digital signature verified
[14:36:31] 
[14:36:31] Project: 5519 (Run 4, Clone 0, Gen 28)
[14:36:31] 
[14:36:31] Assembly optimizations on if available.
[14:36:31] Entering M.D.
[14:36:42] Working on p5519_lam5w_300K
[14:36:45] Client config found, loading data.
[14:36:45] Starting GUI Server
[14:39:59] Completed 1%
[14:43:12] Completed 2%
[14:46:26] Completed 3%
[14:49:40] Completed 4%
[14:52:54] Completed 5%
[14:56:07] Completed 6%
[14:59:21] Completed 7%
[15:02:34] Completed 8%
[15:05:48] Completed 9%
[15:09:02] Completed 10%
[15:12:16] Completed 11%
[15:15:29] Completed 12%
[15:16:49] mdrun_gpu returned 
[15:16:49] NANs detected on GPU
[15:16:49] 
[15:16:49] Folding@home Core Shutdown: UNSTABLE_MACHINE
[15:16:52] CoreStatus = 7A (122)
[15:16:52] Sending work to server
[15:16:52] Project: 5519 (Run 4, Clone 0, Gen 28)
[15:16:52] - Read packet limit of 540015616... Set to 524286976.
[15:16:52] - Error: Could not get length of results file work/wuresults_08.dat
[15:16:52] - Error: Could not read unit 08 file. Removing from queue.
[15:16:52] Trying to send all finished work units
[15:16:52] + No unsent completed units remaining.
[15:16:52] - Preparing to get new work unit...
[15:16:52] + Attempting to get work packet
[15:16:52] - Will indicate memory of 8189 MB
[15:16:52] - Connecting to assignment server
[15:16:52] Connecting to http://assign-GPU.stanford.edu:8080/
[15:16:52] Posted data.
[15:16:52] Initial: 40AB; - Successful: assigned to (171.64.65.106).
[15:16:52] + News From Folding@Home: GPU folding beta
[15:16:52] Loaded queue successfully.
[15:16:52] Connecting to http://171.64.65.106:8080/
[15:16:52] Posted data.
[15:16:52] Initial: 0000; - Receiving payload (expected size: 88859)
[15:16:53] - Downloaded at ~86 kB/s
[15:16:53] - Averaged speed for that direction ~106 kB/s
[15:16:53] + Received work.
[15:16:53] Trying to send all finished work units
[15:16:53] + No unsent completed units remaining.
[15:16:53] + Closed connections
[15:16:58] 
[15:16:58] + Processing work unit
[15:16:58] Core required: FahCore_11.exe
[15:16:58] Core found.
[15:16:58] Working on queue slot 09 [January 6 15:16:58 UTC]
[15:16:58] + Working ...
[15:16:58] - Calling '.\FahCore_11.exe -dir work/ -suffix 09 -priority 96 -nocpulock -checkpoint 15 -verbose -lifeline 16476 -version 620'

[15:16:58] 
[15:16:58] *------------------------------*
[15:16:58] Folding@Home GPU Core - Beta
[15:16:58] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[15:16:58] 
[15:16:58] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[15:16:58] Build host: amoeba
[15:16:58] Board Type: Nvidia
[15:16:58] Core      : 
[15:16:58] Preparing to commence simulation
[15:16:58] - Looking at optimizations...
[15:17:01] - Created dyn
[15:17:01] - Files status OK
[15:17:01] - Expanded 88347 -> 447224 (decompressed 506.2 percent)
[15:17:01] Called DecompressByteArray: compressed_data_size=88347 data_size=447224, decompressed_data_size=447224 diff=0
[15:17:01] - Digital signature verified
[15:17:01] 
[15:17:01] Project: 5519 (Run 4, Clone 0, Gen 28)
[15:17:01] 
[15:17:02] Assembly optimizations on if available.
[15:17:02] Entering M.D.
[15:17:13] Working on p5519_lam5w_300K
[15:17:16] Client config found, loading data.
[15:17:16] Starting GUI Server
[15:20:30] Completed 1%
[15:23:44] Completed 2%
[15:26:58] Completed 3%
[15:30:12] Completed 4%
[15:33:25] Completed 5%
[15:36:39] Completed 6%
[15:39:53] Completed 7%
[15:43:07] Completed 8%
[15:46:21] Completed 9%
[15:49:35] Completed 10%
[15:52:49] Completed 11%
[15:56:03] Completed 12%
[15:57:22] mdrun_gpu returned 
[15:57:22] NANs detected on GPU
[15:57:22] 
[15:57:22] Folding@home Core Shutdown: UNSTABLE_MACHINE
[15:57:25] CoreStatus = 7A (122)
[15:57:25] Sending work to server
[15:57:25] Project: 5519 (Run 4, Clone 0, Gen 28)
[15:57:25] - Read packet limit of 540015616... Set to 524286976.
[15:57:25] - Error: Could not get length of results file work/wuresults_09.dat
[15:57:25] - Error: Could not read unit 09 file. Removing from queue.
[15:57:25] Trying to send all finished work units
[15:57:25] + No unsent completed units remaining.
[15:57:25] - Preparing to get new work unit...
[15:57:25] + Attempting to get work packet
[15:57:25] - Will indicate memory of 8189 MB
[15:57:25] - Connecting to assignment server
[15:57:25] Connecting to http://assign-GPU.stanford.edu:8080/
[15:57:25] Posted data.
[15:57:25] Initial: 40AB; - Successful: assigned to (171.64.65.106).
[15:57:25] + News From Folding@Home: GPU folding beta
[15:57:26] Loaded queue successfully.
[15:57:26] Connecting to http://171.64.65.106:8080/
[15:57:26] Posted data.
[15:57:26] Initial: 0000; - Receiving payload (expected size: 88859)
[15:57:26] Conversation time very short, giving reduced weight in bandwidth avg
[15:57:26] - Downloaded at ~173 kB/s
[15:57:26] - Averaged speed for that direction ~113 kB/s
[15:57:26] + Received work.
[15:57:26] Trying to send all finished work units
[15:57:26] + No unsent completed units remaining.
[15:57:26] + Closed connections
[15:57:31] 
[15:57:31] + Processing work unit
[15:57:31] Core required: FahCore_11.exe
[15:57:31] Core found.
[15:57:32] Working on queue slot 00 [January 6 15:57:32 UTC]
[15:57:32] + Working ...
[15:57:32] - Calling '.\FahCore_11.exe -dir work/ -suffix 00 -priority 96 -nocpulock -checkpoint 15 -verbose -lifeline 16476 -version 620'

[15:57:32] 
[15:57:32] *------------------------------*
[15:57:32] Folding@Home GPU Core - Beta
[15:57:32] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[15:57:32] 
[15:57:32] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[15:57:32] Build host: amoeba
[15:57:32] Board Type: Nvidia
[15:57:32] Core      : 
[15:57:32] Preparing to commence simulation
[15:57:32] - Looking at optimizations...
[15:57:35] - Created dyn
[15:57:35] - Files status OK
[15:57:35] - Expanded 88347 -> 447224 (decompressed 506.2 percent)
[15:57:35] Called DecompressByteArray: compressed_data_size=88347 data_size=447224, decompressed_data_size=447224 diff=0
[15:57:35] - Digital signature verified
[15:57:35] 
[15:57:35] Project: 5519 (Run 4, Clone 0, Gen 28)
[15:57:35] 
[15:57:35] Assembly optimizations on if available.
[15:57:35] Entering M.D.
[15:57:45] Working on p5519_lam5w_300K
[15:57:48] Client config found, loading data.
[15:57:48] Starting GUI Server
[16:01:02] Completed 1%
[16:04:16] Completed 2%
[16:07:30] Completed 3%
[16:10:43] Completed 4%
[16:13:58] Completed 5%
[16:17:11] Completed 6%
[16:20:25] Completed 7%
[16:23:39] Completed 8%
[16:26:52] Completed 9%
[16:30:06] Completed 10%
[16:33:20] Completed 11%
[16:36:34] Completed 12%
[16:37:52] mdrun_gpu returned 
[16:37:52] NANs detected on GPU
[16:37:52] 
[16:37:52] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:37:55] CoreStatus = 7A (122)
[16:37:55] Sending work to server
[16:37:55] Project: 5519 (Run 4, Clone 0, Gen 28)
[16:37:55] - Read packet limit of 540015616... Set to 524286976.
[16:37:55] - Error: Could not get length of results file work/wuresults_00.dat
[16:37:55] - Error: Could not read unit 00 file. Removing from queue.
[16:37:55] Trying to send all finished work units
[16:37:55] + No unsent completed units remaining.
[16:37:55] - Preparing to get new work unit...
[16:37:55] + Attempting to get work packet
[16:37:55] - Will indicate memory of 8189 MB
[16:37:55] - Connecting to assignment server
[16:37:55] Connecting to http://assign-GPU.stanford.edu:8080/
[16:37:55] Posted data.
[16:37:55] Initial: 40AB; - Successful: assigned to (171.64.65.106).
[16:37:55] + News From Folding@Home: GPU folding beta
[16:37:56] Loaded queue successfully.
[16:37:56] Connecting to http://171.64.65.106:8080/
[16:37:56] Posted data.
[16:37:56] Initial: 0000; - Receiving payload (expected size: 88859)
[16:37:57] - Downloaded at ~86 kB/s
[16:37:57] - Averaged speed for that direction ~108 kB/s
[16:37:57] + Received work.
[16:37:57] Trying to send all finished work units
[16:37:57] + No unsent completed units remaining.
[16:37:57] + Closed connections
[16:38:02] 
[16:38:02] + Processing work unit
[16:38:02] Core required: FahCore_11.exe
[16:38:02] Core found.
[16:38:02] Working on queue slot 01 [January 6 16:38:02 UTC]
[16:38:02] + Working ...
[16:38:02] - Calling '.\FahCore_11.exe -dir work/ -suffix 01 -priority 96 -nocpulock -checkpoint 15 -verbose -lifeline 16476 -version 620'

[16:38:02] 
[16:38:02] *------------------------------*
[16:38:02] Folding@Home GPU Core - Beta
[16:38:02] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[16:38:02] 
[16:38:02] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[16:38:02] Build host: amoeba
[16:38:02] Board Type: Nvidia
[16:38:02] Core      : 
[16:38:02] Preparing to commence simulation
[16:38:02] - Looking at optimizations...
[16:38:05] - Created dyn
[16:38:05] - Files status OK
[16:38:05] - Expanded 88347 -> 447224 (decompressed 506.2 percent)
[16:38:05] Called DecompressByteArray: compressed_data_size=88347 data_size=447224, decompressed_data_size=447224 diff=0
[16:38:05] - Digital signature verified
[16:38:05] 
[16:38:05] Project: 5519 (Run 4, Clone 0, Gen 28)
[16:38:05] 
[16:38:05] Assembly optimizations on if available.
[16:38:05] Entering M.D.
[16:38:16] Working on p5519_lam5w_300K
[16:38:19] Client config found, loading data.
[16:38:19] Starting GUI Server
[16:41:33] Completed 1%
[16:44:46] Completed 2%
[16:48:00] Completed 3%
[16:51:13] Completed 4%
[16:54:27] Completed 5%
[16:57:41] Completed 6%
[16:59:33] - Autosending finished units... [January 6 16:59:33 UTC]
[16:59:33] Trying to send all finished work units
[16:59:33] + No unsent completed units remaining.
[16:59:33] - Autosend completed
[16:59:33] + Working...
[17:00:55] Completed 7%
[17:04:08] Completed 8%
[17:07:22] Completed 9%
[17:10:35] Completed 10%
[17:13:49] Completed 11%
[17:17:03] Completed 12%
[17:18:22] mdrun_gpu returned 
[17:18:22] NANs detected on GPU
[17:18:22] 
[17:18:22] Folding@home Core Shutdown: UNSTABLE_MACHINE
[17:18:25] CoreStatus = 7A (122)
[17:18:25] Sending work to server
[17:18:25] Project: 5519 (Run 4, Clone 0, Gen 28)
[17:18:25] - Read packet limit of 540015616... Set to 524286976.
[17:18:25] - Error: Could not get length of results file work/wuresults_01.dat
[17:18:25] - Error: Could not read unit 01 file. Removing from queue.
[17:18:25] EUE limit exceeded. Pausing 24 hours.
sortofageek
Site Admin
Posts: 3110
Joined: Fri Nov 30, 2007 8:06 pm
Location: Team Helix

Re: p5519 (4, 0, 28) Consistently NaNs at 12%

Post by sortofageek »

Just FYI, that WU has been successfully completed for full credit by another donor.
P5-133XL

Re: p5519 (4, 0, 28) Consistently NaNs at 12%

Post by P5-133XL »

Interesting. I normally assume that when a WU consistently EUEs/NaNs at the same point, the fault lies with the WU rather than the machine.

Regardless, I'm going to reboot the machine to reset the hardware, just in case it is the video card.
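
For anyone who wants to check the "same place every time" pattern without reading the log by eye, here is a minimal Python sketch. It is not part of the Folding@home client; the function names `nan_failures` and `same_failure_point` are made up for illustration, and it assumes the log lines look exactly like the ones pasted in this thread ("Project: ...", "Completed N%", "NANs detected on GPU").

```python
import re
from datetime import datetime, timedelta

# Line formats are assumed from the log pasted in this thread.
STAMP = r"\[(\d{2}:\d{2}:\d{2})\] "
PROJECT_RE = re.compile(STAMP + r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
PERCENT_RE = re.compile(STAMP + r"Completed (\d+)%")
NAN_RE = re.compile(STAMP + r"NANs detected on GPU")


def _seconds_between(start, end):
    """Seconds from start to end (HH:MM:SS), allowing one midnight rollover."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)
    return int(delta.total_seconds())


def nan_failures(log_text):
    """Return one record per 'NANs detected on GPU' line in the log text."""
    failures = []
    unit = None           # (project, run, clone, gen) most recently announced
    last_pct = None       # last "Completed N%" seen in the current attempt
    last_pct_time = None
    for raw in log_text.splitlines():
        line = raw.strip()
        m = PROJECT_RE.match(line)
        if m:
            # A "Project:" line starts a new attempt (or follows a failure),
            # so forget any progress seen before it.
            unit = tuple(int(g) for g in m.groups()[1:])
            last_pct, last_pct_time = None, None
            continue
        m = PERCENT_RE.match(line)
        if m:
            last_pct_time, last_pct = m.group(1), int(m.group(2))
            continue
        m = NAN_RE.match(line)
        if m:
            after = None
            if last_pct_time is not None:
                after = _seconds_between(last_pct_time, m.group(1))
            failures.append(
                {"unit": unit, "last_percent": last_pct, "seconds_after": after}
            )
    return failures


def same_failure_point(failures):
    """True if there are at least two failures, all on the same unit and percent."""
    if len(failures) < 2:
        return False
    points = {(f["unit"], f["last_percent"]) for f in failures}
    return len(points) == 1
```

Run against the log in the first post, this should report four failures on project 5519 (Run 4, Clone 0, Gen 28), each with 12% as the last completed step and the NaN arriving 78 to 80 seconds later. That shows the failure is repeatable on this machine; as the reply above points out, it does not by itself show that the WU is at fault.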