Project: 4746 (Run 0, Clone 55, Gen 49) multiple NaNs at 43%

Moderators: Site Moderators, FAHC Science Team

P5-133XL
Posts: 2948
Joined: Sun Dec 02, 2007 4:36 am
Hardware configuration: Machine #1:

Intel Q9450; 2x2GB=8GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460; Windows Server 2008 X64 (SP1).

Machine #2:

Intel Q6600; 2x2GB=4GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460 video card; Windows 7 X64.

Machine 3:

Dell Dimension 8400, 3.2GHz P4, 4x512MB Ram, Video card GTX 460, Windows 7 X32

I am currently folding just on the 5x GTX 460s for approx. 70K PPD
Location: Salem, OR, USA

Project: 4746 (Run 0, Clone 55, Gen 49) multiple NaNs at 43%

Post by P5-133XL »

4746 (0,55,49) is repeatedly NaN'ing at 43% and not sending to the server ...

Code:

[11:48:20] Thank you for your contribution to Folding@Home.
[11:48:20] + Number of Units Completed: 166

[11:48:24] Trying to send all finished work units
[11:48:24] + No unsent completed units remaining.
[11:48:24] - Preparing to get new work unit...
[11:48:24] + Attempting to get work packet
[11:48:24] - Will indicate memory of 2046 MB
[11:48:24] - Connecting to assignment server
[11:48:24] Connecting to http://assign-GPU.stanford.edu:8080/
[11:48:25] Posted data.
[11:48:25] Initial: 40AB; - Successful: assigned to (171.64.65.103).
[11:48:25] + News From Folding@Home: GPU folding beta
[11:48:25] Loaded queue successfully.
[11:48:25] Connecting to http://171.64.65.103:8080/
[11:48:26] Posted data.
[11:48:26] Initial: 0000; - Receiving payload (expected size: 88784)
[11:48:26] Conversation time very short, giving reduced weight in bandwidth avg
[11:48:26] - Downloaded at ~173 kB/s
[11:48:26] - Averaged speed for that direction ~131 kB/s
[11:48:26] + Received work.
[11:48:26] Trying to send all finished work units
[11:48:26] + No unsent completed units remaining.
[11:48:26] + Closed connections
[11:48:26] 
[11:48:26] + Processing work unit
[11:48:26] Core required: FahCore_11.exe
[11:48:26] Core found.
[11:48:26] Working on queue slot 02 [November 23 11:48:26 UTC]
[11:48:26] + Working ...
[11:48:26] - Calling '.\FahCore_11.exe -dir work/ -suffix 02 -priority 96 -nocpulock -checkpoint 15 -verbose -lifeline 3776 -version 620'

[11:48:26] 
[11:48:26] *------------------------------*
[11:48:26] Folding@Home GPU Core - Beta
[11:48:26] Version 1.18 (Mon Oct 13 11:11:30 PDT 2008)
[11:48:26] 
[11:48:26] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[11:48:26] Build host: amoeba
[11:48:26] Board Type: AMD
[11:48:26] Core      : 
[11:48:26] Preparing to commence simulation
[11:48:26] - Looking at optimizations...
[11:48:26] - Created dyn
[11:48:26] - Files status OK
[11:48:26] - Expanded 88272 -> 447304 (decompressed 506.7 percent)
[11:48:26] Called DecompressByteArray: compressed_data_size=88272 data_size=447304, decompressed_data_size=447304 diff=0
[11:48:26] - Digital signature verified
[11:48:26] 
[11:48:26] Project: 4746 (Run 0, Clone 55, Gen 49)
[11:48:26] 
[11:48:26] Assembly optimizations on if available.
[11:48:26] Entering M.D.
[11:48:33] Working on p4746_lam5w_300K
[11:48:33] Client config found, loading data.
[11:48:33] Starting GUI Server
[11:52:25] Completed 1%
[11:56:14] Completed 2%
[12:00:04] Completed 3%

...

[14:25:12] Completed 41%
[14:29:01] Completed 42%
[14:32:49] Completed 43%
[14:32:50] mdrun_gpu returned 
[14:32:50] NANs detected on GPU
[14:32:50] 
[14:32:50] Folding@home Core Shutdown: UNSTABLE_MACHINE
[14:32:52] CoreStatus = 7A (122)
[14:32:52] Sending work to server
[14:32:52] Project: 4746 (Run 0, Clone 55, Gen 49)
[14:32:52] - Read packet limit of 540015616... Set to 524286976.
[14:32:52] - Error: Could not get length of results file work/wuresults_02.dat
[14:32:52] - Error: Could not read unit 02 file. Removing from queue.
[14:32:52] Trying to send all finished work units
[14:32:52] + No unsent completed units remaining.
[14:32:52] - Preparing to get new work unit...
[14:32:52] + Attempting to get work packet
[14:32:52] - Will indicate memory of 2046 MB
[14:32:52] - Connecting to assignment server
[14:32:52] Connecting to http://assign-GPU.stanford.edu:8080/
[14:32:54] Posted data.
[14:32:54] Initial: 40AB; - Successful: assigned to (171.64.65.103).
[14:32:54] + News From Folding@Home: GPU folding beta
[14:32:54] Loaded queue successfully.
[14:32:54] Connecting to http://171.64.65.103:8080/
[14:32:54] Posted data.
[14:32:54] Initial: 0000; - Receiving payload (expected size: 88784)
[14:32:54] Conversation time very short, giving reduced weight in bandwidth avg
[14:32:54] - Downloaded at ~173 kB/s
[14:32:54] - Averaged speed for that direction ~135 kB/s
[14:32:54] + Received work.
[14:32:54] Trying to send all finished work units
[14:32:54] + No unsent completed units remaining.
[14:32:54] + Closed connections
[14:32:59] 
[14:32:59] + Processing work unit
[14:32:59] Core required: FahCore_11.exe
[14:32:59] Core found.
[14:32:59] Working on queue slot 03 [November 23 14:32:59 UTC]
[14:32:59] + Working ...
[14:32:59] - Calling '.\FahCore_11.exe -dir work/ -suffix 03 -priority 96 -nocpulock -checkpoint 15 -verbose -lifeline 3776 -version 620'

[14:33:00] 
[14:33:00] *------------------------------*
[14:33:00] Folding@Home GPU Core - Beta
[14:33:00] Version 1.18 (Mon Oct 13 11:11:30 PDT 2008)
[14:33:00] 
[14:33:00] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[14:33:00] Build host: amoeba
[14:33:00] Board Type: AMD
[14:33:00] Core      : 
[14:33:00] Preparing to commence simulation
[14:33:00] - Looking at optimizations...
[14:33:00] - Created dyn
[14:33:00] - Files status OK
[14:33:00] - Expanded 88272 -> 447304 (decompressed 506.7 percent)
[14:33:00] Called DecompressByteArray: compressed_data_size=88272 data_size=447304, decompressed_data_size=447304 diff=0
[14:33:00] - Digital signature verified
[14:33:00] 
[14:33:00] Project: 4746 (Run 0, Clone 55, Gen 49)
[14:33:00] 
[14:33:00] Assembly optimizations on if available.
[14:33:00] Entering M.D.
[14:33:06] Working on p4746_lam5w_300K
[14:33:06] Client config found, loading data.
[14:33:06] Starting GUI Server
[14:36:59] Completed 1%
[14:40:48] Completed 2%
[14:44:37] Completed 3%

...

[17:02:50] Completed 39%
[17:06:50] Completed 40%
[17:10:53] Completed 41%
[17:13:02] - Autosending finished units... [November 23 17:13:02 UTC]
[17:13:02] Trying to send all finished work units
[17:13:02] + No unsent completed units remaining.
[17:13:02] - Autosend completed
[17:13:02] + Working...
[17:14:56] Completed 42%
[17:18:58] Completed 43%
[17:18:58] mdrun_gpu returned 
[17:18:58] NANs detected on GPU
[17:18:58] 
[17:18:58] Folding@home Core Shutdown: UNSTABLE_MACHINE
[17:19:02] CoreStatus = 7A (122)
[17:19:02] Sending work to server
[17:19:02] Project: 4746 (Run 0, Clone 55, Gen 49)
[17:19:02] - Read packet limit of 540015616... Set to 524286976.
[17:19:02] - Error: Could not get length of results file work/wuresults_03.dat
[17:19:02] - Error: Could not read unit 03 file. Removing from queue.
[17:19:02] Trying to send all finished work units
[17:19:02] + No unsent completed units remaining.
[17:19:02] - Preparing to get new work unit...
[17:19:02] + Attempting to get work packet
[17:19:02] - Will indicate memory of 2046 MB
[17:19:02] - Connecting to assignment server
[17:19:02] Connecting to http://assign-GPU.stanford.edu:8080/
[17:19:04] Posted data.
[17:19:04] Initial: 40AB; - Successful: assigned to (171.64.65.102).
[17:19:04] + News From Folding@Home: GPU folding beta
[17:19:04] Loaded queue successfully.
[17:19:04] Connecting to http://171.64.65.102:8080/
[17:19:04] Posted data.
[17:19:04] Initial: 0000; - Receiving payload (expected size: 93199)
[17:19:05] - Downloaded at ~91 kB/s
[17:19:05] - Averaged speed for that direction ~126 kB/s
[17:19:05] + Received work.
[17:19:05] Trying to send all finished work units
[17:19:05] + No unsent completed units remaining.
[17:19:05] + Closed connections
[17:19:10] 
[17:19:10] + Processing work unit
[17:19:10] Core required: FahCore_11.exe
[17:19:10] Core found.
[17:19:10] Working on queue slot 04 [November 23 17:19:10 UTC]
[17:19:10] + Working ...
[17:19:10] - Calling '.\FahCore_11.exe -dir work/ -suffix 04 -priority 96 -nocpulock -checkpoint 15 -verbose -lifeline 3776 -version 620'

[17:19:10] 
[17:19:10] *------------------------------*
[17:19:10] Folding@Home GPU Core - Beta
[17:19:10] Version 1.18 (Mon Oct 13 11:11:30 PDT 2008)
[17:19:10] 
[17:19:10] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[17:19:10] Build host: amoeba
[17:19:10] Board Type: AMD
[17:19:10] Core      : 
[17:19:10] Preparing to commence simulation
[17:19:10] - Looking at optimizations...
[17:19:10] - Created dyn
[17:19:10] - Files status OK
[17:19:10] - Expanded 92687 -> 492188 (decompressed 531.0 percent)
[17:19:10] Called DecompressByteArray: compressed_data_size=92687 data_size=492188, decompressed_data_size=492188 diff=0
[17:19:10] - Digital signature verified
[17:19:10] 
[17:19:10] Project: 5733 (Run 1, Clone 39, Gen 0)
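For anyone who wants to check their own FAHlog for the same pattern (the same WU dying at the same percentage on every attempt), here is a minimal sketch in Python. It relies only on the line formats visible in the log above ("Project: ...", "Completed N%", "Folding@home Core Shutdown: ..."). The function name `find_shutdowns` and the regular expressions are my own assumptions, not part of the client, and other client or core versions may word these lines differently.

```python
import re

# Line formats assumed from the log excerpt above; other versions may differ.
PROJECT = re.compile(
    r"^\[\d\d:\d\d:\d\d\] Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)"
)
PROGRESS = re.compile(r"^\[\d\d:\d\d:\d\d\] Completed (\d+)%")
SHUTDOWN = re.compile(r"^\[(\d\d:\d\d:\d\d)\] Folding@home Core Shutdown: (\w+)")


def find_shutdowns(log_text):
    """Return one (time, (project, run, clone, gen), last_percent, reason)
    tuple for every "Core Shutdown" line found in log_text.

    last_percent is the most recent "Completed N%" seen before the
    shutdown, or None if no progress line was seen for that attempt.
    """
    shutdowns = []
    project = None
    last_percent = None
    for line in log_text.splitlines():
        m = PROJECT.match(line)
        if m:
            project = tuple(int(x) for x in m.groups())
            continue
        m = PROGRESS.match(line)
        if m:
            last_percent = int(m.group(1))
            continue
        m = SHUTDOWN.match(line)
        if m:
            shutdowns.append((m.group(1), project, last_percent, m.group(2)))
            last_percent = None  # the next attempt starts from scratch
    return shutdowns


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python find_shutdowns.py FAHlog.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        for when, prcg, pct, reason in find_shutdowns(f.read()):
            print(when, prcg, f"{pct}%" if pct is not None else "n/a", reason)
```

Run against the log above, this should list two UNSTABLE_MACHINE shutdowns for 4746 (0,55,49), both after 43%. That repetition is what points at the WU rather than a one-off glitch.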
shdbcamping
Posts: 81
Joined: Mon Nov 10, 2008 7:57 am
Hardware configuration: XPS 720 Q6600 9800GX2 3gig RAM
750W primary PSU 650W Aux VGA PSU

Re: 4746 (0,55,49) is repeatedly NaN'ing at 43%

Post by shdbcamping »

P5-133XL wrote:4746 (0,55,49) is repeatedly NaN'ing at 43% and not sending to the server ...

I have an HD 3870 that I recently put back into my XPS420. It will not get to the GUI at all with any 47XX-series WU. It used to get as far as the "EUE, waiting 24 hrs" thing. Some kind of ATI driver problem crept up, so I installed the Catalyst 8.11 driver. I still fail everything but the 5XXX-series WUs before the GUI starts. The good part is that it retries and has been picking up a 5XXX WU before EUEing out.
toTOW
Site Moderator
Posts: 6429
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Project: 4746 (Run 0, Clone 55, Gen 49) multiple NaNs at 43%

Post by toTOW »

There is no data for this WU in the DB :(

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
P5-133XL
Posts: 2948
Joined: Sun Dec 02, 2007 4:36 am

Re: 4746 (0,55,49) is repeatedly NaN'ing at 43%

Post by P5-133XL »

shdbcamping wrote:
P5-133XL wrote:4746 (0,55,49) is repeatedly NaN'ing at 43% and not sending to the server ...

I have an HD 3870 that I recently put back into my XPS420. It will not get to the GUI at all with any 47XX-series WU. It used to get as far as the "EUE, waiting 24 hrs" thing. Some kind of ATI driver problem crept up, so I installed the Catalyst 8.11 driver. I still fail everything but the 5XXX-series WUs before the GUI starts. The good part is that it retries and has been picking up a 5XXX WU before EUEing out.

I'm running dual Sapphire 3870s and am generally not having any problems with EUEs; this WU is an exception to the norm. If you have a consistent problem with EUEs, may I suggest that you try under-clocking the card? That works for some.