Project: 5764 (Run 14, Clone 29, Gen 88) - UM at various %

Moderators: Site Moderators, FAHC Science Team

toTOW
Site Moderator
Posts: 6433
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Project: 5764 (Run 14, Clone 29, Gen 88) - UM at various %

Post by toTOW »

Here's another one:

Code:

[17:21:49] Project: 5764 (Run 14, Clone 29, Gen 88)
[17:21:49] 
[17:21:49] Assembly optimizations on if available.
[17:21:49] Entering M.D.
[17:21:56] Working on Protein
[17:21:58] Client config found, loading data.
[17:21:58] Starting GUI Server
[17:22:58] Completed 1%
[...]
[18:34:35] Completed 73%
[18:34:35] mdrun_gpu returned 
[18:34:35] NANs detected on GPU
[18:34:35] 
[18:34:35] Folding@home Core Shutdown: UNSTABLE_MACHINE
[18:34:38] CoreStatus = 7A (122)
[18:34:38] Sending work to server
[18:34:38] Project: 5764 (Run 14, Clone 29, Gen 88)
[18:34:38] - Read packet limit of 540015616... Set to 524286976.
[18:34:38] - Error: Could not get length of results file work/wuresults_03.dat
[18:34:38] - Error: Could not read unit 03 file. Removing from queue.
[18:34:38] Trying to send all finished work units
[18:34:38] + No unsent completed units remaining.
[18:34:38] - Preparing to get new work unit...
[18:34:38] + Attempting to get work packet
[18:34:38] - Will indicate memory of 1022 MB
[18:34:38] - Connecting to assignment server
[18:34:38] Connecting to http://assign-GPU.stanford.edu:8080/
[18:34:39] Posted data.
[18:34:39] Initial: 40AB; - Successful: assigned to (171.64.65.106).
[18:34:39] + News From Folding@Home: GPU folding beta
[18:34:39] Loaded queue successfully.
[18:34:39] Connecting to http://171.64.65.106:8080/
[18:34:40] Posted data.
[18:34:40] Initial: 0000; - Receiving payload (expected size: 70632)
[18:34:41] - Downloaded at ~68 kB/s
[18:34:41] - Averaged speed for that direction ~51 kB/s
[18:34:41] + Received work.
[18:34:41] Trying to send all finished work units
[18:34:41] + No unsent completed units remaining.
[18:34:41] + Closed connections
[18:34:46] 
[18:34:46] + Processing work unit
[18:34:46] Core required: FahCore_11.exe
[18:34:46] Core found.
[18:34:46] Working on queue slot 04 [January 9 18:34:46 UTC]
[18:34:46] + Working ...
[18:34:46] - Calling '.\FahCore_11.exe -dir work/ -suffix 04 -priority 96 -checkpoint 15 -verbose -lifeline 1808 -version 623'

[18:34:46] 
[18:34:46] *------------------------------*
[18:34:46] Folding@Home GPU Core - Beta
[18:34:46] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[18:34:46] 
[18:34:46] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[18:34:46] Build host: amoeba
[18:34:46] Board Type: Nvidia
[18:34:46] Core      : 
[18:34:46] Preparing to commence simulation
[18:34:46] - Looking at optimizations...
[18:34:46] - Created dyn
[18:34:46] - Files status OK
[18:34:46] - Expanded 70120 -> 360060 (decompressed 513.4 percent)
[18:34:46] Called DecompressByteArray: compressed_data_size=70120 data_size=360060, decompressed_data_size=360060 diff=0
[18:34:46] - Digital signature verified
[18:34:46] 
[18:34:46] Project: 5764 (Run 14, Clone 29, Gen 88)
[18:34:46] 
[18:34:46] Assembly optimizations on if available.
[18:34:46] Entering M.D.
[18:34:53] Working on Protein
[18:34:55] Client config found, loading data.
[18:34:55] Starting GUI Server
[18:35:55] Completed 1%
[...]
[19:38:27] Completed 64%
[19:38:27] mdrun_gpu returned 
[19:38:27] NANs detected on GPU
[19:38:27] 
[19:38:27] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:38:31] CoreStatus = 7A (122)
[19:38:31] Sending work to server
[19:38:31] Project: 5764 (Run 14, Clone 29, Gen 88)
[19:38:31] - Read packet limit of 540015616... Set to 524286976.
[19:38:31] - Error: Could not get length of results file work/wuresults_04.dat
[19:38:31] - Error: Could not read unit 04 file. Removing from queue.
[19:38:31] Trying to send all finished work units
[19:38:31] + No unsent completed units remaining.
[19:38:31] - Preparing to get new work unit...
[19:38:31] + Attempting to get work packet
[19:38:31] - Will indicate memory of 1022 MB
[19:38:31] - Connecting to assignment server
[19:38:31] Connecting to http://assign-GPU.stanford.edu:8080/
[19:38:33] Posted data.
[19:38:33] Initial: 40AB; - Successful: assigned to (171.64.65.106).
[19:38:33] + News From Folding@Home: GPU folding beta
[19:38:33] Loaded queue successfully.
[19:38:33] Connecting to http://171.64.65.106:8080/
[19:38:34] Posted data.
[19:38:34] Initial: 0000; - Receiving payload (expected size: 70632)
[19:38:35] - Downloaded at ~68 kB/s
[19:38:35] - Averaged speed for that direction ~54 kB/s
[19:38:35] + Received work.
[19:38:35] Trying to send all finished work units
[19:38:35] + No unsent completed units remaining.
[19:38:35] + Closed connections
[19:38:40] 
[19:38:40] + Processing work unit
[19:38:40] Core required: FahCore_11.exe
[19:38:40] Core found.
[19:38:40] Working on queue slot 05 [January 9 19:38:40 UTC]
[19:38:40] + Working ...
[19:38:40] - Calling '.\FahCore_11.exe -dir work/ -suffix 05 -priority 96 -checkpoint 15 -verbose -lifeline 1808 -version 623'

[19:38:40] 
[19:38:40] *------------------------------*
[19:38:40] Folding@Home GPU Core - Beta
[19:38:40] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[19:38:40] 
[19:38:40] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[19:38:40] Build host: amoeba
[19:38:40] Board Type: Nvidia
[19:38:40] Core      : 
[19:38:40] Preparing to commence simulation
[19:38:40] - Looking at optimizations...
[19:38:40] - Created dyn
[19:38:40] - Files status OK
[19:38:40] - Expanded 70120 -> 360060 (decompressed 513.4 percent)
[19:38:40] Called DecompressByteArray: compressed_data_size=70120 data_size=360060, decompressed_data_size=360060 diff=0
[19:38:40] - Digital signature verified
[19:38:40] 
[19:38:40] Project: 5764 (Run 14, Clone 29, Gen 88)
[19:38:40] 
[19:38:40] Assembly optimizations on if available.
[19:38:40] Entering M.D.
[19:38:47] Working on Protein
[19:38:49] Client config found, loading data.
[19:38:49] Starting GUI Server
[19:39:49] Completed 1%
[...]
[19:54:30] Completed 16%
[19:54:30] mdrun_gpu returned 
[19:54:30] NANs detected on GPU
[19:54:30] 
[19:54:30] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:54:35] CoreStatus = 7A (122)
[19:54:35] Sending work to server
[19:54:35] Project: 5764 (Run 14, Clone 29, Gen 88)
[19:54:35] - Read packet limit of 540015616... Set to 524286976.
[19:54:35] - Error: Could not get length of results file work/wuresults_05.dat
[19:54:35] - Error: Could not read unit 05 file. Removing from queue.
[19:54:35] Trying to send all finished work units
[19:54:35] + No unsent completed units remaining.
[19:54:35] - Preparing to get new work unit...
[19:54:35] + Attempting to get work packet
[19:54:35] - Will indicate memory of 1022 MB
[19:54:35] - Connecting to assignment server
[19:54:35] Connecting to http://assign-GPU.stanford.edu:8080/
[19:54:36] Posted data.
[19:54:36] Initial: 40AB; - Successful: assigned to (171.64.65.106).
[19:54:36] + News From Folding@Home: GPU folding beta
[19:54:36] Loaded queue successfully.
[19:54:36] Connecting to http://171.64.65.106:8080/
[19:54:37] Posted data.
[19:54:37] Initial: 0000; - Receiving payload (expected size: 70632)
[19:54:39] - Downloaded at ~34 kB/s
[19:54:39] - Averaged speed for that direction ~50 kB/s
[19:54:39] + Received work.
[19:54:39] Trying to send all finished work units
[19:54:39] + No unsent completed units remaining.
[19:54:39] + Closed connections
[19:54:44] 
[19:54:44] + Processing work unit
[19:54:44] Core required: FahCore_11.exe
[19:54:44] Core found.
[19:54:44] Working on queue slot 06 [January 9 19:54:44 UTC]
[19:54:44] + Working ...
[19:54:44] - Calling '.\FahCore_11.exe -dir work/ -suffix 06 -priority 96 -checkpoint 15 -verbose -lifeline 1808 -version 623'

[19:54:44] 
[19:54:44] *------------------------------*
[19:54:44] Folding@Home GPU Core - Beta
[19:54:44] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[19:54:44] 
[19:54:44] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[19:54:44] Build host: amoeba
[19:54:44] Board Type: Nvidia
[19:54:44] Core      : 
[19:54:44] Preparing to commence simulation
[19:54:44] - Looking at optimizations...
[19:54:44] - Created dyn
[19:54:44] - Files status OK
[19:54:44] - Expanded 70120 -> 360060 (decompressed 513.4 percent)
[19:54:44] Called DecompressByteArray: compressed_data_size=70120 data_size=360060, decompressed_data_size=360060 diff=0
[19:54:44] - Digital signature verified
[19:54:44] 
[19:54:44] Project: 5764 (Run 14, Clone 29, Gen 88)
[19:54:44] 
[19:54:44] Assembly optimizations on if available.
[19:54:44] Entering M.D.
[19:54:50] Working on Protein
[19:54:52] Client config found, loading data.
[19:54:52] Starting GUI Server
[19:55:52] Completed 1%
[...]
[21:11:24] Completed 77%
[21:11:24] mdrun_gpu returned 
[21:11:24] NANs detected on GPU
[21:11:24] 
[21:11:24] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:11:28] CoreStatus = 7A (122)
[21:11:28] Sending work to server
[21:11:28] Project: 5764 (Run 14, Clone 29, Gen 88)
[21:11:28] - Read packet limit of 540015616... Set to 524286976.
[21:11:28] - Error: Could not get length of results file work/wuresults_06.dat
[21:11:28] - Error: Could not read unit 06 file. Removing from queue.

There's no data in the DB, and it fails at a different percentage on each attempt (73%, 64%, 16%, then 77%)... I have a bad feeling about my GPU :?
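For anyone who wants to compare failure points across attempts without scrolling through the whole log, here is a minimal Python sketch. It relies only on the "Completed N%" and "Core Shutdown: UNSTABLE_MACHINE" lines as they appear in the log above. The function name `failure_points` and the shortened `sample` text are assumptions made for illustration and are not part of the client.

```python
import re

# Matches client log lines such as "[18:34:35] Completed 73%".
COMPLETED = re.compile(r"^\[(\d\d:\d\d:\d\d)\] Completed (\d+)%")
# Text the core prints when it aborts the work unit.
UNSTABLE = "Core Shutdown: UNSTABLE_MACHINE"


def failure_points(log_text):
    """Return (timestamp, percent) of the last progress line before each failure.

    Hypothetical helper: it only scans for the two line types named above.
    """
    failures = []
    last = None  # most recent (timestamp, percent) seen in the current attempt
    for line in log_text.splitlines():
        match = COMPLETED.match(line)
        if match:
            last = (match.group(1), int(match.group(2)))
        elif UNSTABLE in line:
            if last is not None:
                failures.append(last)
            last = None  # reset so the next attempt starts clean
    return failures


# Shortened excerpt in the same format as the log above (assumed sample).
sample = """\
[17:22:58] Completed 1%
[18:34:35] Completed 73%
[18:34:35] NANs detected on GPU
[18:34:35] Folding@home Core Shutdown: UNSTABLE_MACHINE
[18:35:55] Completed 1%
[19:38:27] Completed 64%
[19:38:27] Folding@home Core Shutdown: UNSTABLE_MACHINE
"""

for timestamp, percent in failure_points(sample):
    print(f"{timestamp}  failed after {percent}%")
```

Run against a full FAHlog.txt, a list of widely scattered percentages like the one in this report would suggest the failure is not tied to one particular point in the simulation.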

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.

Re: Project: 5764 (Run 14, Clone 29, Gen 88) - UM at various %

Post by toTOW »

I was finally able to complete it on the next try... weird.

P5-133XL
Posts: 2948
Joined: Sun Dec 02, 2007 4:36 am
Hardware configuration: Machine #1:

Intel Q9450; 2x2GB=8GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460; Windows Server 2008 X64 (SP1).

Machine #2:

Intel Q6600; 2x2GB=4GB Ram; Gigabyte GA-X48-DS4 Motherboard; PC Power and Cooling Q750 PS; 2x GTX 460 video card; Windows 7 X64.

Machine #3:

Dell Dimension 8400; 3.2GHz P4; 4x512MB Ram; GTX 460 video card; Windows 7 X32.

I am currently folding just on the 5x GTX 460s for approx. 70K PPD
Location: Salem, OR, USA

Re: Project: 5764 (Run 14, Clone 29, Gen 88) - UM at various %

Post by P5-133XL »

I am finding that both my ATI and my Nvidia cards seem to need a periodic reboot. Otherwise they start having problems, and the problems continue until the reboot. I also find that they act up sooner the more they are used for other purposes, even non-3D games like solitaire. They really prefer being dedicated folders with a weekly reboot.

I would like to pin down the problem, but so far I haven't seen a clearer pattern.

Re: Project: 5764 (Run 14, Clone 29, Gen 88) - UM at various %

Post by toTOW »

I did a complete shutdown of the machine, switching the power off with the button on the back of the PSU, but it didn't help. (The report in this thread is from after the shutdown; the one in the other thread was from before.)