Project: 5756 (Run 10, Clone 146, Gen 223)

Posted: Wed May 06, 2009 5:28 am
by Amaruk
And now for something a little different....

Card is 8800 GT 256MB G92 at vendor clocks - 700/750/1700 (3rd of 3)

X2 5600 @ 2.9 GHz, 4 GB PC2-6400, XP Pro32 SP2, MSI K9A2 Platinum, 6.23 Systray, 181.20 drivers

[19:54:36] - Preparing to get new work unit...
[19:54:36] + Attempting to get work packet
[19:54:36] - Will indicate memory of 2815 MB
[19:54:36] - Connecting to assignment server
[19:54:36] Connecting to http://assign-GPU.stanford.edu:8080/
[19:54:36] Posted data.
[19:54:36] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[19:54:36] + News From Folding@Home: GPU folding beta
[19:54:36] Loaded queue successfully.
[19:54:36] Connecting to http://171.67.108.11:8080/
[19:54:36] Posted data.
[19:54:36] Initial: 0000; - Receiving payload (expected size: 99259)
[19:54:37] - Downloaded at ~96 kB/s
[19:54:37] - Averaged speed for that direction ~105 kB/s
[19:54:37] + Received work.
[19:54:37] Trying to send all finished work units
[19:54:37] + No unsent completed units remaining.
[19:54:37] + Closed connections
[19:54:42] 
[19:54:42] + Processing work unit
[19:54:42] Core required: FahCore_11.exe
[19:54:42] Core found.
[19:54:42] Working on queue slot 00 [May 5 19:54:42 UTC]
[19:54:42] + Working ...
[19:54:42] - Calling '.\FahCore_11.exe -dir work/ -suffix 00 -priority 96 -checkpoint 15 -verbose -lifeline 3616 -version 623'

[19:54:42] 
[19:54:42] *------------------------------*
[19:54:42] Folding@Home GPU Core - Beta
[19:54:42] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[19:54:42] 
[19:54:42] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[19:54:42] Build host: amoeba
[19:54:42] Board Type: Nvidia
[19:54:42] Core      : 
[19:54:42] Preparing to commence simulation
[19:54:42] - Looking at optimizations...
[19:54:42] - Created dyn
[19:54:42] - Files status OK
[19:54:42] - Expanded 98747 -> 492276 (decompressed 498.5 percent)
[19:54:42] Called DecompressByteArray: compressed_data_size=98747 data_size=492276, decompressed_data_size=492276 diff=0
[19:54:42] - Digital signature verified
[19:54:42] 
[19:54:42] Project: 5756 (Run 10, Clone 146, Gen 223)
[19:54:42] 
[19:54:42] Assembly optimizations on if available.
[19:54:42] Entering M.D.
[19:54:48] Working on Protein
[19:54:52] Client config found, loading data.
[19:54:52] Starting GUI Server
[19:57:05] Completed 1%
[19:59:18] Completed 2%
[20:01:30] Completed 3%
[20:03:43] Completed 4%
[20:05:55] Completed 5%
[20:08:08] Completed 6%
[20:10:21] Completed 7%
[20:12:33] Completed 8%
[20:14:46] Completed 9%
[20:16:58] Completed 10%
[20:19:12] Completed 11%
[20:21:24] Completed 12%
[20:23:37] Completed 13%
[20:25:50] Completed 14%
[20:28:02] Completed 15%
[20:30:15] Completed 16%
[20:32:28] Completed 17%
[20:34:40] Completed 18%
[20:36:53] Completed 19%
[20:39:05] Completed 20%
[20:41:18] Completed 21%
[20:43:31] Completed 22%
[20:45:43] Completed 23%
[20:47:56] Completed 24%
[20:50:09] Completed 25%
[20:52:21] Completed 26%
[20:54:34] Completed 27%
[20:56:47] Completed 28%
[20:58:59] Completed 29%
[21:01:12] Completed 30%
[21:03:24] Completed 31%
[21:05:38] Completed 32%
[21:07:50] Completed 33%
[21:10:03] Completed 34%
[21:12:15] Completed 35%
[21:14:28] Completed 36%
[21:16:40] Completed 37%
[21:18:53] Completed 38%
[21:21:06] Completed 39%
[21:23:18] Completed 40%
[21:25:31] Completed 41%
[21:27:44] Completed 42%
[21:29:57] Completed 43%
[21:32:09] Completed 44%
[21:34:22] Completed 45%
[21:36:35] Completed 46%
[21:38:47] Completed 47%
[21:41:00] Completed 48%
[21:43:12] Completed 49%
[21:45:25] Completed 50%
[21:47:38] Completed 51%
[21:49:50] Completed 52%
[21:52:03] Completed 53%
[21:54:16] Completed 54%
[21:56:28] Completed 55%
[21:58:41] Completed 56%
[22:00:53] Completed 57%
[22:03:06] Completed 58%
[22:05:18] Completed 59%
[22:07:32] Completed 60%
[22:09:44] Completed 61%
[22:11:57] Completed 62%
[22:14:10] Completed 63%
[22:16:22] Completed 64%
[22:18:35] Completed 65%
[22:20:47] Completed 66%
[22:23:00] Completed 67%
[22:25:13] Completed 68%
[22:27:25] Completed 69%
[22:29:39] Completed 70%
[22:31:53] Completed 71%
[22:34:05] Completed 72%
[22:34:05] mdrun_gpu returned 
[22:34:05] Going to send back what have done -- stepsTotalG=10000000
[22:34:05] Work fraction=0.7200 steps=10000000.
[22:34:09] logfile size=22634 infoLength=22634 edr=0 trr=25
About three hours later I got home. No popups or error dialogs this time. Four seconds after writing 'Going to send back what have done', everything appears to have died. No core, no client, nothing. There are no XP error logs around 22:34:09, and all client logs (3 GPU + 1 CPU) end abruptly.


System reset. Here is the log after restart.

--- Opening Log file [May 6 01:24:46 UTC] 


# Windows GPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.23

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Documents and Settings\Hammersmythe\Application Data\Folding@home-gpu3
Arguments: -gpu 2 -verbosity 9 -forcegpu nvidia_g80 

[01:24:46] - Ask before connecting: No
[01:24:46] - User name: Amaruk (Team 50625)
[01:24:46] - User ID: C75791562E71B25
[01:24:46] - Machine ID: 4
[01:24:46] 
[01:24:46] Loaded queue successfully.
[01:24:46] Initialization complete
[01:24:46] 
[01:24:46] + Processing work unit
[01:24:46] Core required: FahCore_11.exe
[01:24:46] Core found.
[01:24:46] - Autosending finished units... [May 6 01:24:46 UTC]
[01:24:46] Trying to send all finished work units
[01:24:46] + No unsent completed units remaining.
[01:24:46] - Autosend completed
[01:24:46] Working on queue slot 00 [May 6 01:24:46 UTC]
[01:24:46] + Working ...
[01:24:46] - Calling '.\FahCore_11.exe -dir work/ -suffix 00 -priority 96 -checkpoint 15 -verbose -lifeline 4028 -version 623'

[01:24:46] 
[01:24:46] *------------------------------*
[01:24:46] Folding@Home GPU Core - Beta
[01:24:46] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[01:24:46] 
[01:24:46] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[01:24:46] Build host: amoeba
[01:24:46] Board Type: Nvidia
[01:24:46] Core      : 
[01:24:46] Preparing to commence simulation
[01:24:46] - Looking at optimizations...
[01:24:46] - Created dyn
[01:24:46] - Files status OK
[01:24:46] Error: Missing work file=<>
[01:24:46] 
[01:24:46] Folding@home Core Shutdown: MISSING_WORK_FILES
[01:24:50] CoreStatus = 74 (116)
[01:24:50] The core could not find the work files specified. Removing from queue
[01:24:50] Deleting current work unit & continuing...
[01:24:54] Trying to send all finished work units
[01:24:54] + No unsent completed units remaining.
[01:24:54] - Preparing to get new work unit...
[01:24:54] + Attempting to get work packet
[01:24:54] - Will indicate memory of 2815 MB
[01:24:54] - Detect CPU. Vendor: AuthenticAMD, Family: 15, Model: 11, Stepping: 2
[01:24:54] - Connecting to assignment server
[01:24:54] Connecting to http://assign-GPU.stanford.edu:8080/
[01:24:55] Posted data.
[01:24:55] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[01:24:55] + News From Folding@Home: GPU folding beta
[01:24:55] Loaded queue successfully.
[01:24:55] Connecting to http://171.67.108.11:8080/
[01:24:55] Posted data.
[01:24:55] Initial: 0000; - Receiving payload (expected size: 45851)
[01:24:55] Conversation time very short, giving reduced weight in bandwidth avg
[01:24:55] - Downloaded at ~89 kB/s
[01:24:55] - Averaged speed for that direction ~103 kB/s
[01:24:55] + Received work.
[01:24:55] + Closed connections
[01:25:00] 
[01:25:00] + Processing work unit
[01:25:00] Core required: FahCore_11.exe
[01:25:00] Core found.
[01:25:00] Working on queue slot 01 [May 6 01:25:00 UTC]
[01:25:00] + Working ...
[01:25:00] - Calling '.\FahCore_11.exe -dir work/ -suffix 01 -priority 96 -checkpoint 15 -verbose -lifeline 4028 -version 623'

[01:25:02] 
[01:25:02] *------------------------------*
[01:25:02] Folding@Home GPU Core - Beta
[01:25:02] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[01:25:02] 
[01:25:02] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[01:25:02] Build host: amoeba
[01:25:02] Board Type: Nvidia
[01:25:02] Core      : 
[01:25:02] Preparing to commence simulation
[01:25:02] - Looking at optimizations...
[01:25:02] - Created dyn
[01:25:02] - Files status OK
[01:25:03] - Expanded 45339 -> 251112 (decompressed 553.8 percent)
[01:25:03] Called DecompressByteArray: compressed_data_size=45339 data_size=251112, decompressed_data_size=251112 diff=0
[01:25:03] - Digital signature verified
[01:25:03] 
[01:25:03] Project: 5772 (Run 12, Clone 310, Gen 328)
[01:25:03] 
[01:25:03] Assembly optimizations on if available.
[01:25:03] Entering M.D.
[01:25:09] Working on Protein
[01:25:10] Client config found, loading data.
[01:25:10] Starting GUI Server
[01:26:06] Completed 1%
[01:27:03] Completed 2%
[01:28:00] Completed 3%
[01:28:56] Completed 4%
[01:29:53] Completed 5%
[01:30:50] Completed 6%
[01:31:46] Completed 7%
[01:32:43] Completed 8%
[01:33:40] Completed 9%
[01:34:36] Completed 10%
[01:35:33] Completed 11%
[01:36:30] Completed 12%
[01:37:26] Completed 13%
[01:38:23] Completed 14%
[01:39:20] Completed 15%
[01:40:16] Completed 16%
[01:41:13] Completed 17%
[01:42:10] Completed 18%
[01:43:06] Completed 19%
[01:44:03] Completed 20%
[01:45:00] Completed 21%
[01:45:56] Completed 22%
[01:46:53] Completed 23%
[01:47:50] Completed 24%
[01:48:46] Completed 25%
[01:49:43] Completed 26%
[01:50:40] Completed 27%
[01:51:36] Completed 28%
[01:52:33] Completed 29%
[01:53:30] Completed 30%
[01:54:26] Completed 31%
[01:55:23] Completed 32%
[01:56:20] Completed 33%
[01:57:16] Completed 34%
[01:58:13] Completed 35%
[01:59:10] Completed 36%
[02:00:06] Completed 37%
[02:01:03] Completed 38%
[02:02:00] Completed 39%
[02:02:57] Completed 40%
[02:03:54] Completed 41%
[02:04:50] Completed 42%
[02:05:47] Completed 43%
[02:06:44] Completed 44%
[02:07:40] Completed 45%
[02:08:37] Completed 46%
[02:09:34] Completed 47%
[02:10:30] Completed 48%
[02:11:27] Completed 49%
[02:12:24] Completed 50%
[02:13:20] Completed 51%
[02:14:17] Completed 52%
[02:15:14] Completed 53%
[02:16:10] Completed 54%
[02:17:07] Completed 55%
[02:18:04] Completed 56%
[02:19:00] Completed 57%
[02:19:57] Completed 58%
[02:20:54] Completed 59%
[02:21:50] Completed 60%
[02:22:47] Completed 61%
[02:23:44] Completed 62%
[02:24:40] Completed 63%
[02:25:37] Completed 64%
[02:26:34] Completed 65%
[02:27:30] Completed 66%
[02:28:27] Completed 67%
[02:29:24] Completed 68%
[02:30:20] Completed 69%
[02:31:17] Completed 70%
[02:32:14] Completed 71%
[02:33:10] Completed 72%
[02:34:07] Completed 73%
[02:35:04] Completed 74%
[02:36:01] Completed 75%
[02:36:57] Completed 76%
[02:37:54] Completed 77%
[02:38:51] Completed 78%
[02:39:47] Completed 79%
[02:40:44] Completed 80%
[02:41:41] Completed 81%
[02:42:37] Completed 82%
[02:43:34] Completed 83%
[02:44:31] Completed 84%
[02:45:27] Completed 85%
[02:46:24] Completed 86%
[02:47:21] Completed 87%
[02:48:18] Completed 88%
[02:49:14] Completed 89%
[02:50:11] Completed 90%
[02:51:08] Completed 91%
[02:52:04] Completed 92%
[02:53:01] Completed 93%
[02:53:58] Completed 94%
[02:54:54] Completed 95%
[02:55:51] Completed 96%
[02:56:48] Completed 97%
[02:57:44] Completed 98%
[02:58:41] Completed 99%
[02:59:38] Completed 100%
[02:59:38] Successful run
[02:59:38] DynamicWrapper: Finished Work Unit: sleep=10000
[02:59:48] Reserved 76172 bytes for xtc file; Cosm status=0
[02:59:48] Allocated 76172 bytes for xtc file
[02:59:48] - Reading up to 76172 from "work/wudata_01.xtc": Read 76172
[02:59:48] Read 76172 bytes from xtc file; available packet space=786354292
[02:59:48] xtc file hash check passed.
[02:59:48] Reserved 15168 15168 786354292 bytes for arc file=<work/wudata_01.trr> Cosm status=0
[02:59:48] Allocated 15168 bytes for arc file
[02:59:48] - Reading up to 15168 from "work/wudata_01.trr": Read 15168
[02:59:48] Read 15168 bytes from arc file; available packet space=786339124
[02:59:48] trr file hash check passed.
[02:59:48] Allocated 560 bytes for edr file
[02:59:48] Read bedfile
[02:59:48] edr file hash check passed.
[02:59:48] Allocated 33252 bytes for logfile
[02:59:48] Read logfile
[02:59:48] GuardedRun: success in DynamicWrapper
[02:59:48] GuardedRun: done
[02:59:48] Run: GuardedRun completed.
[02:59:52] - Writing 125664 bytes of core data to disk...
[02:59:52] Done: 125152 -> 99596 (compressed to 79.5 percent)
[02:59:52]   ... Done.
[02:59:52] - Shutting down core 
[02:59:52] 
[02:59:52] Folding@home Core Shutdown: FINISHED_UNIT
[02:59:55] CoreStatus = 64 (100)
[02:59:55] Unit 1 finished with 98 percent of time to deadline remaining.
[02:59:55] Updated performance fraction: 0.959986
[02:59:55] Sending work to server
[02:59:55] Project: 5772 (Run 12, Clone 310, Gen 328)
[02:59:55] - Read packet limit of 540015616... Set to 524286976.


[02:59:55] + Attempting to send results [May 6 02:59:55 UTC]
[02:59:55] - Reading file work/wuresults_01.dat from core
[02:59:55]   (Read 100108 bytes from disk)
[02:59:55] Connecting to http://171.67.108.11:8080/
[02:59:55] Posted data.
[02:59:56] Initial: 0000; - Uploaded at ~98 kB/s
[02:59:56] - Averaged speed for that direction ~82 kB/s
[02:59:56] + Results successfully sent
[02:59:56] Thank you for your contribution to Folding@Home.
[02:59:56] + Number of Units Completed: 2

[03:00:00] Trying to send all finished work units
[03:00:00] + No unsent completed units remaining.
[03:00:00] - Preparing to get new work unit...
[03:00:00] + Attempting to get work packet
[03:00:00] - Will indicate memory of 2815 MB
[03:00:00] - Connecting to assignment server
[03:00:00] Connecting to http://assign-GPU.stanford.edu:8080/
[03:00:00] Posted data.
[03:00:00] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[03:00:00] + News From Folding@Home: GPU folding beta
[03:00:00] Loaded queue successfully.
[03:00:00] Connecting to http://171.67.108.11:8080/
[03:00:00] Posted data.
[03:00:00] Initial: 0000; - Receiving payload (expected size: 45921)
[03:00:00] Conversation time very short, giving reduced weight in bandwidth avg
[03:00:00] - Downloaded at ~89 kB/s
[03:00:00] - Averaged speed for that direction ~102 kB/s
[03:00:00] + Received work.
[03:00:00] Trying to send all finished work units
[03:00:00] + No unsent completed units remaining.
[03:00:00] + Closed connections
[03:00:00] 
[03:00:00] + Processing work unit
[03:00:00] Core required: FahCore_11.exe
[03:00:00] Core found.
[03:00:00] Working on queue slot 02 [May 6 03:00:00 UTC]
[03:00:00] + Working ...
[03:00:00] - Calling '.\FahCore_11.exe -dir work/ -suffix 02 -priority 96 -checkpoint 15 -verbose -lifeline 4028 -version 623'

[03:00:01] 
[03:00:01] *------------------------------*
[03:00:01] Folding@Home GPU Core - Beta
[03:00:01] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[03:00:01] 
[03:00:01] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[03:00:01] Build host: amoeba
[03:00:01] Board Type: Nvidia
[03:00:01] Core      : 
[03:00:01] Preparing to commence simulation
[03:00:01] - Looking at optimizations...
[03:00:01] - Created dyn
[03:00:01] - Files status OK
[03:00:01] - Expanded 45409 -> 251112 (decompressed 553.0 percent)
[03:00:01] Called DecompressByteArray: compressed_data_size=45409 data_size=251112, decompressed_data_size=251112 diff=0
[03:00:01] - Digital signature verified
[03:00:01] 
[03:00:01] Project: 5769 (Run 13, Clone 239, Gen 255)
[03:00:01] 
[03:00:01] Assembly optimizations on if available.
[03:00:01] Entering M.D.
[03:00:07] Working on Protein
[03:00:08] Client config found, loading data.
[03:00:08] Starting GUI Server
[03:01:05] Completed 1%
[03:02:01] Completed 2%
I waited to see if it came back, but this time the server (171.67.108.11) did not give me the WU again.


Did I upload it, even though the client didn't record me doing so and erased the files at restart?

If I did not return it, why did the server not reissue it? Did anyone else finish it?


Also, this time I copied the entire App Data folder for this GPU prior to restart, in case anyone would like to have a look at any of the files.

Re: Project: 5756 (Run 10, Clone 146, Gen 223)

Posted: Wed May 06, 2009 6:16 am
by bruce
Whatever happened, the FahCore got really confused. In this case the message 'Error: Missing work file=<>' seems to indicate that it doesn't even know the name of the missing file. Following that error, the log does say 'Removing from queue', so it's not surprising that there is no report of an upload.
Amaruk wrote: If I did not return it, why did the server not reissue it? Did anyone else finish it?
The server logic associated with reissuing WUs is complex. You cannot draw any conclusions from the fact that you didn't get it again.

So far nobody has uploaded that WU.
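
For anyone who wants to check their own logs for the same pattern, here is a minimal Python sketch. It assumes only the log line shapes quoted in this thread ('Completed N%' and 'Folding@home Core Shutdown: STATUS'). The function name scan_log and the regular expressions are illustrative and not part of the client. It lists each core shutdown together with the last progress figure seen before it, and also returns any progress left at the end of the log with no shutdown line after it, which is roughly what an abrupt end looks like.

```python
import re

# Line shapes assumed from the logs quoted in this thread.
PROGRESS_RE = re.compile(r"^\[(\d\d:\d\d:\d\d)\] Completed (\d+)%")
SHUTDOWN_RE = re.compile(
    r"^\[(\d\d:\d\d:\d\d)\] Folding@home Core Shutdown: (\w+)"
)


def scan_log(lines):
    """Scan client log lines (hypothetical helper, not a client feature).

    Returns (events, trailing_percent) where:
      events is a list of (time, status, last_percent_before_shutdown)
      trailing_percent is the last 'Completed N%' seen after the final
      shutdown line, or None if there was none.
    """
    events = []
    last_percent = None
    for line in lines:
        progress = PROGRESS_RE.match(line)
        if progress:
            last_percent = int(progress.group(2))
            continue
        shutdown = SHUTDOWN_RE.match(line)
        if shutdown:
            events.append((shutdown.group(1), shutdown.group(2), last_percent))
            # Progress after this point belongs to the next work unit.
            last_percent = None
    return events, last_percent


if __name__ == "__main__":
    import sys

    # Usage: python scan_log.py FAHlog.txt  (the path is whatever your log is)
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        found, trailing = scan_log(handle)
    for time, status, percent in found:
        print(time, status, "last progress:", percent)
    print("progress with no shutdown line after it:", trailing)
```

Fed the first log in this thread, it finds no shutdown events and a trailing progress of 72, i.e. the core never logged a shutdown status for that WU.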

Re: Project: 5756 (Run 10, Clone 146, Gen 223)

Posted: Thu May 28, 2009 7:11 am
by Amaruk
bruce wrote: The server logic associated with reissuing WUs is complex. You cannot draw any conclusions from the fact that you didn't get it again.
To the best of my knowledge/forgettory, this was the first time since I started my GPUs up last year that I was not reassigned a WU I did not return.

I do agree that this doesn't mean I should have received it again. But in my limited experience it is unusual.

Turns out my problem has been driver related. All of the 185.xx and 181.xx drivers are unstable, and when they died the machine went down.

Now running 178.28 without any problems. Just goes to show newer is not necessarily better. :lol:

Thanks for all your help. :D