Project: 6013 (Run 0, Clone 78, Gen 91) - will not meet dead

Moderators: Site Moderators, FAHC Science Team

goodyca
Posts: 187
Joined: Sun Dec 02, 2007 12:36 pm

Post by goodyca »

One of my clients received the subject WU this morning at 06:11:11 UTC. It took 5 hours to complete 5% of the unit. The WU has a 3-day deadline. At 1% per hour the unit would need about 100 hours, so the 72-hour deadline will not be met. This client normally completes project 6013 WUs in about 8 hours. I deleted the subject WU, but the server assigned the same WU to the client again. I am running the Linux version 6.29 client.
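To make the arithmetic explicit, here is a minimal Python sketch. The function names are hypothetical and chosen only for illustration, and it assumes the progress rate stays constant and that the deadline clock starts when the WU is received.

```python
def projected_total_hours(percent_done, hours_elapsed):
    """Extrapolate total run time, assuming a constant progress rate."""
    return hours_elapsed / percent_done * 100.0


def will_meet_deadline(percent_done, hours_elapsed, deadline_hours):
    """True if the extrapolated run time fits inside the deadline."""
    return projected_total_hours(percent_done, hours_elapsed) <= deadline_hours


if __name__ == "__main__":
    # Figures from the post: 5% done after 5 hours, 3-day (72-hour) deadline.
    print(projected_total_hours(5, 5))
    print(will_meet_deadline(5, 5, deadline_hours=3 * 24))
```

With the usual pace for this project (100% in about 8 hours) the same check passes easily, which is what makes the current rate stand out.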

What should I do with the subject WU?

I have included the relevant part of the log file.

Code:

--- Opening Log file [April 17 15:05:27 UTC] 


# Linux SMP Console Edition ###################################################
###############################################################################

                       Folding@Home Client Version 6.29

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/craig/folding
Executable: ./fah6
Arguments: -smp 4 -advmethods -verbosity 9 

[15:05:27] - Ask before connecting: No
[15:05:27] - User name: goodyca (Team 439)
...
[05:58:14] Completed 495000 out of 500000 steps  (99%)
[06:04:12] Completed 500000 out of 500000 steps  (100%)
[06:04:12] DynamicWrapper: Finished Work Unit: sleep=10000
[06:04:22] 
[06:04:22] Finished Work Unit:
[06:04:22] - Reading up to 20446800 from "work/wudata_04.trr": Read 20446800
[06:04:22] trr file hash check passed.
[06:04:22] edr file hash check passed.
[06:04:22] logfile size: 56280
[06:04:22] Leaving Run
[06:04:24] - Writing 20537364 bytes of core data to disk...
[06:04:25]   ... Done.
[06:04:27] - Shutting down core
[06:04:27] 
[06:04:27] Folding@home Core Shutdown: FINISHED_UNIT
[06:04:27] CoreStatus = 64 (100)
[06:04:27] Unit 4 finished with 93 percent of time to deadline remaining.
[06:04:27] Updated performance fraction: 0.927901
[06:04:27] Sending work to server
[06:04:27] Project: 6012 (Run 2, Clone 418, Gen 45)


[06:04:27] + Attempting to send results [April 23 06:04:27 UTC]
[06:04:27] - Reading file work/wuresults_04.dat from core
[06:04:28]   (Read 20537364 bytes from disk)
[06:04:28] Connecting to http://130.237.232.140:8080/
[06:10:58] Posted data.
[06:10:58] Initial: 0000; - Uploaded at ~51 kB/s
[06:10:59] - Averaged speed for that direction ~51 kB/s
[06:10:59] + Results successfully sent
[06:10:59] Thank you for your contribution to Folding@Home.
[06:10:59] + Number of Units Completed: 84

[06:11:00] Trying to send all finished work units
[06:11:00] + No unsent completed units remaining.
[06:11:00] - Preparing to get new work unit...
[06:11:00] Cleaning up work directory
[06:11:01] + Attempting to get work packet
[06:11:01] Passkey found
[06:11:01] - Will indicate memory of 3954 MB
[06:11:01] - Connecting to assignment server
[06:11:01] Connecting to http://assign.stanford.edu:8080/
[06:11:01] Posted data.
[06:11:01] Initial: ED82; - Successful: assigned to (130.237.232.140).
[06:11:01] + News From Folding@Home: Welcome to Folding@Home
[06:11:01] Loaded queue successfully.
[06:11:01] Connecting to http://130.237.232.140:8080/
[06:11:06] Posted data.
[06:11:06] Initial: 0000; - Receiving payload (expected size: 978483)
[06:11:11] - Downloaded at ~191 kB/s
[06:11:11] - Averaged speed for that direction ~199 kB/s
[06:11:11] + Received work.
[06:11:11] Trying to send all finished work units
[06:11:11] + No unsent completed units remaining.
[06:11:11] + Closed connections
[06:11:11] 
[06:11:11] + Processing work unit
[06:11:11] Core required: FahCore_a3.exe
[06:11:11] Core found.
[06:11:11] Working on queue slot 05 [April 23 06:11:11 UTC]
[06:11:11] + Working ...
[06:11:11] - Calling './FahCore_a3.exe -dir work/ -nice 19 -suffix 05 -np 4 -checkpoint 15 -verbose -lifeline 2620 -version 629'

[06:11:11] 
[06:11:11] *------------------------------*
[06:11:11] Folding@Home Gromacs SMP Core
[06:11:11] Version 2.17 (March 6, 2010)
[06:11:11] 
[06:11:11] Preparing to commence simulation
[06:11:11] - Looking at optimizations...
[06:11:11] - Created dyn
[06:11:11] - Files status OK
[06:11:11] - Expanded 977971 -> 10427873 (decompressed 1066.2 percent)
[06:11:11] Called DecompressByteArray: compressed_data_size=977971 data_size=10427873, decompressed_data_size=10427873 diff=0
[06:11:11] - Digital signature verified
[06:11:11] 
[06:11:11] Project: 6013 (Run 0, Clone 78, Gen 91)
[06:11:11] 
[06:11:11] Assembly optimizations on if available.
[06:11:11] Entering M.D.
[06:11:32] Completed 0 out of 250000 steps  (0%)
[07:12:49] Completed 2500 out of 250000 steps  (1%)
[08:12:47] Completed 5000 out of 250000 steps  (2%)
[09:05:27] - Autosending finished units... [April 23 09:05:27 UTC]
[09:05:27] Trying to send all finished work units
[09:05:27] + No unsent completed units remaining.
[09:05:27] - Autosend completed
[09:12:55] Completed 7500 out of 250000 steps  (3%)
[10:12:49] Completed 10000 out of 250000 steps  (4%)
[11:12:42] Completed 12500 out of 250000 steps  (5%)
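
The rate can also be read straight from the "Completed N out of M steps" lines in a log like the one above. The following is a rough Python sketch, not part of the client: the helper names are made up, the regular expression only assumes the line format visible in this log, and it should be fed the lines of a single work unit.

```python
import re

# Matches lines such as "[11:12:42] Completed 12500 out of 250000 steps  (5%)".
PROGRESS = re.compile(
    r"^\[(\d\d):(\d\d):(\d\d)\] Completed (\d+) out of (\d+) steps"
)


def progress_points(lines):
    """Return (seconds, fraction_done) pairs for one work unit's progress lines.

    The log timestamps carry no date, so a timestamp that goes backwards
    is treated as having crossed midnight.
    """
    points = []
    day_offset = 0
    previous = None
    for line in lines:
        match = PROGRESS.match(line)
        if not match:
            continue
        hours, minutes, seconds, done, total = map(int, match.groups())
        stamp = hours * 3600 + minutes * 60 + seconds
        if previous is not None and stamp < previous:
            day_offset += 24 * 3600
        previous = stamp
        points.append((stamp + day_offset, done / total))
    return points


def hours_per_percent(points):
    """Average hours needed per percent between the first and last point."""
    (t0, f0), (t1, f1) = points[0], points[-1]
    return (t1 - t0) / 3600 / ((f1 - f0) * 100)


if __name__ == "__main__":
    sample = [
        "[06:11:32] Completed 0 out of 250000 steps  (0%)",
        "[11:12:42] Completed 12500 out of 250000 steps  (5%)",
    ]
    rate = hours_per_percent(progress_points(sample))
    print(f"{rate:.2f} hours per percent, about {rate * 100:.0f} hours in total")
```

For the two sample lines taken from this log, the result is close to one hour per percent, which matches the estimate in the post.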
bollix47
Posts: 2976
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: Project: 6013 (Run 0, Clone 78, Gen 91) - will not meet dead

Post by bollix47 »

This problem has been reported before.

Sometimes simply restarting the client solves the problem: press Ctrl-C to stop the client, wait a few seconds to make sure the a3 core has stopped (you can check with top or a system monitor), then start the client again. If you use top to display the processes, type H (Shift+h) to show the threads.
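
If you would rather script the "has the core stopped?" check than watch top, something like the following Python sketch might do. It is only an illustration: it assumes a Linux-style /proc filesystem, the function name is invented, and the process name FahCore_a3 is taken from the log in the first post.

```python
from pathlib import Path


def core_pids(name="FahCore_a3", proc_root="/proc"):
    """Return PIDs whose command (argv[0]) contains `name`.

    Assumes the Linux /proc layout, where /proc/<pid>/cmdline holds the
    NUL-separated argument list. Returns an empty list if /proc is absent.
    """
    root = Path(proc_root)
    if not root.is_dir():
        return []
    pids = []
    for entry in root.iterdir():
        if not entry.name.isdigit():
            continue
        try:
            cmdline = (entry / "cmdline").read_bytes()
        except OSError:
            continue  # process exited or is not readable
        argv0 = cmdline.split(b"\0", 1)[0].decode(errors="replace")
        if name in argv0:
            pids.append(int(entry.name))
    return sorted(pids)


if __name__ == "__main__":
    running = core_pids()
    if running:
        print("Core still running, PIDs:", running)
    else:
        print("No a3 core process found; safe to restart the client.")
```

An empty result after Ctrl-C suggests the core has exited and the client can be started again.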

If the problem persists, remove queue.dat, unitinfo.txt and the contents of the work folder. If you keep getting the same WU, you can also remove machinedependent.dat; it will be recreated when you restart, and the server shouldn't give you the same WU. Removing that .dat file increases your active client count by one, but that should correct itself in seven days.
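
For anyone who wants to script the cleanup, here is a small Python sketch of the steps above. The function and its parameters are hypothetical; it assumes the client has already been stopped and that queue.dat, unitinfo.txt, machinedependent.dat and the work folder all sit directly in the client's launch directory. It defaults to a dry run so nothing is deleted by accident.

```python
import shutil
from pathlib import Path


def reset_client_state(client_dir, drop_machine_id=False, dry_run=True):
    """Remove queue.dat, unitinfo.txt and the contents of work/.

    With drop_machine_id=True, machinedependent.dat is removed as well.
    With dry_run=True nothing is deleted; the function only reports
    what it would remove. Returns the list of affected paths.
    """
    client_dir = Path(client_dir)
    targets = [client_dir / "queue.dat", client_dir / "unitinfo.txt"]
    work = client_dir / "work"
    if work.is_dir():
        # Empty the work folder but keep the folder itself.
        targets.extend(sorted(work.iterdir()))
    if drop_machine_id:
        targets.append(client_dir / "machinedependent.dat")

    affected = []
    for path in targets:
        if not path.exists():
            continue
        if not dry_run:
            if path.is_dir():
                shutil.rmtree(path)
            else:
                path.unlink()
        affected.append(path)
    return affected
```

A cautious way to use it is to call `reset_client_state("/home/craig/folding")` first (the launch directory shown in the log), read the returned list, and only then repeat the call with `dry_run=False`, adding `drop_machine_id=True` if the same WU keeps coming back.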
goodyca
Posts: 187
Joined: Sun Dec 02, 2007 12:36 pm

Re: Project: 6013 (Run 0, Clone 78, Gen 91) - will not meet dead

Post by goodyca »

Deleting queue.dat, unitinfo.txt and the contents of the work folder, then restarting the client, did not fix the problem. I also had to delete machinedependent.dat to get a different WU. Thanks for your help.