Project: 6509 (Run 15, Clone 232, Gen 14)

Moderators: Site Moderators, FAHC Science Team

todh
Posts: 7
Joined: Wed Sep 08, 2010 1:30 pm

Project: 6509 (Run 15, Clone 232, Gen 14)

Post by todh »

I've got a machine that's been stuck on Project: 6509 (Run 15, Clone 232, Gen 14) for a while now:

[02:42:51] Writing local files
[02:42:51] Completed 15000 out of 250000 steps  (6%)
[02:57:52] Timered checkpoint triggered.
[03:09:04] CoreStatus = 0 (0)
[03:09:04] Sending work to server
[03:09:04] Project: 6509 (Run 15, Clone 232, Gen 14)
[03:09:04] - Error: Could not get length of results file work/wuresults_09.dat
[03:09:04] - Error: Could not read unit 09 file. Removing from queue.
[03:09:04] Trying to send all finished work units
[03:09:04] + No unsent completed units remaining.
[03:09:04] - Preparing to get new work unit...
[03:09:04] Cleaning up work directory
--
[05:54:47] Writing local files
[05:54:47] Completed 15000 out of 250000 steps  (6%)
[06:09:47] Timered checkpoint triggered.
[06:21:01] CoreStatus = 0 (0)
[06:21:01] Sending work to server
[06:21:01] Project: 6509 (Run 15, Clone 232, Gen 14)
[06:21:01] - Error: Could not get length of results file work/wuresults_00.dat
[06:21:01] - Error: Could not read unit 00 file. Removing from queue.
[06:21:01] Trying to send all finished work units
[06:21:01] + No unsent completed units remaining.
[06:21:01] - Preparing to get new work unit...
[06:21:01] Cleaning up work directory
--
[09:06:59] Writing local files
[09:06:59] Completed 15000 out of 250000 steps  (6%)
[09:21:59] Timered checkpoint triggered.
[09:33:14] CoreStatus = 0 (0)
[09:33:14] Sending work to server
[09:33:14] Project: 6509 (Run 15, Clone 232, Gen 14)
[09:33:14] - Error: Could not get length of results file work/wuresults_01.dat
[09:33:14] - Error: Could not read unit 01 file. Removing from queue.
[09:33:14] Trying to send all finished work units
[09:33:14] + No unsent completed units remaining.
[09:33:14] - Preparing to get new work unit...
[09:33:14] Cleaning up work directory
--
[12:18:42] Writing local files
[12:18:42] Completed 15000 out of 250000 steps  (6%)
[12:33:42] Timered checkpoint triggered.
[12:44:53] CoreStatus = 0 (0)
[12:44:53] Sending work to server
[12:44:53] Project: 6509 (Run 15, Clone 232, Gen 14)
[12:44:53] - Error: Could not get length of results file work/wuresults_02.dat
[12:44:53] - Error: Could not read unit 02 file. Removing from queue.
[12:44:53] Trying to send all finished work units
[12:44:53] + No unsent completed units remaining.
[12:44:53] - Preparing to get new work unit...
[12:44:53] Cleaning up work directory
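
The excerpt above shows the same work unit failing at upload over and over. A minimal sketch of how that pattern could be counted from a saved copy of the client log, assuming the line formats match the excerpt; the function name `count_failed_uploads` and both regular expressions are illustrative and not part of the Folding@Home client:

```python
import re
from collections import Counter

# Patterns inferred from the log excerpt in this thread (an assumption,
# not an official description of the client's log format).
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
ERROR_RE = re.compile(r"Error: Could not get length of results file (\S+)")


def count_failed_uploads(log_text):
    """Count 'could not get length of results file' errors per work unit.

    Each error is attributed to the most recent Project/Run/Clone/Gen line
    that precedes it. Returns a Counter keyed by (project, run, clone, gen).
    """
    failures = Counter()
    current = None
    for line in log_text.splitlines():
        match = PROJECT_RE.search(line)
        if match:
            current = tuple(int(group) for group in match.groups())
            continue
        if current is not None and ERROR_RE.search(line):
            failures[current] += 1
            current = None  # wait for the next Project line before counting again
    return failures
```

A work unit that shows up with a count of two or more is probably worth reporting in a thread like this one rather than letting the client keep retrying it.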
todh
Posts: 7
Joined: Wed Sep 08, 2010 1:30 pm

Re: Project: 6509 (Run 15, Clone 232, Gen 14)

Post by todh »

Any word on this one? It's still failing at the same place. Should I just delete it?

[01:05:43] Completed 15000 out of 250000 steps  (6%)
[01:20:43] Timered checkpoint triggered.
[01:31:54] CoreStatus = 0 (0)
[01:31:54] Sending work to server
[01:31:54] Project: 6509 (Run 15, Clone 232, Gen 14)
[01:31:54] - Error: Could not get length of results file work/wuresults_06.dat
[01:31:54] - Error: Could not read unit 06 file. Removing from queue.
[01:31:54] Trying to send all finished work units
[01:31:54] + No unsent completed units remaining.
[01:31:54] - Preparing to get new work unit...
[01:31:54] Cleaning up work directory
John_Weatherman
Posts: 289
Joined: Sun Dec 02, 2007 4:31 am
Location: Carrizo Plain National Monument, California

Re: Project: 6509 (Run 15, Clone 232, Gen 14)

Post by John_Weatherman »

todh wrote: Any word on this one? It's still failing at the same place. Should I just delete it?
Yep. You might have to delete the queue.dat too to get a new WU.
todh
Posts: 7
Joined: Wed Sep 08, 2010 1:30 pm

Re: Project: 6509 (Run 15, Clone 232, Gen 14)

Post by todh »

John_Weatherman wrote: Yep. You might have to delete the queue.dat too to get a new WU.
Mmm, apparently that's not enough. It came back:

[12:26:56] Loaded queue successfully.
[12:26:56] Printing Queue Information
Current Queue: 
Slot 01  Empty/Deleted

Slot 02  Empty/Deleted

Slot 03  Empty/Deleted

Slot 04  Empty/Deleted

Slot 05  Empty/Deleted

Slot 06  Empty/Deleted

Slot 07  Empty/Deleted

Slot 08  Empty/Deleted

Slot 09  Empty/Deleted

Slot 00 *Empty/Deleted

PF: 0.000000 based on last 0 slot(s)

Folding@Home Client Shutdown.


--- Opening Log file [September 29 12:29:01 UTC] 


# Linux Console Edition #######################################################
###############################################################################

                       Folding@Home Client Version 6.29

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/fah/hitchcock
Executable: /home/fah/bin/fah6
Arguments: -verbosity 9 

[12:29:01] - Ask before connecting: No
[12:29:01] - User name: bondcliff (Team 163)
[12:29:01] - User ID: 5B39A2904F168873
[12:29:01] - Machine ID: 4
[12:29:01] 
[12:29:01] Loaded queue successfully.
[12:29:01] - Preparing to get new work unit...
[12:29:01] Cleaning up work directory
[12:29:01] - Autosending finished units... [September 29 12:29:01 UTC]
[12:29:01] Trying to send all finished work units
[12:29:01] + No unsent completed units remaining.
[12:29:01] - Autosend completed
[12:29:01] + Attempting to get work packet
[12:29:01] - Will indicate memory of 983 MB
[12:29:01] - Connecting to assignment server
[12:29:01] Connecting to http://assign.stanford.edu:8080/
[12:29:01] Posted data.
[12:29:01] Initial: 40AB; - Successful: assigned to (171.64.65.62).
[12:29:01] + News From Folding@Home: Welcome to Folding@Home
[12:29:02] Loaded queue successfully.
[12:29:02] Connecting to http://171.64.65.62:8080/
[12:29:03] Posted data.
[12:29:03] Initial: 0000; - Receiving payload (expected size: 997268)
[12:29:04] - Downloaded at ~973 kB/s
[12:29:04] - Averaged speed for that direction ~973 kB/s
[12:29:04] + Received work.
[12:29:04] + Closed connections
[12:29:04] 
[12:29:04] + Processing work unit
[12:29:04] Core required: FahCore_78.exe
[12:29:04] Core found.
[12:29:04] Working on queue slot 01 [September 29 12:29:04 UTC]
[12:29:04] + Working ...
[12:29:04] - Calling './FahCore_78.exe -dir work/ -nice 19 -suffix 01 -np 0 -checkpoint 15 -verbose -lifeline 3805 -version 629'

[12:29:04] 
[12:29:04] *------------------------------*
[12:29:04] Folding@Home Gromacs Core
[12:29:04] Version 1.90 (March 8, 2006)
[12:29:04] 
[12:29:04] Preparing to commence simulation
[12:29:04] - Looking at optimizations...
[12:29:04] - Created dyn
[12:29:04] - Files status OK
[12:29:05] - Expanded 996756 -> 5048061 (decompressed 506.4 percent)
[12:29:05] - Starting from initial work packet
[12:29:05] 
[12:29:05] Project: 6509 (Run 15, Clone 232, Gen 14)
[12:29:05] 
[12:29:05] Assembly optimizations on if available.
[12:29:05] Entering M.D.
[12:29:11] Protein: TR574_16 in water
[12:29:11] 
[12:29:11] Writing local files
[12:29:11] Extra SSE boost OK.
[12:29:12] Writing local files
[12:29:12] Completed 0 out of 250000 steps  (0%)
[12:44:13] Timered checkpoint triggered.
I've shut down the client for the time being.
John_Weatherman
Posts: 289
Joined: Sun Dec 02, 2007 4:31 am
Location: Carrizo Plain National Monument, California

Re: Project: 6509 (Run 15, Clone 232, Gen 14)

Post by John_Weatherman »

OK, try deleting the work folder. If that doesn't work, try changing the machine ID. Hopefully one of those will do the trick.
In the meantime, the Mods could hopefully pass on to Stanford that this is a duff WU.
todh
Posts: 7
Joined: Wed Sep 08, 2010 1:30 pm

Re: Project: 6509 (Run 15, Clone 232, Gen 14)

Post by todh »

John_Weatherman wrote: OK, try deleting the work folder. If that doesn't work, try changing the machine ID. Hopefully one of those will do the trick.
In the meantime, the Mods could hopefully pass on to Stanford that this is a duff WU.
Wow, it took changing the machine ID to get a new WU; deleting the work folder and queue.dat wasn't enough. Thanks.
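
For anyone repeating the first two steps tried in this thread, here is a rough sketch of clearing the local work state. It assumes the client is already shut down and that `work/` and `queue.dat` sit directly in the client's launch directory, as the paths in the logs suggest. The function name `clear_local_work` is hypothetical. Changing the machine ID, which is what finally helped here, is not covered and has to be done through the client's own configuration.

```python
import shutil
from pathlib import Path


def clear_local_work(client_dir):
    """Delete the work folder and queue.dat from a stopped client's directory.

    Assumes the layout seen in this thread's logs: a 'work' directory and a
    'queue.dat' file directly under the launch directory.
    Returns the names of the items that were actually removed.
    """
    client_dir = Path(client_dir)
    removed = []

    work_dir = client_dir / "work"
    if work_dir.is_dir():
        shutil.rmtree(work_dir)  # removes wuresults_*.dat and other WU files
        removed.append("work")

    queue_file = client_dir / "queue.dat"
    if queue_file.is_file():
        queue_file.unlink()
        removed.append("queue.dat")

    return removed
```

Calling `clear_local_work("/home/fah/hitchcock")` would match the launch directory shown in the log above. As this thread shows, the server may still hand the same WU back afterwards.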
sortofageek
Site Admin
Posts: 3110
Joined: Fri Nov 30, 2007 8:06 pm
Location: Team Helix

Re: Project: 6509 (Run 15, Clone 232, Gen 14)

Post by sortofageek »

For the record, Project: 6509 (Run 15, Clone 232, Gen 14) was added to the stats database on 2010-10-09 07:08:18 for 192 points of credit.