Project: 11268 (Run 2, Clone 40, Gen 17)

Posted: Thu Nov 04, 2010 7:44 pm
by Robby_Firefox
FYI:

There appeared to be some trouble with this run of Project 11268. Every time I stopped/restarted the client, it came up with a fatal error and had to be closed.

I moved all files under the \work sub-directory, plus the unitinfo.txt file in the main directory (Application Data\Folding@home-x86), to the Recycle Bin, then tried a restart. It seems to be working fine at the moment (the very same job was reloaded and is currently running), as shown in the second log snapshot below. We'll see how it runs. It should finish in about 16 hours or so if nothing else goes wrong.

So, a question for future occurrences of this: is it enough to delete just what I removed, or should more files be removed for the client to work?

Thanks,
Robby / Team Firefox

AMD X2/4200 w/3 gig RAM on Windows XP Pro

[14:29:50] Writing local files
[14:29:50] Completed 247500 out of 250000 steps  (99%)
[14:36:06] Writing local files
[14:36:06] Completed 250000 out of 250000 steps  (100%)
[14:36:06] Writing final coordinates.
[14:36:07] Past main M.D. loop
[14:36:48] + Working...
[14:37:07] 
[14:37:07] Finished Work Unit:
[14:37:07] - Reading up to 293544 from "work/wudata_08.arc": Read 293544
[14:37:07] - Reading up to 260692 from "work/wudata_08.xtc": Read 260692
[14:37:07] goefile size: 0
[14:37:07] logfile size: 16468
[14:37:07] Leaving Run
[14:37:08] - Writing 576612 bytes of core data to disk...
[14:37:08] Done: 576100 -> 560339 (compressed to 97.2 percent)
[14:37:08]   ... Done.
[14:37:08] - Shutting down core
[14:37:08] 
[14:37:08] Folding@home Core Shutdown: FINISHED_UNIT
[14:37:11] CoreStatus = 64 (100)
[14:37:11] Sending work to server
[14:37:11] Project: 11269 (Run 10, Clone 112, Gen 10)


[14:37:11] + Attempting to send results [November 4 14:37:11 UTC]
[14:37:14] + Results successfully sent
[14:37:14] Thank you for your contribution to Folding@Home.
[14:37:14] + Number of Units Completed: 314

[14:37:18] - Preparing to get new work unit...
[14:37:18] + Attempting to get work packet
[14:37:18] - Connecting to assignment server
[14:37:19] - Successful: assigned to (171.67.108.33).
[14:37:19] + News From Folding@Home: Welcome to Folding@Home
[14:37:19] Loaded queue successfully.
[14:37:21] + Closed connections
[14:37:21] 
[14:37:21] + Processing work unit
[14:37:21] Core required: FahCore_78.exe
[14:37:21] Core found.
[14:37:21] Working on queue slot 09 [November 4 14:37:21 UTC]
[14:37:21] + Working ...
[14:37:21] 
[14:37:21] *------------------------------*
[14:37:21] Folding@Home Gromacs Core
[14:37:21] Version 1.90 (March 8, 2006)
[14:37:21] 
[14:37:21] Preparing to commence simulation
[14:37:21] - Looking at optimizations...
[14:37:21] - Created dyn
[14:37:21] - Files status OK
[14:37:22] - Expanded 344206 -> 1737960 (decompressed 504.9 percent)
[14:37:22] - Starting from initial work packet
[14:37:22] 
[14:37:22] Project: 11268 (Run 2, Clone 40, Gen 17)
[14:37:22] 
[14:37:22] Assembly optimizations on if available.
[14:37:22] Entering M.D.
[14:37:28] Gromacs error.
[14:37:28] 
[14:37:28] Folding@home Core Shutdown: UNKNOWN_ERROR
[14:37:32] CoreStatus = 79 (121)
[14:37:32] Client-core communications error: ERROR 0x79
[14:37:32] This is a sign of more serious problems, shutting down.


--- Opening Log file [November 4 18:57:27 UTC] 


# Windows CPU Systray Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.23

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Documents and Settings\Robby\Application Data\Folding@home-x86


[18:57:27] - Ask before connecting: No
[18:57:27] - User name: Robby (Team 39299)
[18:57:27] - User ID: ****
[18:57:27] - Machine ID: 1
[18:57:27] 
[18:57:27] Loaded queue successfully.
[18:57:27] Initialization complete
[18:57:27] 
[18:57:27] + Processing work unit
[18:57:27] Core required: FahCore_78.exe
[18:57:27] Core found.
[18:57:27] Working on queue slot 09 [November 4 18:57:27 UTC]
[18:57:27] + Working ...
[18:57:27] 
[18:57:27] *------------------------------*
[18:57:27] Folding@Home Gromacs Core
[18:57:27] Version 1.90 (March 8, 2006)
[18:57:27] 
[18:57:27] Preparing to commence simulation
[18:57:27] - Ensuring status. Please wait.
[18:57:44] - Looking at optimizations...
[18:57:44] - Working with standard loops on this execution.
[18:57:44] - Created dyn
[18:57:44] - Files status OK
[18:57:45] - Expanded 344206 -> 1737960 (decompressed 504.9 percent)
[18:57:45] - Starting from initial work packet
[18:57:45] 
[18:57:45] Project: 11268 (Run 2, Clone 40, Gen 17)
[18:57:45] 
[18:57:45] Entering M.D.
[18:57:51] Gromacs error.
[18:57:51] 
[18:57:51] Folding@home Core Shutdown: UNKNOWN_ERROR
[18:57:53] CoreStatus = 79 (121)
[18:57:53] Client-core communications error: ERROR 0x79
[18:57:53] This is a sign of more serious problems, shutting down.

After deleting the above-noted files...

[19:06:29] This is a sign of more serious problems, shutting down.


--- Opening Log file [November 4 19:07:42 UTC] 


# Windows CPU Systray Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.23

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Documents and Settings\Robby\Application Data\Folding@home-x86


[19:07:42] - Ask before connecting: No
[19:07:42] - User name: Robby (Team 39299)
[19:07:42] - User ID: *****
[19:07:42] - Machine ID: 1
[19:07:42] 
[19:07:42] Loaded queue successfully.
[19:07:42] Initialization complete
[19:07:42] 
[19:07:42] + Processing work unit
[19:07:42] Core required: FahCore_78.exe
[19:07:42] Core found.
[19:07:42] Working on queue slot 09 [November 4 19:07:42 UTC]
[19:07:42] + Working ...
[19:07:42] 
[19:07:42] *------------------------------*
[19:07:42] Folding@Home Gromacs Core
[19:07:42] Version 1.90 (March 8, 2006)
[19:07:42] 
[19:07:42] Preparing to commence simulation
[19:07:42] - Looking at optimizations...
[19:07:42] - Created dyn
[19:07:42] - Files status OK
[19:07:42] 
[19:07:42] Folding@home Core Shutdown: MISSING_WORK_FILES
[19:07:46] CoreStatus = 74 (116)
[19:07:46] The core could not find the work files specified. Removing from queue
[19:07:46] Deleting current work unit & continuing...
[19:07:50] - Preparing to get new work unit...
[19:07:50] + Attempting to get work packet
[19:07:50] - Connecting to assignment server
[19:07:51] - Successful: assigned to (171.67.108.33).
[19:07:51] + News From Folding@Home: Welcome to Folding@Home
[19:07:51] Loaded queue successfully.
[19:07:54] + Closed connections
[19:07:59] 
[19:07:59] + Processing work unit
[19:07:59] Core required: FahCore_78.exe
[19:07:59] Core found.
[19:07:59] Working on queue slot 00 [November 4 19:07:59 UTC]
[19:07:59] + Working ...
[19:07:59] 
[19:07:59] *------------------------------*
[19:07:59] Folding@Home Gromacs Core
[19:07:59] Version 1.90 (March 8, 2006)
[19:07:59] 
[19:07:59] Preparing to commence simulation
[19:07:59] - Looking at optimizations...
[19:07:59] - Created dyn
[19:07:59] - Files status OK
[19:07:59] - Expanded 375288 -> 1804408 (decompressed 480.8 percent)
[19:07:59] - Starting from initial work packet
[19:07:59] 
[19:07:59] Project: 11268 (Run 2, Clone 40, Gen 17)
[19:07:59] 
[19:07:59] Assembly optimizations on if available.
[19:07:59] Entering M.D.
[19:08:06] Protein: ALZHEIMERS DISEASE AMYLOID
[19:08:06] 
[19:08:06] Writing local files
[19:08:49] Extra SSE boost OK.
[19:08:49] Writing local files
[19:08:49] Completed 0 out of 250000 steps  (0%)
[19:15:12] Writing local files
[19:15:12] Completed 2500 out of 250000 steps  (1%)
[19:21:33] Writing local files
[19:21:33] Completed 5000 out of 250000 steps  (2%)
[19:27:53] Writing local files
[19:27:53] Completed 7500 out of 250000 steps  (3%)


Re: Project: 11268 (Run 2, Clone 40, Gen 17)

Posted: Thu Nov 04, 2010 11:05 pm
by John_Weatherman
Others are reporting the same error - something's wrong with these WUs.

Re: Project: 11268 (Run 2, Clone 40, Gen 17)

Posted: Fri Nov 05, 2010 2:15 am
by Robby_Firefox
Interesting. This one is still crunching, now at 66 percent complete.
Thanks for the heads up on these WUs.

- Robby

Re: Project: 11268 (Run 2, Clone 40, Gen 17)

Posted: Fri Nov 05, 2010 2:23 am
by bruce
Robby_Firefox wrote:
Interesting. This one is still crunching, now at 66 percent complete.
Thanks for the heads up on these WUs.

- Robby

So apparently this has been fixed? I saw a report that it had been, but we've still been getting reports of problems. I'm confused.

Re: Project: 11268 (Run 2, Clone 40, Gen 17)

Posted: Fri Nov 05, 2010 6:45 am
by Fireball0236
bruce, I think it is best to look at the times in the logs. My assumption is that all WUs from the 112xx Projects downloaded in a certain timespan yesterday were 'bad' for some reason. The first user posted it here: viewtopic.php?f=19&t=16581 and it was quickly fixed by yslin. But every WU downloaded between the time the error happened and the time it was fixed was bad.

As long as users keep trying to fold the bad WUs they downloaded earlier, the core will error out. They should remove their queue.dat and work folder and ask for a new WU. Even if they get the same WU they downloaded originally, it will fold without problems, because the files are downloaded again.
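The cleanup described above could be scripted roughly as follows. This is only a sketch under assumptions: the function name `reset_work_unit` and the backup folder name are made up for illustration, it moves files aside instead of deleting them, and it also moves unitinfo.txt (which Robby removed in the first post) in addition to queue.dat and the work folder. It assumes the client has been stopped first so that none of the files are in use.

```python
import shutil
import time
from pathlib import Path


def reset_work_unit(client_dir, backup_root=None):
    """Move work/, queue.dat and unitinfo.txt out of the client directory.

    Hypothetical helper: the files are moved into a timestamped backup
    folder rather than deleted, so they can be restored if needed.
    Returns the list of names that were actually moved.
    """
    client_dir = Path(client_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # Assumed backup location: inside the client directory unless told otherwise.
    backup_dir = Path(backup_root or client_dir) / f"bad-wu-backup-{stamp}"

    moved = []
    for name in ("work", "queue.dat", "unitinfo.txt"):
        src = client_dir / name
        if src.exists():
            backup_dir.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(backup_dir / name))
            moved.append(name)
    return moved
```

For the setup in the first post, the call would be `reset_work_unit(r"C:\Documents and Settings\Robby\Application Data\Folding@home-x86")`, followed by a client restart so it asks for a new WU.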

The logs above bear this out. The bad WU was downloaded at:
[14:37:22] Project: 11268 (Run 2, Clone 40, Gen 17)

The user restarted the client later, and had the same problem:
[18:57:51] Folding@home Core Shutdown: UNKNOWN_ERROR

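But after removing the related files, he was reassigned the same WU, redownloaded the files, and started folding without problems (second log excerpt; note that the compressed work packet is 375288 bytes there, versus 344206 bytes in the failing download).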
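Reading the timestamps by hand gets tedious on longer logs. A minimal sketch that pulls out each core shutdown together with the project line seen before it might look like this. The function name `core_shutdowns` is hypothetical, and the regular expressions are based only on the line formats visible in the logs pasted in this thread, so other client versions may need different patterns.

```python
import re

# Patterns inferred from the log excerpts in this thread (an assumption,
# not an official description of the client's log format).
PROJECT_RE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\] Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)"
)
SHUTDOWN_RE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\] Folding@home Core Shutdown: (\w+)"
)


def core_shutdowns(log_text):
    """Return (time, project, status) tuples, one per core shutdown line.

    `project` is the most recent "Project:" line seen since the log was
    last opened, or None if the core stopped before printing one.
    """
    events = []
    current = None
    for line in log_text.splitlines():
        if line.startswith("--- Opening Log file"):
            current = None  # client restart: forget the previous WU
            continue
        m = PROJECT_RE.match(line)
        if m:
            current = "P%s R%s C%s G%s" % m.groups()[1:]
            continue
        m = SHUTDOWN_RE.match(line)
        if m:
            events.append((m.group(1), current, m.group(2)))
    return events
```

Fed the text of the client log, this returns the shutdown events in order, which makes it easier to line up when a WU was downloaded against when the core errored out. One limitation: after a FINISHED_UNIT shutdown the client prints the project line only later, while sending results, so the project reported for that event is whatever was printed earlier in the same session.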


~ Fireball0236

Re: Project: 11268 (Run 2, Clone 40, Gen 17)

Posted: Fri Nov 05, 2010 7:24 pm
by bruce
Agreed. Topic closed.