Project: 2682 (Run 5, Clone 7, Gen 7)

Moderators: Site Moderators, FAHC Science Team

bollix47
Posts: 2976
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Post by bollix47 »


[22:35:16] Completed 195000 out of 250000 steps  (78%)
[23:16:15] Completed 197500 out of 250000 steps  (79%)

t = 7791.756 ps: Water molecule starting at atom 831009 can not be settled.
Check for bad contacts and/or reduce the timestep.
[23:23:23] 
[23:23:23] Folding@home Core Shutdown: INTERRUPTED
application called MPI_Abort(MPI_COMM_WORLD, 102) - process 0
[0]0:Return code = 102
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 0, signaled with Quit
[0]4:Return code = 0, signaled with Quit
[0]5:Return code = 0, signaled with Segmentation fault
[0]6:Return code = 0, signaled with Segmentation fault
[0]7:Return code = 0, signaled with Quit
[23:23:31] CoreStatus = 66 (102)
[23:23:31] + Shutdown requested by user. Exiting.***** Got a SIGTERM signal (15)
[23:23:31] Killing all core threads
I did nothing to stop the client; this happened automatically.
I restarted the WU and it carried on until 86%, where it died with CoreStatus 0 and deleted the WU.


[06:35:18] Completed 215000 out of 250000 steps  (86%)
[06:55:00] CoreStatus = 0 (0)
[06:55:00] Sending work to server
[06:55:00] Project: 2682 (Run 5, Clone 7, Gen 7)
[06:55:00] - Error: Could not get length of results file work/wuresults_04.dat
[06:55:00] - Error: Could not read unit 04 file. Removing from queue.
[06:55:00] Trying to send all finished work units
[06:55:00] + No unsent completed units remaining.
[06:55:00] - Preparing to get new work unit...
[06:55:00] Cleaning up work directory
The client then tried to run the same WU again, but I deleted it. That was almost 3 days of work with no results, and I didn't want to chance the same thing happening again.
Wrish
Posts: 74
Joined: Thu Jan 28, 2010 5:09 am

Re: Project: 2682 (Run 5, Clone 7, Gen 7)

Post by Wrish »

Segmentation fault is almost guaranteed to be a memory error; time to do memtest86. If this is one of those i7's, know that the uncore voltage also powers portions of RAM, aside from vDIMM. One or both is too low, or memory timings are too aggressive.
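To see at a glance which MPI ranks ended with a segmentation fault, the per-rank lines from the first log can be tallied. This is only a rough sketch in Python: the function names `summarize_ranks` and `signal_counts` are made up for illustration, the line format is assumed from the log excerpt in this thread, and it is not part of the FAH client.

```python
import re
from collections import Counter

# Assumed format, taken from the log excerpt in this thread, e.g.
# "[0]5:Return code = 0, signaled with Segmentation fault"
RANK_LINE = re.compile(
    r"^\[\d+\](?P<rank>\d+):Return code = (?P<code>-?\d+)"
    r"(?:, signaled with (?P<signal>.+))?$"
)

def summarize_ranks(log_text):
    """Map each MPI rank to (return_code, signal_name_or_None)."""
    ranks = {}
    for line in log_text.splitlines():
        match = RANK_LINE.match(line.strip())
        if match:
            ranks[int(match.group("rank"))] = (
                int(match.group("code")),
                match.group("signal"),
            )
    return ranks

def signal_counts(ranks):
    """Count how many ranks ended with each signal ("none" if no signal)."""
    return Counter(signal or "none" for _, signal in ranks.values())

# Lines copied from the log posted above.
SAMPLE = """\
application called MPI_Abort(MPI_COMM_WORLD, 102) - process 0
[0]0:Return code = 102
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 0, signaled with Quit
[0]4:Return code = 0, signaled with Quit
[0]5:Return code = 0, signaled with Segmentation fault
[0]6:Return code = 0, signaled with Segmentation fault
[0]7:Return code = 0, signaled with Quit
"""

if __name__ == "__main__":
    ranks = summarize_ranks(SAMPLE)
    for rank, (code, signal) in sorted(ranks.items()):
        print(rank, code, signal)
    print(dict(signal_counts(ranks)))
```

In this log, only ranks 5 and 6 report a segmentation fault; rank 0 exits with code 102 after calling MPI_Abort and the remaining ranks are stopped with Quit.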
toTOW
Site Moderator
Posts: 6435
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Project: 2682 (Run 5, Clone 7, Gen 7)

Post by toTOW »

I think it's a bit strange to see such a low Gen for a project that has been around for a while. In my opinion, the odds are high that it's a bad WU.

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
bollix47
Posts: 2976
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: Project: 2682 (Run 5, Clone 7, Gen 7)

Post by bollix47 »

Wrish wrote: Segmentation fault is almost guaranteed to be a memory error; time to do memtest86. If this is one of those i7's, know that the uncore voltage also powers portions of RAM, aside from vDIMM. One or both is too low, or memory timings are too aggressive.
This was indeed an i7@3.2. I didn't touch the memory timings or voltages; I only increased the bus speed, mildly. I have never had any stability problems, but we all know that FAH is the ultimate test. I will run memtest at some point. Having just lost the science and ~50,000 points, I'm not in a hurry to shut it down again any time soon, but I do realize the importance of running it.

It has been running bigadv WUs since they began and has completed 279 WUs (probably close to 100 of them bigadv) according to the local count.

Hopefully, toTOW, you are correct and there's nothing wrong with this computer's setup.