Project: 6509 (Run 4, Clone 309, Gen 19)

Moderators: Site Moderators, FAHC Science Team

ejs
Posts: 6
Joined: Thu Nov 04, 2010 3:01 pm

Project: 6509 (Run 4, Clone 309, Gen 19)

Post by ejs »

I'm another Linux user with a segfaulting Project 6509 work unit.

This is the log from the first try:

Code:

# Linux Console Edition #######################################################
###############################################################################

                       Folding@Home Client Version 6.02

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/ejs/folding-2
Executable: ./fah6
Arguments: -verbosity 9 

[07:42:49] - Ask before connecting: No
[07:42:49] - User name: Edward_J_Sheldrake (Team 11688)
[07:42:49] - User ID: 72D574772674AEC9
[07:42:49] - Machine ID: 2
[07:42:49] 
[07:42:49] Loaded queue successfully.
[07:42:49] 
[07:42:49] - Autosending finished units...
[07:42:49] + Processing work unit
[07:42:49] Trying to send all finished work units
[07:42:49] Core required: FahCore_78.exe
[07:42:49] + No unsent completed units remaining.
[07:42:49] - Autosend completed
[07:42:49] Core found.
[07:42:49] Working on Unit 04 [November 7 07:42:49]
[07:42:49] + Working ...
[07:42:49] - Calling './FahCore_78.exe -dir work/ -suffix 04 -checkpoint 15 -verbose -lifeline 1601 -version 602'

[07:42:49] 
[07:42:49] *------------------------------*
[07:42:49] Folding@Home Gromacs Core
[07:42:49] Version 1.90 (March 8, 2006)
[07:42:49] 
[07:42:49] Preparing to commence simulation
[07:42:49] - Looking at optimizations...
[07:42:49] - Files status OK
[07:42:50] - Expanded 996150 -> 5044825 (decompressed 506.4 percent)
[07:42:50] 
[07:42:50] Project: 6509 (Run 4, Clone 309, Gen 19)
[07:42:50] 
[07:42:50] Assembly optimizations on if available.
[07:42:50] Entering M.D.
[07:43:11] (Starting from checkpoint)
[07:43:11] Protein: TR574_5 in water
[07:43:11] 
[07:43:11] Writing local files
[07:43:11] Completed 42500 out of 250000 steps  (17%)
[07:43:11] Extra SSE boost OK.
[07:56:49] Writing local files
[07:56:49] Completed 45000 out of 250000 steps  (18%)
[08:10:15] Writing local files
[08:10:15] Completed 47500 out of 250000 steps  (19%)
[08:24:27] Writing local files
[08:24:27] Completed 50000 out of 250000 steps  (20%)
[08:38:51] Writing local files
[08:38:52] Completed 52500 out of 250000 steps  (21%)
[08:52:59] Writing local files
[08:52:59] Completed 55000 out of 250000 steps  (22%)
[09:07:59] Timered checkpoint triggered.
[09:08:32] Writing local files
[09:08:32] Completed 57500 out of 250000 steps  (23%)
[09:23:32] Timered checkpoint triggered.
[09:23:49] Writing local files
[09:23:49] Completed 60000 out of 250000 steps  (24%)
[09:37:46] Writing local files
[09:37:46] Completed 62500 out of 250000 steps  (25%)
[09:51:45] Writing local files
[09:51:45] Completed 65000 out of 250000 steps  (26%)
[10:06:00] Writing local files
[10:06:00] Completed 67500 out of 250000 steps  (27%)
[10:20:35] Writing local files
[10:20:35] Completed 70000 out of 250000 steps  (28%)
[10:35:35] Timered checkpoint triggered.
[10:36:02] Writing local files
[10:36:02] Completed 72500 out of 250000 steps  (29%)
[10:50:16] Writing local files
[10:50:16] Completed 75000 out of 250000 steps  (30%)
[11:04:18] Writing local files
[11:04:18] Completed 77500 out of 250000 steps  (31%)
[11:18:24] Writing local files
[11:18:24] Completed 80000 out of 250000 steps  (32%)
[11:32:35] Writing local files
[11:32:35] Completed 82500 out of 250000 steps  (33%)
[11:46:47] Writing local files
[11:46:47] Completed 85000 out of 250000 steps  (34%)
[12:01:22] Writing local files
[12:01:22] Completed 87500 out of 250000 steps  (35%)
[12:14:56] Writing local files
[12:14:56] Completed 90000 out of 250000 steps  (36%)
[12:28:28] Writing local files
[12:28:28] Completed 92500 out of 250000 steps  (37%)
[12:42:23] Writing local files
[12:42:24] Completed 95000 out of 250000 steps  (38%)
[12:56:26] Writing local files
[12:56:26] Completed 97500 out of 250000 steps  (39%)
[13:10:49] Writing local files
[13:10:49] Completed 100000 out of 250000 steps  (40%)
[13:24:50] Writing local files
[13:24:50] Completed 102500 out of 250000 steps  (41%)
[13:39:34] Writing local files
[13:39:34] Completed 105000 out of 250000 steps  (42%)
[13:42:49] - Autosending finished units...
[13:42:49] Trying to send all finished work units
[13:42:49] + No unsent completed units remaining.
[13:42:49] - Autosend completed
[13:54:29] Writing local files
[13:54:30] Completed 107500 out of 250000 steps  (43%)
[14:09:03] Writing local files
[14:09:03] Completed 110000 out of 250000 steps  (44%)
[14:23:36] Writing local files
[14:23:36] Completed 112500 out of 250000 steps  (45%)
[14:38:19] Writing local files
[14:38:19] Completed 115000 out of 250000 steps  (46%)
[14:53:19] Timered checkpoint triggered.
[14:53:27] Writing local files
[14:53:27] Completed 117500 out of 250000 steps  (47%)
[15:07:58] Writing local files
[15:07:58] Completed 120000 out of 250000 steps  (48%)
[15:22:03] Writing local files
[15:22:03] Completed 122500 out of 250000 steps  (49%)
[15:35:33] Writing local files
[15:35:33] Completed 125000 out of 250000 steps  (50%)
[15:49:03] Writing local files
[15:49:03] Completed 127500 out of 250000 steps  (51%)
[16:02:33] Writing local files
[16:02:33] Completed 130000 out of 250000 steps  (52%)
[16:16:02] Writing local files
[16:16:02] Completed 132500 out of 250000 steps  (53%)
[16:29:30] Writing local files
[16:29:30] Completed 135000 out of 250000 steps  (54%)
[16:43:39] Writing local files
[16:43:39] Completed 137500 out of 250000 steps  (55%)
[16:57:37] Writing local files
[16:57:37] Completed 140000 out of 250000 steps  (56%)
[17:11:09] Writing local files
[17:11:09] Completed 142500 out of 250000 steps  (57%)
[17:25:14] Writing local files
[17:25:14] Completed 145000 out of 250000 steps  (58%)
[17:39:50] Writing local files
[17:39:50] Completed 147500 out of 250000 steps  (59%)
[17:54:33] Writing local files
[17:54:33] Completed 150000 out of 250000 steps  (60%)
[18:08:55] Writing local files
[18:08:55] Completed 152500 out of 250000 steps  (61%)
[18:22:46] Writing local files
[18:22:46] Completed 155000 out of 250000 steps  (62%)
[18:37:31] Writing local files
[18:37:32] Completed 157500 out of 250000 steps  (63%)
[18:48:57] CoreStatus = 0 (0)
[18:48:57] Client-core communications error: ERROR 0x0
[18:48:57] Deleting current work unit & continuing...
[18:49:15] Trying to send all finished work units
[18:49:15] + No unsent completed units remaining.
[18:49:15] - Preparing to get new work unit...
[18:49:15] + Attempting to get work packet

The next time, it segfaulted after 55%. I've now made a tar file of the directory; this time it got to 57%, after which it segfaults.
ejs
Posts: 6
Joined: Thu Nov 04, 2010 3:01 pm

Re: Project: 6509 (Run 4, Clone 309, Gen 19)

Post by ejs »

I've moved this unit over to my other computer. The copy at 57% that I transferred across segfaulted before completing another 1%. I've started it again from the beginning on this machine, but this computer is much slower, taking at least 33 minutes per 1%, so it will take many days, depending on if and when it crashes.
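
The per-percent rate can be estimated from the timestamps on the client's "Completed ... steps" lines. Below is a minimal Python sketch, not part of the Folding@Home client: the function name `minutes_per_percent` is hypothetical, and the two sample lines are copied from the older machine's log posted in this thread. It assumes the log carries time-of-day stamps only, so a timestamp that goes backwards is treated as a midnight rollover.

```python
import re

# Matches client log lines such as:
# [06:38:54] Completed 135000 out of 250000 steps  (54%)
PROGRESS = re.compile(
    r"\[(\d\d):(\d\d):(\d\d)\] Completed (\d+) out of (\d+) steps"
)

# Two progress lines from the older machine's log (first and last reported).
SAMPLE = [
    "[06:38:54] Completed 135000 out of 250000 steps  (54%)",
    "[09:26:12] Completed 147500 out of 250000 steps  (59%)",
]


def minutes_per_percent(log_lines):
    """Average wall-clock minutes per 1% between first and last progress line."""
    samples = []      # (elapsed seconds, percent done)
    day_offset = 0    # seconds added for each assumed midnight rollover
    previous = None
    for line in log_lines:
        m = PROGRESS.search(line)
        if m is None:
            continue  # not a progress line
        hours, minutes, seconds = (int(m.group(i)) for i in (1, 2, 3))
        stamp = hours * 3600 + minutes * 60 + seconds
        if previous is not None and stamp < previous:
            day_offset += 86400  # assumption: the clock wrapped past midnight
        previous = stamp
        percent = 100.0 * int(m.group(4)) / int(m.group(5))
        samples.append((stamp + day_offset, percent))
    if len(samples) < 2:
        raise ValueError("need at least two progress lines")
    (t0, p0), (t1, p1) = samples[0], samples[-1]
    if p1 <= p0:
        raise ValueError("no progress between first and last sample")
    return (t1 - t0) / 60.0 / (p1 - p0)


if __name__ == "__main__":
    print(f"{minutes_per_percent(SAMPLE):.1f} minutes per 1%")
```

On those two sample lines the estimate is roughly 33.5 minutes per 1%, consistent with the figure above.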
ejs
Posts: 6
Joined: Thu Nov 04, 2010 3:01 pm

Re: Project: 6509 (Run 4, Clone 309, Gen 19)

Post by ejs »

Starting from 0% on my older computer (a 2.0 GHz Pentium 4, 32-bit only), the core segfaulted after 59%:

Code:

--- Opening Log file [November 12 06:38:32] 


# Linux Console Edition #######################################################
###############################################################################

                       Folding@Home Client Version 6.02

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/ejs/folding-2
Executable: ./fah6
Arguments: -verbosity 9 -oneunit 

[06:38:32] - Ask before connecting: No
[06:38:32] - User name: Edward_J_Sheldrake (Team 11688)
[06:38:32] - User ID: 72D574772674AEC9
[06:38:32] - Machine ID: 2
[06:38:32] 
[06:38:32] Loaded queue successfully.
[06:38:32] 
[06:38:32] + Processing work unit
[06:38:32] Core required: FahCore_78.exe
[06:38:32] Core found.
[06:38:32] - Autosending finished units...
[06:38:32] Trying to send all finished work units
[06:38:32] + No unsent completed units remaining.
[06:38:32] - Autosend completed
[06:38:32] Working on Unit 07 [November 12 06:38:32]
[06:38:32] + Working ...
[06:38:32] - Calling './FahCore_78.exe -dir work/ -suffix 07 -checkpoint 15 -verbose -lifeline 1295 -version 602'

[06:38:32] 
[06:38:32] *------------------------------*
[06:38:32] Folding@Home Gromacs Core
[06:38:32] Version 1.90 (March 8, 2006)
[06:38:32] 
[06:38:32] Preparing to commence simulation
[06:38:32] - Looking at optimizations...
[06:38:32] - Files status OK
[06:38:33] - Expanded 996150 -> 5044825 (decompressed 506.4 percent)
[06:38:33] 
[06:38:33] Project: 6509 (Run 4, Clone 309, Gen 19)
[06:38:33] 
[06:38:33] Assembly optimizations on if available.
[06:38:33] Entering M.D.
[06:38:54] (Starting from checkpoint)
[06:38:54] Protein: TR574_5 in water
[06:38:54] 
[06:38:54] Writing local files
[06:38:54] Completed 135000 out of 250000 steps  (54%)
[06:38:55] Extra SSE boost OK.
[06:53:55] Timered checkpoint triggered.
[07:08:55] Timered checkpoint triggered.
[07:10:45] Writing local files
[07:10:45] Completed 137500 out of 250000 steps  (55%)
[07:25:46] Timered checkpoint triggered.
[07:40:45] Timered checkpoint triggered.
[07:43:10] Writing local files
[07:43:10] Completed 140000 out of 250000 steps  (56%)
[07:58:10] Timered checkpoint triggered.
[08:13:11] Timered checkpoint triggered.
[08:19:02] Writing local files
[08:19:02] Completed 142500 out of 250000 steps  (57%)
[08:34:02] Timered checkpoint triggered.
[08:49:03] Timered checkpoint triggered.
[08:51:46] Writing local files
[08:51:46] Completed 145000 out of 250000 steps  (58%)
[09:06:50] Timered checkpoint triggered.
[09:21:51] Timered checkpoint triggered.
[09:26:12] Writing local files
[09:26:12] Completed 147500 out of 250000 steps  (59%)
[09:41:13] Timered checkpoint triggered.
[09:51:59] CoreStatus = 0 (0)
[09:51:59] Client-core communications error: ERROR 0x0
[09:51:59] Deleting current work unit & continuing...

The segfault is not detected by the Folding@Home software, but it appears in the kernel log:

FahCore_78.exe[1301]: segfault at 4ff1e054 ip 08087a2f sp bf7fe3dc error 4 in FahCore_78.exe[8048000+322000]
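
The fields of that kernel line can be decoded mechanically. The following Python sketch is an illustration only: `decode_segfault` and its regular expression are hypothetical helpers written for this one line format, and the reading of the error value assumes the usual x86 page-fault error code (bit 0: protection violation rather than page not present, bit 1: write rather than read, bit 2: user rather than kernel mode).

```python
import re

# Kernel log line exactly as quoted in the post.
LINE = ("FahCore_78.exe[1301]: segfault at 4ff1e054 ip 08087a2f "
        "sp bf7fe3dc error 4 in FahCore_78.exe[8048000+322000]")

# All numeric fields except the PID are hexadecimal.
PATTERN = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_segfault(line):
    """Split a kernel segfault line into named fields (hypothetical helper)."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    err = int(m["err"], 16)
    info = {
        "process": m["comm"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),
        "ip": int(m["ip"], 16),
        # Assumed x86 page-fault error-code bits, see the lead-in.
        "cause": "protection violation" if err & 1 else "page not present",
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }
    if m["base"] is not None:
        # Offset of the faulting instruction inside the mapped object.
        info["offset_in_object"] = info["ip"] - int(m["base"], 16)
    return info


if __name__ == "__main__":
    for key, value in decode_segfault(LINE).items():
        shown = hex(value) if isinstance(value, int) and key != "pid" else value
        print(f"{key}: {shown}")
```

For the line above, this reports a user-mode read of an unmapped address (0x4ff1e054), with the faulting instruction at offset 0x3fa2f into the FahCore_78.exe mapping. That offset is what one would look up in a disassembly or alongside the core dump.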

This machine is so much older that it was then assigned a ProtoMol unit instead. I've still got a copy of the work directory at 57% and a core dump file if they are wanted.
sortofageek
Site Admin
Posts: 3110
Joined: Fri Nov 30, 2007 8:06 pm
Location: Team Helix

Re: Project: 6509 (Run 4, Clone 309, Gen 19)

Post by sortofageek »

No data back yet on Project: 6509 (Run 4, Clone 309, Gen 19). Thanks for your report. We'll keep an eye on the database.
sortofageek
Site Admin
Posts: 3110
Joined: Fri Nov 30, 2007 8:06 pm
Location: Team Helix

Re: Project: 6509 (Run 4, Clone 309, Gen 19)

Post by sortofageek »

Time has passed, and there has been one other report, an EUE. The WU (P6509,R4,C309,G19) has been reported as a bad WU.