Project: 2671 (Run 11, Clone 32, Gen 9) seg fault

Moderators: Site Moderators, FAHC Science Team

alpha754293
Posts: 383
Joined: Sun Jan 18, 2009 1:13 am

Post by alpha754293 »

console:

[08:04:10] - Ask before connecting: No
[08:04:10] - User name: alpha754293 (Team 596)
[08:04:10] - User ID: 47FBD1D4056DB49E
[08:04:10] - Machine ID: 1
[08:04:10]
[08:04:10] Loaded queue successfully.
[08:04:10]
[08:04:10] - Autosending finished units... [April 17 08:04:10 UTC]
[08:04:10] + Processing work unit
[08:04:10] Trying to send all finished work units
[08:04:10] Core required: FahCore_a2.exe
[08:04:10] + No unsent completed units remaining.
[08:04:10] Core found.
[08:04:10] - Autosend completed
[08:04:10] Working on queue slot 04 [April 17 08:04:10 UTC]
[08:04:10] + Working ...
[08:04:10] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 04 -checkpoint 15 -verbose -lifeline 13275 -version 624'

[08:04:10]
[08:04:10] *------------------------------*
[08:04:10] Folding@Home Gromacs SMP Core
[08:04:10] Version 2.06 (Tue Mar 31 08:29:45 PDT 2009)
[08:04:10]
[08:04:10] Preparing to commence simulation
[08:04:10] - Ensuring status. Please wait.
[08:04:10] Files status OK
[08:04:11] - Expanded 4836370 -> 24038369 (decompressed 497.0 percent)
[08:04:11] Called DecompressByteArray: compressed_data_size=4836370 data_size=24038369, decompressed_data_size=24038369 diff=0
[08:04:11] - Digital signature verified
[08:04:11]
[08:04:11] Project: 2671 (Run 11, Clone 32, Gen 9)
[08:04:11]
[08:04:12] Assembly optimizations on if available.
[08:04:12] Entering M.D.
[08:04:18] Using Gromacs checkpoints
[08:04:21]
[08:04:21] Entering M.D.
[08:04:27] Using Gromacs checkpoints
NNODES=4, MYRANK=1, HOSTNAME=computenode
NNODES=4, MYRANK=2, HOSTNAME=computenode
NNODES=4, MYRANK=0, HOSTNAME=computenode
NNODES=4, MYRANK=3, HOSTNAME=computenode
NODEID=0 argc=23
NODEID=1 argc=23
NODEID=2 argc=23
NODEID=3 argc=23
                         :-)  G  R  O  M  A  C  S  (-:

                   Groningen Machine for Chemical Simulation

                 :-)  VERSION 4.0.99_development_20090307  (-:


      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2008, The GROMACS development team,
            check out http://www.gromacs.org for more information.


                                :-)  mdrun  (-:

Reading file work/wudata_04.tpr, VERSION 3.3.99_development_20070618 (single precision)
Note: tpx file_version 48, software version 64

Reading checkpoint file work/wudata_04.cpt generated: Thu Apr 16 18:26:39 2009


NOTE: The tpr file used for this simulation is in an old format, for less memory usage and possibly more performance create a new tpr file with an up to date version of grompp

Making 1D domain decomposition 1 x 1 x 4
starting mdrun '22884 system in water'
2500000 steps,   5000.0 ps (continuing from step 2347510,   4695.0 ps).
[08:04:31] data_04.log
[08:04:31] Verified work/wudata_04.trr
[08:04:31] Verified work/wudata_04.xtc
[08:04:31] Verified work/wudata_04.edr
[08:04:31] Completed 97510 out of 250000 steps  (39%)
[08:18:04] Completed 100000 out of 250000 steps  (40%)
[08:31:55] Completed 102500 out of 250000 steps  (41%)
[08:45:45] Completed 105000 out of 250000 steps  (42%)
[08:59:35] Completed 107500 out of 250000 steps  (43%)
[09:13:26] Completed 110000 out of 250000 steps  (44%)
[09:27:15] Completed 112500 out of 250000 steps  (45%)
[09:41:08] Completed 115000 out of 250000 steps  (46%)
[09:55:02] Completed 117500 out of 250000 steps  (47%)
[10:08:52] Completed 120000 out of 250000 steps  (48%)
[10:22:42] Completed 122500 out of 250000 steps  (49%)
[10:36:30] Completed 125000 out of 250000 steps  (50%)
[10:50:21] Completed 127500 out of 250000 steps  (51%)
[11:04:10] Completed 130000 out of 250000 steps  (52%)
[11:18:03] Completed 132500 out of 250000 steps  (53%)
[11:31:57] Completed 135000 out of 250000 steps  (54%)
[11:45:42] Completed 137500 out of 250000 steps  (55%)
[11:59:30] Completed 140000 out of 250000 steps  (56%)
[12:13:26] Completed 142500 out of 250000 steps  (57%)
[12:27:19] Completed 145000 out of 250000 steps  (58%)
[12:41:09] Completed 147500 out of 250000 steps  (59%)
[12:54:58] Completed 150000 out of 250000 steps  (60%)
[13:08:48] Completed 152500 out of 250000 steps  (61%)
[13:22:39] Completed 155000 out of 250000 steps  (62%)
[13:36:33] Completed 157500 out of 250000 steps  (63%)
[13:50:27] Completed 160000 out of 250000 steps  (64%)
[14:04:10] - Autosending finished units... [April 17 14:04:10 UTC]
[14:04:10] Trying to send all finished work units
[14:04:10] + No unsent completed units remaining.
[14:04:10] - Autosend completed
[14:04:18] Completed 162500 out of 250000 steps  (65%)
[14:18:10] Completed 165000 out of 250000 steps  (66%)
[14:32:04] Completed 167500 out of 250000 steps  (67%)
[14:45:55] Completed 170000 out of 250000 steps  (68%)
[14:59:43] Completed 172500 out of 250000 steps  (69%)
[15:13:34] Completed 175000 out of 250000 steps  (70%)
[15:27:24] Completed 177500 out of 250000 steps  (71%)
[15:41:14] Completed 180000 out of 250000 steps  (72%)
[15:55:04] Completed 182500 out of 250000 steps  (73%)
[16:08:52] Completed 185000 out of 250000 steps  (74%)
[16:22:41] Completed 187500 out of 250000 steps  (75%)
[16:36:30] Completed 190000 out of 250000 steps  (76%)
[16:50:24] Completed 192500 out of 250000 steps  (77%)
[17:01:35] Completed 195000 out of 250000 steps  (78%)
[17:10:25] Completed 197500 out of 250000 steps  (79%)
[17:19:15] Completed 200000 out of 250000 steps  (80%)
[17:28:04] Completed 202500 out of 250000 steps  (81%)
[17:36:54] Completed 205000 out of 250000 steps  (82%)
[17:45:44] Completed 207500 out of 250000 steps  (83%)
[17:54:33] Completed 210000 out of 250000 steps  (84%)
[18:03:22] Completed 212500 out of 250000 steps  (85%)
[18:07:33]
[18:07:33] Folding@home Core Shutdown: INTERRUPTED
[cli_2]: aborting job:
Fatal error in MPI_Sendrecv: Error message texts are not available
[cli_0]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 102) - process 0
[0]0:Return code = 102
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 1
[0]3:Return code = 0, signaled with Segmentation fault
[18:07:37] CoreStatus = 66 (102)
[18:07:37] + Shutdown requested by user. Exiting.***** Got a SIGTERM signal (15)
[18:07:37] Killing all core threads

Folding@Home Client Shutdown.
FAHlog:

[08:04:10] - Ask before connecting: No
[08:04:10] - User name: alpha754293 (Team 596)
[08:04:10] - User ID: 47FBD1D4056DB49E
[08:04:10] - Machine ID: 1
[08:04:10] 
[08:04:10] Loaded queue successfully.
[08:04:10] 
[08:04:10] - Autosending finished units... [April 17 08:04:10 UTC]
[08:04:10] + Processing work unit
[08:04:10] Trying to send all finished work units
[08:04:10] Core required: FahCore_a2.exe
[08:04:10] + No unsent completed units remaining.
[08:04:10] Core found.
[08:04:10] - Autosend completed
[08:04:10] Working on queue slot 04 [April 17 08:04:10 UTC]
[08:04:10] + Working ...
[08:04:10] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 04 -checkpoint 15 -verbose -lifeline 13275 -version 624'

[08:04:10] 
[08:04:10] *------------------------------*
[08:04:10] Folding@Home Gromacs SMP Core
[08:04:10] Version 2.06 (Tue Mar 31 08:29:45 PDT 2009)
[08:04:10] 
[08:04:10] Preparing to commence simulation
[08:04:10] - Ensuring status. Please wait.
[08:04:10] Files status OK
[08:04:11] - Expanded 4836370 -> 24038369 (decompressed 497.0 percent)
[08:04:11] Called DecompressByteArray: compressed_data_size=4836370 data_size=24038369, decompressed_data_size=24038369 diff=0
[08:04:11] - Digital signature verified
[08:04:11] 
[08:04:11] Project: 2671 (Run 11, Clone 32, Gen 9)
[08:04:11] 
[08:04:12] Assembly optimizations on if available.
[08:04:12] Entering M.D.
[08:04:18] Using Gromacs checkpoints
[08:04:21] 
[08:04:21] Entering M.D.
[08:04:27] Using Gromacs checkpoints
[08:04:31] data_04.log
[08:04:31] Verified work/wudata_04.trr
[08:04:31] Verified work/wudata_04.xtc
[08:04:31] Verified work/wudata_04.edr
[08:04:31] Completed 97510 out of 250000 steps  (39%)
[08:18:04] Completed 100000 out of 250000 steps  (40%)
[08:31:55] Completed 102500 out of 250000 steps  (41%)
[08:45:45] Completed 105000 out of 250000 steps  (42%)
[08:59:35] Completed 107500 out of 250000 steps  (43%)
[09:13:26] Completed 110000 out of 250000 steps  (44%)
[09:27:15] Completed 112500 out of 250000 steps  (45%)
[09:41:08] Completed 115000 out of 250000 steps  (46%)
[09:55:02] Completed 117500 out of 250000 steps  (47%)
[10:08:52] Completed 120000 out of 250000 steps  (48%)
[10:22:42] Completed 122500 out of 250000 steps  (49%)
[10:36:30] Completed 125000 out of 250000 steps  (50%)
[10:50:21] Completed 127500 out of 250000 steps  (51%)
[11:04:10] Completed 130000 out of 250000 steps  (52%)
[11:18:03] Completed 132500 out of 250000 steps  (53%)
[11:31:57] Completed 135000 out of 250000 steps  (54%)
[11:45:42] Completed 137500 out of 250000 steps  (55%)
[11:59:30] Completed 140000 out of 250000 steps  (56%)
[12:13:26] Completed 142500 out of 250000 steps  (57%)
[12:27:19] Completed 145000 out of 250000 steps  (58%)
[12:41:09] Completed 147500 out of 250000 steps  (59%)
[12:54:58] Completed 150000 out of 250000 steps  (60%)
[13:08:48] Completed 152500 out of 250000 steps  (61%)
[13:22:39] Completed 155000 out of 250000 steps  (62%)
[13:36:33] Completed 157500 out of 250000 steps  (63%)
[13:50:27] Completed 160000 out of 250000 steps  (64%)
[14:04:10] - Autosending finished units... [April 17 14:04:10 UTC]
[14:04:10] Trying to send all finished work units
[14:04:10] + No unsent completed units remaining.
[14:04:10] - Autosend completed
[14:04:18] Completed 162500 out of 250000 steps  (65%)
[14:18:10] Completed 165000 out of 250000 steps  (66%)
[14:32:04] Completed 167500 out of 250000 steps  (67%)
[14:45:55] Completed 170000 out of 250000 steps  (68%)
[14:59:43] Completed 172500 out of 250000 steps  (69%)
[15:13:34] Completed 175000 out of 250000 steps  (70%)
[15:27:24] Completed 177500 out of 250000 steps  (71%)
[15:41:14] Completed 180000 out of 250000 steps  (72%)
[15:55:04] Completed 182500 out of 250000 steps  (73%)
[16:08:52] Completed 185000 out of 250000 steps  (74%)
[16:22:41] Completed 187500 out of 250000 steps  (75%)
[16:36:30] Completed 190000 out of 250000 steps  (76%)
[16:50:24] Completed 192500 out of 250000 steps  (77%)
[17:01:35] Completed 195000 out of 250000 steps  (78%)
[17:10:25] Completed 197500 out of 250000 steps  (79%)
[17:19:15] Completed 200000 out of 250000 steps  (80%)
[17:28:04] Completed 202500 out of 250000 steps  (81%)
[17:36:54] Completed 205000 out of 250000 steps  (82%)
[17:45:44] Completed 207500 out of 250000 steps  (83%)
[17:54:33] Completed 210000 out of 250000 steps  (84%)
[18:03:22] Completed 212500 out of 250000 steps  (85%)
[18:07:33] 
[18:07:33] Folding@home Core Shutdown: INTERRUPTED
[18:07:37] CoreStatus = 66 (102)
[18:07:37] + Shutdown requested by user. Exiting.***** Got a SIGTERM signal (15)
[18:07:37] Killing all core threads

Folding@Home Client Shutdown.
Restarting the client for the 3rd time...
susato
Site Moderator
Posts: 511
Joined: Fri Nov 30, 2007 4:57 am
Location: Team MacOSX
Contact:

Re: Project: 2671 (Run 11, Clone 32, Gen 9) seg fault

Post by susato »

Hi Alpha,

That unit has already been completed successfully at 2009-04-18 04:13:05 by another user.
Thanks for reporting it here.
alpha754293
Posts: 383
Joined: Sun Jan 18, 2009 1:13 am

Re: Project: 2671 (Run 11, Clone 32, Gen 9) seg fault

Post by alpha754293 »

susato wrote: Hi Alpha,

That unit has already been completed successfully at 2009-04-18 04:13:05 by another user.
Thanks for reporting it here.
P2671 seems to seg fault on my system far more often than any other project so far. I've had at least two P2671 WUs, and each of them seg faulted twice.
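A rough way to check that impression against the logs is to tally the CoreStatus codes per project. The sketch below is only an illustration: it assumes the FAHlog line formats shown earlier in this thread ("Project: N (Run ..., Clone ..., Gen ...)" followed later by "CoreStatus = XX (NNN)"), the function name tally_core_status is made up for this example, and it deliberately does not interpret what each code means.

```python
import re
from collections import Counter

# Line formats assumed from the FAHlog excerpt in this thread.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
STATUS_RE = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")


def tally_core_status(log_text):
    """Count (project, CoreStatus code) pairs found in a FAHlog-style text.

    Hypothetical helper: each CoreStatus line is attributed to the most
    recent "Project:" line that precedes it in the log.
    """
    counts = Counter()
    current_project = None
    for line in log_text.splitlines():
        project_match = PROJECT_RE.search(line)
        if project_match:
            current_project = int(project_match.group(1))
            continue
        status_match = STATUS_RE.search(line)
        if status_match and current_project is not None:
            counts[(current_project, status_match.group(1).upper())] += 1
    return counts


# Two lines copied from the log posted above.
sample = (
    "[08:04:11] Project: 2671 (Run 11, Clone 32, Gen 9)\n"
    "[18:07:37] CoreStatus = 66 (102)\n"
)

for (project, code), n in sorted(tally_core_status(sample).items()):
    print(f"Project {project}: CoreStatus {code} seen {n} time(s)")
```

Running this over a full FAHlog.txt (read with open(...).read()) would show whether the same code really turns up more often for 2671 than for other projects on the same machine.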