:-) G R O M A C S (-:
Groningen Machine for Chemical Simulation
:-) VERSION 4.0.99_development_20090307 (-:
Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2008, The GROMACS development team,
check out http://www.gromacs.org for more information.
:-) mdrun (-:
Reading file work/wudata_09.tpr, VERSION 3.3.99_development_20070618 (single precision)
Note: tpx file_version 48, software version 64
Reading checkpoint file work/wudata_09.cpt generated: Fri Apr 24 08:19:50 2009
-------------------------------------------------------
Program mdrun, VERSION 4.0.99_development_20090307
Source code file: checkpoint.c, line: 1151
Fatal error:
Checkpoint file is for a system of 147225 atoms, while the current system consists of 146898 atoms
For more information and tips for trouble shooting please check the GROMACS Wiki
at
http://wiki.gromacs.org/index.php/Errors
-------------------------------------------------------
Thanx for Using GROMACS - Have a Nice Day
Error on node 0, will try to stop all the nodes
Halting parallel program mdrun on CPU 0 out of 4
gcq#0: Thanx for Using GROMACS - Have a Nice Day
[cli_0]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 0
[0]0:Return code = 255
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 0, signaled with Quit
[11:55:03] CoreStatus = FF (255)
[11:55:03] Sending work to server
[11:55:03] Project: 2671 (Run 22, Clone 85, Gen 17)
[11:55:03] - Error: Could not get length of results file work/wuresults_09.dat
[11:55:03] - Error: Could not read unit 09 file. Removing from queue.
[11:55:03] Trying to send all finished work units
[11:55:03] + No unsent completed units remaining.
[11:55:03] - Preparing to get new work unit...
[11:55:03] + Attempting to get work packet
[11:55:03] - Will indicate memory of 16003 MB
[11:55:03] - Connecting to assignment server
[11:55:03] Connecting to http://assign.stanford.edu:8080/
[11:55:03] Posted data.
[11:55:03] Initial: 43AB; - Successful: assigned to (171.67.108.24).
[11:55:03] + News From Folding@Home: Welcome to Folding@Home
[11:55:03] Loaded queue successfully.
[11:55:03] Connecting to http://171.67.108.24:8080/
[11:55:10] Posted data.
[11:55:10] Initial: 0000; - Receiving payload (expected size: 4845286)
[11:55:31] - Downloaded at ~225 kB/s
[11:55:31] - Averaged speed for that direction ~330 kB/s
[11:55:31] + Received work.
[11:55:31] Trying to send all finished work units
[11:55:31] + No unsent completed units remaining.
[11:55:31] + Closed connections
[11:55:36]
[11:55:36] + Processing work unit
[11:55:36] Core required: FahCore_a2.exe
[11:55:36] Core found.
[11:55:36] Working on queue slot 00 [April 28 11:55:36 UTC]
[11:55:36] + Working ...
[11:55:36] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 00 -checkpoint 15 -verbose -lifeline 29452 -version 624'
[11:55:36]
[11:55:36] *------------------------------*
[11:55:36] Folding@Home Gromacs SMP Core
[11:55:36] Version 2.07 (Sun Apr 19 14:51:09 PDT 2009)
[11:55:36]
[11:55:36] Preparing to commence simulation
[11:55:36] - Ensuring status. Please wait.
[11:55:46] - Looking at optimizations...
[11:55:46] - Working with standard loops on this execution.
[11:55:46] - Files status OK
[11:55:47] - Expanded 4844774 -> 24012685 (decompressed 495.6 percent)
[11:55:47] Called DecompressByteArray: compressed_data_size=4844774 data_size=24012685, decompressed_data_size=24012685 diff=0
[11:55:47] - Digital signature verified
[11:55:47]
[11:55:47] Project: 2671 (Run 22, Clone 98, Gen 17)
[11:55:47]
[11:55:47] Entering M.D.
[11:55:53] Using Gromacs checkpoints
NNODES=4, MYRANK=0, HOSTNAME=computenode
NNODES=4, MYRANK=1, HOSTNAME=computenode
NNODES=4, MYRANK=2, HOSTNAME=computenode
NNODES=4, MYRANK=3, HOSTNAME=computenode
NODEID=0 argc=23
NODEID=1 argc=23
NODEID=3 argc=23
:-) G R O M A C S (-:
NODEID=2 argc=23
Groningen Machine for Chemical Simulation
:-) VERSION 4.0.99_development_20090307 (-:
Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2008, The GROMACS development team,
check out http://www.gromacs.org for more information.
:-) mdrun (-:
Reading file work/wudata_00.tpr, VERSION 3.3.99_development_20070618 (single precision)
Note: tpx file_version 48, software version 64
Reading checkpoint file work/wudata_00.cpt generated: Mon Apr 27 13:07:16 2009
-------------------------------------------------------
Program mdrun, VERSION 4.0.99_development_20090307
Source code file: checkpoint.c, line: 1151
Fatal error:
Checkpoint file is for a system of 147117 atoms, while the current system consists of 146898 atoms
For more information and tips for trouble shooting please check the GROMACS Wiki
at
http://wiki.gromacs.org/index.php/Errors
-------------------------------------------------------
Thanx for Using GROMACS - Have a Nice Day
Error on node 0, will try to stop all the nodes
Halting parallel program mdrun on CPU 0 out of 4
gcq#0: Thanx for Using GROMACS - Have a Nice Day
[cli_0]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 0
[0]0:Return code = 255
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 0, signaled with Quit
[11:56:00] CoreStatus = FF (255)
[11:56:00] Sending work to server
[11:56:00] Project: 2671 (Run 22, Clone 98, Gen 17)
[11:56:00] - Error: Could not get length of results file work/wuresults_00.dat
[11:56:00] - Error: Could not read unit 00 file. Removing from queue.
[11:56:00] Trying to send all finished work units
[11:56:00] + No unsent completed units remaining.
[11:56:00] - Preparing to get new work unit...
[11:56:00] + Attempting to get work packet
[11:56:00] - Will indicate memory of 16003 MB
[11:56:00] - Connecting to assignment server
[11:56:00] Connecting to http://assign.stanford.edu:8080/
[11:56:01] Posted data.
[11:56:01] Initial: 43AB; - Successful: assigned to (171.67.108.24).
[11:56:01] + News From Folding@Home: Welcome to Folding@Home
[11:56:01] Loaded queue successfully.
[11:56:01] Connecting to http://171.67.108.24:8080/
[11:56:01] Posted data.
[11:56:01] Initial: 0000; - Error: Bad packet type from server, expected work assignment
[11:56:02] - Attempt #1 to get work failed, and no other work to do.
Waiting before retry.
[11:56:19] + Attempting to get work packet
[11:56:19] - Will indicate memory of 16003 MB
[11:56:19] - Connecting to assignment server
[11:56:19] Connecting to http://assign.stanford.edu:8080/
[11:56:19] Posted data.
[11:56:19] Initial: 43AB; - Successful: assigned to (171.67.108.24).
[11:56:19] + News From Folding@Home: Welcome to Folding@Home
[11:56:19] Loaded queue successfully.
[11:56:19] Connecting to http://171.67.108.24:8080/
[11:56:26] Posted data.
[11:56:26] Initial: 0000; - Receiving payload (expected size: 4825625)
[11:56:38] - Downloaded at ~392 kB/s
[11:56:38] - Averaged speed for that direction ~343 kB/s
[11:56:38] + Received work.
[11:56:38] Trying to send all finished work units
[11:56:38] + No unsent completed units remaining.
[11:56:38] + Closed connections
[11:56:43]
[11:56:43] + Processing work unit
[11:56:43] Core required: FahCore_a2.exe
[11:56:43] Core found.
[11:56:43] Working on queue slot 01 [April 28 11:56:43 UTC]
[11:56:43] + Working ...
[11:56:43] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 01 -checkpoint 15 -verbose -lifeline 29452 -version 624'
[11:56:43]
[11:56:43] *------------------------------*
[11:56:43] Folding@Home Gromacs SMP Core
[11:56:43] Version 2.07 (Sun Apr 19 14:51:09 PDT 2009)
[11:56:43]
[11:56:43] Preparing to commence simulation
[11:56:43] - Ensuring status. Please wait.
[11:56:53] - Looking at optimizations...
[11:56:53] - Working with standard loops on this execution.
[11:56:53] - Files status OK
[11:56:54] - Expanded 4825113 -> 24057089 (decompressed 498.5 percent)
[11:56:54] Called DecompressByteArray: compressed_data_size=4825113 data_size=24057089, decompressed_data_size=24057089 diff=0
[11:56:54] - Digital signature verified
[11:56:54]
[11:56:54] Project: 2671 (Run 18, Clone 87, Gen 17)
[11:56:54]
[11:56:54] Entering M.D.
[11:57:00] Using Gromacs checkpoints
NNODES=4, MYRANK=1, HOSTNAME=computenode
NNODES=4, MYRANK=2, HOSTNAME=computenode
NNODES=4, MYRANK=3, HOSTNAME=computenode
NNODES=4, MYRANK=0, HOSTNAME=computenode
NODEID=0 argc=23
NODEID=1 argc=23
NODEID=3 argc=23
NODEID=2 argc=23
:-) G R O M A C S (-:
Groningen Machine for Chemical Simulation
:-) VERSION 4.0.99_development_20090307 (-:
Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2008, The GROMACS development team,
check out http://www.gromacs.org for more information.
:-) mdrun (-:
Reading file work/wudata_01.tpr, VERSION 3.3.99_development_20070618 (single precision)
Note: tpx file_version 48, software version 64
Reading checkpoint file work/wudata_01.cpt generated: Mon Apr 20 15:43:31 2009
-------------------------------------------------------
Program mdrun, VERSION 4.0.99_development_20090307
Source code file: checkpoint.c, line: 1151
Fatal error:
Checkpoint file is for a system of 147024 atoms, while the current system consists of 147246 atoms
For more information and tips for trouble shooting please check the GROMACS Wiki
at
http://wiki.gromacs.org/index.php/Errors
-------------------------------------------------------
Thanx for Using GROMACS - Have a Nice Day
Error on node 0, will try to stop all the nodes
Halting parallel program mdrun on CPU 0 out of 4
gcq#0: Thanx for Using GROMACS - Have a Nice Day
[cli_0]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 0
[0]0:Return code = 255
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 0, signaled with Quit
[11:57:07] CoreStatus = FF (255)
[11:57:07] Sending work to server
[11:57:07] Project: 2671 (Run 18, Clone 87, Gen 17)
[11:57:07] - Error: Could not get length of results file work/wuresults_01.dat
[11:57:07] - Error: Could not read unit 01 file. Removing from queue.
[11:57:07] Trying to send all finished work units
[11:57:07] + No unsent completed units remaining.
[11:57:07] - Preparing to get new work unit...
[11:57:07] + Attempting to get work packet
[11:57:07] - Will indicate memory of 16003 MB
[11:57:07] - Connecting to assignment server
[11:57:07] Connecting to http://assign.stanford.edu:8080/
[11:57:07] Posted data.
[11:57:07] Initial: 43AB; - Successful: assigned to (171.67.108.24).
[11:57:07] + News From Folding@Home: Welcome to Folding@Home
[11:57:07] Loaded queue successfully.
[11:57:07] Connecting to http://171.67.108.24:8080/
[11:57:08] Posted data.
[11:57:08] Initial: 0000; - Error: Bad packet type from server, expected work assignment
[11:57:08] - Attempt #1 to get work failed, and no other work to do.
Waiting before retry.
There's a lot more of the same before and after this excerpt.
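For anyone wanting to pull every atom-count mismatch out of a longer log like this one, here is a rough Python sketch. The function name `find_atom_mismatches` is made up for illustration and is not part of the Folding@Home client or GROMACS. It assumes the error text always has the wording shown in the log above, and it removes line breaks first because the terminal wrapped the message mid-word.

```python
import re

# Wording copied from the "Fatal error" message in the log above.
MISMATCH = re.compile(
    r"Checkpoint file is for a system of (\d+) atoms, "
    r"while the current system consists of (\d+) atoms"
)

def find_atom_mismatches(log_text):
    """Return (checkpoint_atoms, current_atoms) pairs found in log_text.

    Hypothetical helper: newlines are removed before matching because an
    80-column terminal can split the message mid-word ("consis" / "ts").
    """
    joined = log_text.replace("\r", "").replace("\n", "")
    return [(int(cpt), int(cur)) for cpt, cur in MISMATCH.findall(joined)]
```

Run against the excerpt above, this would list the three pairs that appear in the fatal errors: 147225 vs 146898, 147117 vs 146898, and 147024 vs 147246.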
I deleted queue.dat and the *.pdb files in the client directory, along with the whole work directory, and that seems to have fixed things and got it moving along again.
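In case it helps someone else, the cleanup I did by hand could be scripted roughly as below. This is only a sketch: `reset_client_state` is a hypothetical name, it assumes the client is stopped first, and it assumes the layout seen in the log (a `work` directory next to `queue.dat`). It defaults to a dry run because it discards any work in progress.

```python
import shutil
from pathlib import Path

def reset_client_state(client_dir, dry_run=True):
    """Remove queue.dat, *.pdb files, and the work directory in client_dir.

    Hypothetical helper, not part of the Folding@Home client. Returns the
    list of paths that were (or, in a dry run, would be) removed.
    """
    root = Path(client_dir)
    targets = []

    queue = root / "queue.dat"
    if queue.exists():
        targets.append(queue)

    # Only top-level *.pdb files, matching what was deleted by hand.
    targets.extend(sorted(root.glob("*.pdb")))

    work = root / "work"
    if work.is_dir():
        targets.append(work)

    if not dry_run:
        for path in targets:
            if path.is_dir():
                shutil.rmtree(path)
            else:
                path.unlink()
    return targets
```

Calling `reset_client_state(".", dry_run=True)` from the client directory only reports what would go; pass `dry_run=False` to actually delete.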
Any ideas as to what was wrong with it in the first place?