And I started over after a reboot. This is what I got:
Code:
[jim@localhost ~]$ cd fold2
[jim@localhost fold2]$ ./fah6 -smp 4 -verbosity 9
Note: Please read the license agreement (fah6 -license). Further
use of this software requires that you have read and accepted this agreement.
4 cores detected
--- Opening Log file [September 6 16:19:09 UTC]
# Linux SMP Console Edition ###################################################
###############################################################################
Folding@Home Client Version 6.24beta
http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: /home/jim/fold2
Executable: ./fah6
Arguments: -smp 4 -verbosity 9
[16:19:09] - Ask before connecting: No
[16:19:09] - User name: dschief (Team 13761)
[16:19:09] - User ID: ###############
[16:19:09] - Machine ID: 2
[16:19:09]
[16:19:09] Work directory not found. Creating...
[16:19:09] Could not open work queue, generating new queue...
[16:19:09] - Preparing to get new work unit...
[16:19:09] - Autosending finished units... [September 6 16:19:09 UTC]
[16:19:09] Trying to send all finished work units
[16:19:09] + No unsent completed units remaining.
[16:19:09] - Autosend completed
[16:19:09] + Attempting to get work packet
[16:19:09] - Will indicate memory of 3924 MB
[16:19:09] - Connecting to assignment server
[16:19:09] Connecting to http://assign.stanford.edu:8080/
[16:19:09] Posted data.
[16:19:09] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[16:19:09] + News From Folding@Home: Welcome to Folding@Home
[16:19:09] Loaded queue successfully.
[16:19:09] Connecting to http://171.64.65.56:8080/
[16:19:18] Posted data.
[16:19:18] Initial: 0000; - Receiving payload (expected size: 1515973)
[16:19:27] - Downloaded at ~164 kB/s
[16:19:27] - Averaged speed for that direction ~164 kB/s
[16:19:27] + Received work.
[16:19:27] + Closed connections
[16:19:27]
[16:19:27] + Processing work unit
[16:19:27] Core required: FahCore_a2.exe
[16:19:27] Core found.
[16:19:27] Working on queue slot 01 [September 6 16:19:27 UTC]
[16:19:27] + Working ...
[16:19:27] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 01 -checkpoint 15 -verbose -lifeline 2494 -version 624'
[16:19:28]
[16:19:28] *------------------------------*
[16:19:28] Folding@Home Gromacs SMP Core
[16:19:28] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[16:19:28]
[16:19:28] Preparing to commence simulation
[16:19:28] - Ensuring status. Please wait.
[16:19:28] Files status OK
[16:19:28] - Expanded 1515461 -> 24004801 (decompressed 1583.9 percent)
[16:19:28] Called DecompressByteArray: compressed_data_size=1515461 data_size=24004801, decompressed_data_size=24004801 diff=0
[16:19:28] - Digital signature verified
[16:19:28]
[16:19:28] Project: 2675 (Run 2, Clone 93, Gen 141)
[16:19:28]
[16:19:28] Assembly optimizations on if available.
[16:19:28] Entering M.D.
[16:19:38] Project: 2675 (Run 2, Clone 93, Gen 141)
[16:19:38]
[16:19:38] Entering M.D.
NNODES=4, MYRANK=0, HOSTNAME=localhost.localdomain
NNODES=4, MYRANK=1, HOSTNAME=localhost.localdomain
NNODES=4, MYRANK=2, HOSTNAME=localhost.localdomain
NNODES=4, MYRANK=3, HOSTNAME=localhost.localdomain
NODEID=0 argc=20
NODEID=1 argc=20
NODEID=2 argc=20
NODEID=3 argc=20
Reading file work/wudata_01.tpr, VERSION 3.3.99_development_20070618 (single precision)
Note: tpx file_version 48, software version 68
NOTE: The tpr file used for this simulation is in an old format, for less memory usage and possibly more performance create a new tpr file with an up to date version of grompp
Making 1D domain decomposition 1 x 1 x 4
starting mdrun '22878 system in water'
35500004 steps, 71000.0 ps (continuing from step 35250004, 70500.0 ps).
[16:20:00] Completed 0 out of 250000 steps (0%)
[16:20:00]
[16:20:00] Folding@home Core Shutdown: INTERRUPTED
-------------------------------------------------------
Program mdrun, VERSION 4.0.99_development_20090605
Source code file: md.c, line: 2169
Fatal error:
NaN detected at step 35250004
For more information and tips for trouble shooting please check the GROMACS Wiki at
http://wiki.gromacs.org/index.php/Errors
-------------------------------------------------------
Thanx for Using GROMACS - Have a Nice Day
Error on node 1, will try to stop all the nodes
Halting parallel program mdrun on CPU 1 out of 4
-------------------------------------------------------
Program mdrun, VERSION 4.0.99_development_20090605
Source code file: md.c, line: 2169
Fatal error:
NaN detected at step 35250004
For more information and tips for trouble shooting please check the GROMACS Wiki at
http://wiki.gromacs.org/index.php/Errors
-------------------------------------------------------
Thanx for Using GROMACS - Have a Nice Day
Error on node 3, will try to stop all the nodes
Halting parallel program mdrun on CPU 3 out of 4
gcq#0: Thanx for Using GROMACS - Have a Nice Day
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 3
application called MPI_Abort(MPI_COMM_WORLD, 102) - process 0
gcq#0: Thanx for Using GROMACS - Have a Nice Day
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 1
-------------------------------------------------------
Program mdrun, VERSION 4.0.99_development_20090605
Source code file: md.c, line: 2169
Fatal error:
NaN detected at step 35250004
For more information and tips for trouble shooting please check the GROMACS Wiki at
http://wiki.gromacs.org/index.php/Errors
-------------------------------------------------------
Thanx for Using GROMACS - Have a Nice Day
Error on node 2, will try to stop all the nodes
Halting parallel program mdrun on CPU 2 out of 4
gcq#0: Thanx for Using GROMACS - Have a Nice Day
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 2
[16:20:04] CoreStatus = FF (255)
[16:20:04] Sending work to server
[16:20:04] Project: 2675 (Run 2, Clone 93, Gen 141)
[16:20:04] - Error: Could not get length of results file work/wuresults_01.dat
[16:20:04] - Error: Could not read unit 01 file. Removing from queue.
[16:20:04] Trying to send all finished work units
[16:20:04] + No unsent completed units remaining.
[16:20:04] - Preparing to get new work unit...
[16:20:04] + Attempting to get work packet
[16:20:04] - Will indicate memory of 3924 MB
[16:20:04] - Connecting to assignment server
[16:20:04] Connecting to http://assign.stanford.edu:8080/
[16:20:04] Posted data.
[16:20:04] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[16:20:04] + News From Folding@Home: Welcome to Folding@Home
[16:20:04] Loaded queue successfully.
[16:20:04] Connecting to http://171.64.65.56:8080/
[16:20:11] Posted data.
[16:20:11] Initial: 0000; - Receiving payload (expected size: 1515973)
[16:20:21] - Downloaded at ~148 kB/s
[16:20:21] - Averaged speed for that direction ~156 kB/s
[16:20:21] + Received work.
[16:20:21] Trying to send all finished work units
[16:20:21] + No unsent completed units remaining.
[16:20:21] + Closed connections
[16:20:26]
[16:20:26] + Processing work unit
[16:20:26] Core required: FahCore_a2.exe
[16:20:26] Core found.
[16:20:26] Working on queue slot 02 [September 6 16:20:26 UTC]
[16:20:26] + Working ...
[16:20:26] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 02 -checkpoint 15 -verbose -lifeline 2494 -version 624'
[16:20:26]
[16:20:26] *------------------------------*
[16:20:26] Folding@Home Gromacs SMP Core
[16:20:26] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[16:20:26]
[16:20:26] Preparing to commence simulation
[16:20:26] - Ensuring status. Please wait.
[16:20:27] Called DecompressByteArray: compressed_data_size=1515461 data_size=24004801, decompressed_data_size=24004801 diff=0
[16:20:27] - Digital signature verified
[16:20:27]
[16:20:27] Project: 2675 (Run 2, Clone 93, Gen 141)
[16:20:27]
[16:20:27] Assembly optimizations on if available.
[16:20:27] Entering M.D.
[16:20:36] Project: 2675 (Run 2, Clone 93, Gen 141)
[16:20:36]
[16:20:37] Entering M.D.
NNODES=4, MYRANK=0, HOSTNAME=localhost.localdomain
NNODES=4, MYRANK=1, HOSTNAME=localhost.localdomain
NODEID=1 argc=20
NNODES=4, MYRANK=2, HOSTNAME=localhost.localdomain
NODEID=2 argc=20
NNODES=4, MYRANK=3, HOSTNAME=localhost.localdomain
NODEID=0 argc=20
NODEID=3 argc=20
Reading file work/wudata_02.tpr, VERSION 3.3.99_development_20070618 (single precision)
Note: tpx file_version 48, software version 68
NOTE: The tpr file used for this simulation is in an old format, for less memory usage and possibly more performance create a new tpr file with an up to date version of grompp
Making 1D domain decomposition 1 x 1 x 4
starting mdrun '22878 system in water'
35500004 steps, 71000.0 ps (continuing from step 35250004, 70500.0 ps).
^C[16:20:56] ***** Got an Activate signal (2)
application called MPI_Abort(MPI_COMM_WORLD, 102) - process 0
[16:20:56] Killing all core threads
Folding@Home Client Shutdown.
[0]0:Return code = 102
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 0, signaled with Quit
[jim@localhost fold2]$
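Both attempts die with the same "NaN detected at step 35250004" on Project 2675 (Run 2, Clone 93, Gen 141), and the second copy was interrupted before reporting any progress, so the unit itself looks suspect rather than the hardware. A minimal recovery sketch, assuming the stuck copy is still in queue slot 02 as the log shows, and that the v6 console client's -queueinfo and -delete options behave as documented:

Code:
# Hypothetical cleanup, run from the client directory with fah6 stopped.
cd ~/fold2
./fah6 -queueinfo            # list queue slots; confirm slot 02 holds P2675 R2 C93 G141
./fah6 -delete 2             # drop the stuck unit from queue slot 02
./fah6 -smp 4 -verbosity 9   # restart; the client should fetch a fresh assignment

If that still hands back the same unit, deleting the whole work/ directory plus queue.dat before restarting forces a completely clean download, at the cost of anything else sitting in the queue.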