FahMon shows it will take another 5d 15h 23mn to finish. It will not make the deadline of 2d 11h 55mn.
Bruce, we can add Project 2669 to the list of projects having this problem as well. Could someone look at 171.64.65.56 and see whether any more of these funky/small WUs are waiting to be assigned?
From the terminal window:
[04:46:43] Completed 237500 out of 250000 steps (95%)
[05:01:38] Completed 240000 out of 250000 steps (96%)
[05:15:02] - Autosending finished units... [August 11 05:15:02 UTC]
[05:15:02] Trying to send all finished work units
[05:15:02] + No unsent completed units remaining.
[05:15:02] - Autosend completed
[05:16:35] Completed 242500 out of 250000 steps (97%)
[05:31:31] Completed 245000 out of 250000 steps (98%)
[05:46:31] Completed 247500 out of 250000 steps (99%)
Writing final coordinates.
[06:01:29] Completed 250000 out of 250000 steps (100%)
Average load imbalance: 5.3 %
Part of the total run time spent waiting due to load imbalance: 3.5 %
Steps where the load balancing was limited by -rdd, -rcon and/or -dds: Z 0 %
Parallel run - timing based on wallclock.
               NODE (s)   Real (s)      (%)
       Time:  89679.000  89679.000    100.0
                       1d00h54:39
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:    144.606      6.073      0.482     49.821
gcq#0: Thanx for Using GROMACS - Have a Nice Day
[06:01:30] DynamicWrapper: Finished Work Unit: sleep=10000
[06:01:40]
[06:01:40] Finished Work Unit:
[06:01:40] - Reading up to 21168720 from "work/wudata_06.trr": Read 21168720
[06:01:40] trr file hash check passed.
[06:01:40] - Reading up to 27132440 from "work/wudata_06.xtc": Read 27132440
[06:01:40] xtc file hash check passed.
[06:01:40] edr file hash check passed.
[06:01:40] logfile size: 181474
[06:01:40] Leaving Run
[06:01:40] - Writing 48627386 bytes of core data to disk...
[06:01:41] ... Done.
[06:01:44] - Shutting down core
[06:01:44]
[06:01:44] Folding@home Core Shutdown: FINISHED_UNIT
Error encountered before initializing MPICH
[06:05:02] CoreStatus = 64 (100)
[06:05:02] Unit 6 finished with 65 percent of time to deadline remaining.
[06:05:02] Updated performance fraction: 0.655339
[06:05:02] Sending work to server
[06:05:02] Project: 2677 (Run 3, Clone 28, Gen 34)
[06:05:02] + Attempting to send results [August 11 06:05:02 UTC]
[06:05:02] - Reading file work/wuresults_06.dat from core
[06:05:02] (Read 48627386 bytes from disk)
[06:05:02] Connecting to http://171.64.65.56:8080/
[06:05:28] Posted data.
[06:05:28] Initial: 0000; - Uploaded at ~1319 kB/s
[06:05:38] - Averaged speed for that direction ~1062 kB/s
[06:05:38] + Results successfully sent
[06:05:38] Thank you for your contribution to Folding@Home.
[06:05:38] + Number of Units Completed: 58
[06:05:39] - Warning: Could not delete all work unit files (6): Core file absent
[06:05:39] Trying to send all finished work units
[06:05:39] + No unsent completed units remaining.
[06:05:39] - Preparing to get new work unit...
[06:05:39] + Attempting to get work packet
[06:05:39] - Will indicate memory of 1505 MB
[06:05:39] - Connecting to assignment server
[06:05:39] Connecting to http://assign.stanford.edu:8080/
[06:05:40] Posted data.
[06:05:40] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[06:05:40] + News From Folding@Home: Welcome to Folding@Home
[06:05:40] Loaded queue successfully.
[06:05:40] Connecting to http://171.64.65.56:8080/
[06:05:46] Posted data.
[06:05:46] Initial: 0000; - Receiving payload (expected size: 1509777)
[06:05:49] - Downloaded at ~491 kB/s
[06:05:49] - Averaged speed for that direction ~929 kB/s
[06:05:49] + Received work.
[06:05:49] Trying to send all finished work units
[06:05:49] + No unsent completed units remaining.
[06:05:49] + Closed connections
[06:05:49]
[06:05:49] + Processing work unit
[06:05:49] At least 4 processors must be requested.
Core required: FahCore_a2.exe
[06:05:49] Core found.
[06:05:49] Working on queue slot 07 [August 11 06:05:49 UTC]
[06:05:49] + Working ...
[06:05:49] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -suffix 07 -priority 96 -checkpoint 15 -verbose -lifeline 8230 -version 624'
[06:05:49]
[06:05:49] *------------------------------*
[06:05:49] Folding@Home Gromacs SMP Core
[06:05:49] Version 2.08 (Mon May 18 14:47:42 PDT 2009)
[06:05:49]
[06:05:49] Preparing to commence simulation
[06:05:49] - Ensuring status. Please wait.
[06:05:49] Called DecompressByteArray: compressed_data_size=1509265 data_size=23977801, decompressed_data_size=23977801 diff=0
[06:05:50] - Digital signature verified
[06:05:50]
[06:05:50] Project: 2669 (Run 7, Clone 4, Gen 185)
[06:05:50]
[06:05:50] Assembly optimizations on if available.
[06:05:50] Entering M.D.
[06:05:59] (Run 7, Clone 4, Gen 185)
[06:05:59]
[06:05:59] Entering M.D.
NNODES=4, MYRANK=0, HOSTNAME=RHEL4N23.lab1.com
NNODES=4, MYRANK=1, HOSTNAME=RHEL4N23.lab1.com
NNODES=4, MYRANK=2, HOSTNAME=RHEL4N23.lab1.com
NNODES=4, MYRANK=3, HOSTNAME=RHEL4N23.lab1.com
NODEID=0 argc=22
:-) G R O M A C S (-:
Groningen Machine for Chemical Simulation
:-) VERSION 4.0.99_development_20090425 (-:
Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2008, The GROMACS development team,
check out http://www.gromacs.org for more information.
:-) mdrun (-:
Reading file work/wudata_07.tpr, VERSION 3.3.99_development_20070618 (single precision)
NODEID=1 argc=22
NODEID=2 argc=22
NODEID=3 argc=22
Note: tpx file_version 48, software version 65
NOTE: The tpr file used for this simulation is in an old format, for less memory usage and possibly more performance create a new tpr file with an up to date version of grompp
Making 1D domain decomposition 1 x 1 x 4
starting mdrun '22869 system'
46500000 steps, 93000.0 ps (continuing from step 46250000, 92500.0 ps).
[06:06:27] Completed 0 out of 250000 steps (0%)
[07:35:07] Completed 2500 out of 250000 steps (1%)
[09:04:20] Completed 5000 out of 250000 steps (2%)
[10:33:32] Completed 7500 out of 250000 steps (3%)
[11:15:03] - Autosending finished units... [August 11 11:15:03 UTC]
[11:15:03] Trying to send all finished work units
[11:15:03] + No unsent completed units remaining.
[11:15:03] - Autosend completed
[12:02:48] Completed 10000 out of 250000 steps (4%)
[13:32:06] Completed 12500 out of 250000 steps (5%)
[15:01:22] Completed 15000 out of 250000 steps (6%)
[16:30:41] Completed 17500 out of 250000 steps (7%)
[17:15:03] - Autosending finished units... [August 11 17:15:03 UTC]
[17:15:03] Trying to send all finished work units
[17:15:03] + No unsent completed units remaining.
[17:15:03] - Autosend completed
[17:59:56] Completed 20000 out of 250000 steps (8%)
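As a rough cross-check of the FahMon estimate, the progress lines above can be extrapolated by hand. The sketch below is only illustrative: it assumes progress stays linear, and it assumes the 2d 11h 55mn deadline figure is the time left at the last progress line. The names `parse_progress`, `estimate_remaining` and `DEADLINE_LEFT` are made up for this example and are not part of the client or of FahMon.

```python
import re
from datetime import datetime, timedelta

# Progress lines for Project 2669, copied from the log above.
LOG = """\
[06:06:27] Completed 0 out of 250000 steps (0%)
[07:35:07] Completed 2500 out of 250000 steps (1%)
[09:04:20] Completed 5000 out of 250000 steps (2%)
[10:33:32] Completed 7500 out of 250000 steps (3%)
[12:02:48] Completed 10000 out of 250000 steps (4%)
[13:32:06] Completed 12500 out of 250000 steps (5%)
[15:01:22] Completed 15000 out of 250000 steps (6%)
[16:30:41] Completed 17500 out of 250000 steps (7%)
[17:59:56] Completed 20000 out of 250000 steps (8%)
"""

PROGRESS = re.compile(r"\[(\d\d:\d\d:\d\d)\] Completed (\d+) out of (\d+) steps")


def parse_progress(log):
    """Return a list of (timestamp, steps_done, steps_total) tuples."""
    points = []
    day = timedelta(0)
    for clock, done, total in PROGRESS.findall(log):
        stamp = datetime.strptime(clock, "%H:%M:%S") + day
        # The log has no dates, so treat a clock that goes backwards as midnight.
        if points and stamp < points[-1][0]:
            day += timedelta(days=1)
            stamp += timedelta(days=1)
        points.append((stamp, int(done), int(total)))
    return points


def estimate_remaining(points):
    """Extrapolate linearly from the first to the last progress line."""
    (t0, s0, total), (t1, s1, _) = points[0], points[-1]
    seconds_per_step = (t1 - t0).total_seconds() / (s1 - s0)
    return timedelta(seconds=seconds_per_step * (total - s1))


# Assumption: the FahMon deadline figure is the time left at the last progress line.
DEADLINE_LEFT = timedelta(days=2, hours=11, minutes=55)

remaining = estimate_remaining(parse_progress(LOG))
print("estimated time remaining:", remaining)
print("misses deadline:", remaining > DEADLINE_LEFT)
```

With these numbers the unit is running at about 89 minutes per 1%, against about 15 minutes per 1% for the Project 2677 unit just before it, and the linear extrapolation comes out at roughly 5.7 days remaining, in line with what FahMon reports.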
Have a good day.
BrokenWolf