Project: 2665 (Run 2, Clone 156, Gen 77)

Posted: Sat Aug 15, 2009 8:41 pm
by patonb
Very odd EUE on my SMP client.

Code:

--- Opening Log file [August 16 14:55:11 UTC] 


# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.24R3

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\fah
Executable: C:\fah\fah.exe
Arguments: -smp -deino -verbosity 9 

[14:55:11] - Ask before connecting: No
[14:55:11] - User name: patonb (Team 12864)
[14:55:11] - User ID: 74791DF51B1E6EC6
[14:55:11] - Machine ID: 1
[14:55:11] 
[14:55:12] Loaded queue successfully.
[14:55:12] 
[14:55:12] - Autosending finished units... [August 16 14:55:12 UTC]
[14:55:12] + Processing work unit
[14:55:12] Trying to send all finished work units
[14:55:12] Work type a1 not eligible for variable processors
[14:55:12] + No unsent completed units remaining.
[14:55:12] Core required: FahCore_a1.exe
[14:55:12] - Autosend completed
[14:55:12] Core found.
[14:55:12] Working on queue slot 01 [August 16 14:55:12 UTC]
[14:55:12] + Working ...
[14:55:12] - Calling 'mpiexec -np 4 -channel shm -env MPICH_USE_SMP_OPTIMIZATIONS 1 -host 127.0.0.1 FahCore_a1.exe -dir work/ -suffix 01 -checkpoint 15 -verbose -lifeline 124 -version 624'

[14:55:16] 
[14:55:16] *------------------------------*
[14:55:16] Folding@Home Gromacs SMP Core
[14:55:16] Version 1.76 (February 23, 2008)
[14:55:16] 
[14:55:16] Preparing to commence simulation
[14:55:17] - Ensuring status. Please wait.
[14:55:34] - Looking at optimizations...
[14:55:34] - Working with standard loops on this execution.
[14:55:34] - Previous termination of core was improper.
[14:55:34] - Going to use standard loops.
[14:55:34] - Files status OK
[14:55:54] - Expanded 4823587 -> 24810145 (decompressed 514.3 percent)
[14:55:54] 
[14:55:54] Project: 2665 (Run 2, Clone 156, Gen 77)
[14:55:54] 
[14:56:07] Entering M.D.
[14:56:20] Calling FAH init
[14:56:22] Read topology
[14:56:23] s
[14:56:23] Writing local files
[14:56:23] int)
[14:56:23] Read checkpoint
[14:56:23] Protein: HGG with glycosylations
[14:56:23] Writing local files
[14:56:37] Extra SSE boost OK.
[14:56:38] Writing local files
[14:56:38] Completed 0 out of 250000 steps  (0 percent)
[15:11:40] Timered checkpoint triggered.
[15:26:40] Timered checkpoint triggered.
[15:41:43] Timered checkpoint triggered.
[15:49:02] Writing local files
[15:49:02] Completed 2500 out of 250000 steps  (1 percent)
[16:04:04] Timered checkpoint triggered.
[16:19:05] Timered checkpoint triggered.
[16:34:07] Timered checkpoint triggered.
[16:41:36] Writing local files
[16:41:36] Completed 5000 out of 250000 steps  (2 percent)
[16:56:40] Timered checkpoint triggered.
[17:11:41] Timered checkpoint triggered.
[17:26:43] Timered checkpoint triggered.
[17:35:06] Writing local files
[17:35:06] Completed 7500 out of 250000 steps  (3 percent)
[17:50:07] Timered checkpoint triggered.
[18:05:09] Timered checkpoint triggered.
[18:20:11] Timered checkpoint triggered.
[18:30:23] Writing local files
[18:30:24] Completed 10000 out of 250000 steps  (4 percent)
[18:45:26] Timered checkpoint triggered.
[19:00:26] Timered checkpoint triggered.
[19:15:27] Timered checkpoint triggered.
[19:24:49] Writing local files
[19:24:49] Completed 12500 out of 250000 steps  (5 percent)
[19:39:50] Timered checkpoint triggered.
[19:54:52] Timered checkpoint triggered.
[20:09:54] Timered checkpoint triggered.
[20:18:43] Writing local files
[20:18:43] Completed 15000 out of 250000 steps  (6 percent)
[20:19:05] Warning:  long 1-4 interactions
[20:19:09] Gromacs cannot continue further.
[20:19:09] Going to send back what have done.
[20:19:09] logfile size: 29854
[20:19:09] - Writing 30390 bytes of core data to disk...
[20:19:09]   ... Done.
[20:19:09] - Failed to delete work/wudata_01.dyn
[20:19:09] - Failed to delete work/wudata_01.sas
[20:19:09] - Failed to delete work/wudata_01.goe
[20:19:09] Warning:  check for stray files
[20:19:10] 
[20:19:10] Folding@home Core Shutdown: EARLY_UNIT_END
[20:19:10] 
[20:19:10] Folding@home Core Shutdown: EARLY_UNIT_END
[20:19:14] CoreStatus = 63 (99)
[20:19:14] + Error starting Folding@Home core.
[20:19:19] 
[20:19:19] + Processing work unit
[20:19:19] Work type a1 not eligible for variable processors
[20:19:19] Core required: FahCore_a1.exe
[20:19:19] Core found.
[20:19:19] Working on queue slot 01 [August 16 20:19:19 UTC]
[20:19:19] + Working ...
[20:19:19] - Calling 'mpiexec -np 4 -channel shm -env MPICH_USE_SMP_OPTIMIZATIONS 1 -host 127.0.0.1 FahCore_a1.exe -dir work/ -suffix 01 -checkpoint 15 -verbose -lifeline 124 -version 624'

[20:19:24] 
[20:19:24] *------------------------------*
[20:19:24] Folding@Home Gromacs SMP Core
[20:19:24] Version 1.76 (February 23, 2008)
[20:19:24] 
[20:19:24] Preparing to commence simulation
[20:19:24] - Ensuring status. Please wait- Created dyn
[20:19:24] - Files status OK
[20:19:24] 
[20:19:24] Folding@home Core Shutdown: MISSING_WORK_FILES
[20:19:24] Finalizing output
[20:19:41] ation of core was improper.
[20:19:41] - Files status OK
[20:21:41] 
[20:21:41] Folding@home Core Shutdown: MISSING_WORK_FILES
[20:21:41] Finalizing output
[20:21:44] CoreStatus = 1 (1)
[20:21:44] Sending work to server
[20:21:44] Project: 2665 (Run 2, Clone 156, Gen 77)


[20:21:44] + Attempting to send results [August 16 20:21:44 UTC]
[20:21:44] - Reading file work/wuresults_01.dat from core
[20:21:44]   (Read 30390 bytes from disk)
[20:21:44] Connecting to http://171.64.65.64:8080/
[20:21:45] Posted data.
[20:21:45] Initial: 0000; - Uploaded at ~15 kB/s
[20:21:46] - Averaged speed for that direction ~40 kB/s
[20:21:46] + Results successfully sent
[20:21:46] Thank you for your contribution to Folding@Home.
[20:22:06] - Warning: Could not delete all work unit files (1): Core returned invalid code
[20:22:06] Trying to send all finished work units
[20:22:06] + No unsent completed units remaining.
[20:22:06] - Preparing to get new work unit...
[20:22:06] Cleaning up work directory
[20:22:06] + Attempting to get work packet
[20:22:06] - Will indicate memory of 960 MB
[20:22:06] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 15, Stepping: 13
[20:22:06] - Connecting to assignment server
[20:22:06] Connecting to http://assign.stanford.edu:8080/
[20:22:06] Posted data.
[20:22:06] Initial: 40AB; - Successful: assigned to (171.64.65.64).
[20:22:06] + News From Folding@Home: Welcome to Folding@Home
[20:22:06] Loaded queue successfully.
[20:22:06] Connecting to http://171.64.65.64:8080/
[20:22:12] Posted data.
[20:22:12] Initial: 0000; - Receiving payload (expected size: 4720548)
[20:22:40] - Downloaded at ~164 kB/s
[20:22:40] - Averaged speed for that direction ~166 kB/s
[20:22:40] + Received work.
[20:22:40] Trying to send all finished work units
[20:22:40] + No unsent completed units remaining.
[20:22:40] + Closed connections

Re: Project: 2665 (Run 2, Clone 156, Gen 77)

Posted: Sat Aug 15, 2009 8:45 pm
by MtM
Looks like another very slow one. Is the download size around 1.5 MB again?

Re: Project: 2665 (Run 2, Clone 156, Gen 77)

Posted: Sat Aug 15, 2009 10:20 pm
by patonb
Couldn't tell you... I noticed it when FahMon claimed I was at 0% done again today.

Actually, the new one is taking the same time per frame. It too is a 2665 (Run 3, Clone 299, Gen 107).

Code:

[20:51:42] - Calling 'mpiexec -np 4 -channel shm -env MPICH_USE_SMP_OPTIMIZATIONS 1 -host 127.0.0.1 FahCore_a1.exe -dir work/ -suffix 02 -checkpoint 15 -verbose -lifeline 2372 -version 624'

[20:52:11] 
[20:52:11] *------------------------------*
[20:52:11] Folding@Home Gromacs SMP Core
[20:52:11] Version 1.76 (February 23, 2008)
[20:52:11] 
[20:52:11] Preparing to commence simulation
[20:52:11] - Ensuring status. Please wait.
[20:52:28] - Looking at optimizations...
[20:52:28] - Working with standard loops on this execution.
[20:52:28] - Previous termination of core was improper.
[20:52:28] - Going to use standard loops.
[20:52:28] - Files status OK
[20:53:02] - Expanded 4720036 -> 24426905 (decompressed 517.5 percent)
[20:53:04] 
[20:53:04] Project: 2665 (Run 3, Clone 299, Gen 107)
[20:53:04] 
[20:53:07] Entering M.D.
[20:53:16] Calling FAH init
[20:53:18] Read topology
[20:53:18] (Starting from checkpoint)
[20:53:18] Read checkpoint
[20:53:19] s  (0 percent)
[20:53:19]  water
[20:53:19] Writing local files
[20:53:19] Completed 718 out of 250000 steps  (0 percent)
[20:53:33] Extra SSE boost OK.
[21:08:20] Timered checkpoint triggered.
[21:23:22] Timered checkpoint triggered.
[21:29:21] Writing local files
[21:29:22] Completed 2500 out of 250000 steps  (1 percent)
[21:44:22] Timered checkpoint triggered.
[21:59:23] Timered checkpoint triggered.
[22:14:22] Timered checkpoint triggered.
[22:17:52] Writing local files
[22:17:52] Completed 5000 out of 250000 steps  (2 percent)

Re: Project: 2665 (Run 2, Clone 156, Gen 77)

Posted: Sat Aug 15, 2009 10:58 pm
by MtM
Show the FAHlog part where it says 'Receiving payload: xxxx', even if it's just for my curiosity (I think it's almost certain). It's before the WU starts, of course :)
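For anyone who wants to pull that line out of a long FAHlog without scrolling, here is a minimal Python sketch. It assumes the client writes the line in the form seen in the logs in this thread, `Receiving payload (expected size: N)`. The function name `payload_sizes` is made up for illustration and is not part of any Folding@Home tool.

```python
import re

# Pattern assumed from the log excerpts in this thread, e.g.
# "[20:22:12] Initial: 0000; - Receiving payload (expected size: 4720548)"
PAYLOAD = re.compile(r"Receiving payload \(expected size: (\d+)\)")


def payload_sizes(log_text):
    """Return every expected payload size (in bytes) found in the log text."""
    return [int(size) for size in PAYLOAD.findall(log_text)]


if __name__ == "__main__":
    sample = "[20:22:12] Initial: 0000; - Receiving payload (expected size: 4720548)"
    print(payload_sizes(sample))
```

To use it on a real log, read the file contents into a string and pass that string to `payload_sizes`.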

Re: Project: 2665 (Run 2, Clone 156, Gen 77)

Posted: Sat Aug 15, 2009 11:13 pm
by patonb
Oops... I forgot I had a reboot right after I downloaded that unit.

Code:

[20:21:44] Project: 2665 (Run 2, Clone 156, Gen 77)


[20:21:44] + Attempting to send results [August 16 20:21:44 UTC]
[20:21:44] - Reading file work/wuresults_01.dat from core
[20:21:44]   (Read 30390 bytes from disk)
[20:21:44] Connecting to http://171.64.65.64:8080/
[20:21:45] Posted data.
[20:21:45] Initial: 0000; - Uploaded at ~15 kB/s
[20:21:46] - Averaged speed for that direction ~40 kB/s
[20:21:46] + Results successfully sent
[20:21:46] Thank you for your contribution to Folding@Home.
[20:22:06] - Warning: Could not delete all work unit files (1): Core returned invalid code
[20:22:06] Trying to send all finished work units
[20:22:06] + No unsent completed units remaining.
[20:22:06] - Preparing to get new work unit...
[20:22:06] Cleaning up work directory
[20:22:06] + Attempting to get work packet
[20:22:06] - Will indicate memory of 960 MB
[20:22:06] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 15, Stepping: 13
[20:22:06] - Connecting to assignment server
[20:22:06] Connecting to http://assign.stanford.edu:8080/
[20:22:06] Posted data.
[20:22:06] Initial: 40AB; - Successful: assigned to (171.64.65.64).
[20:22:06] + News From Folding@Home: Welcome to Folding@Home
[20:22:06] Loaded queue successfully.
[20:22:06] Connecting to http://171.64.65.64:8080/
[20:22:12] Posted data.
[20:22:12] Initial: 0000; - Receiving payload (expected size: 4720548)
[20:22:40] - Downloaded at ~164 kB/s
[20:22:40] - Averaged speed for that direction ~166 kB/s
[20:22:40] + Received work.
[20:22:40] Trying to send all finished work units
[20:22:40] + No unsent completed units remaining.
[20:22:40] + Closed connections
[20:22:45] 
[20:22:45] + Processing work unit
[20:22:45] Work type a1 not eligible for variable processors
[20:22:45] Core required: FahCore_a1.exe
[20:22:45] Core found.
[20:22:45] Working on queue slot 02 [August 16 20:22:45 UTC]
[20:22:45] + Working ...
[20:22:45] - Calling 'mpiexec -

Re: Project: 2665 (Run 2, Clone 156, Gen 77)

Posted: Sat Aug 15, 2009 11:23 pm
by MtM
That seems like a normal WU?

It's not like the others posted, which were running on one FahCore. Maybe, as 7im suggested, closing the client and then restarting it will work. Or close it and watch which processes start to take more resources.

There's a problem there, but I don't think it's the WU.

Re: Project: 2665 (Run 2, Clone 156, Gen 77)

Posted: Sun Aug 16, 2009 1:21 am
by patonb
Oddly, the two are running at the same speed... it is only an E2180...

I just can't remember my SMP client EUEing before.
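The "same speed" comparison can be checked straight from the `Completed ... steps` lines in the logs above. The following is a rough Python sketch, assuming the timestamp and progress format shown in this thread. The function name `frame_times` is hypothetical. Timestamps carry no date, so a negative gap is treated as a single midnight rollover.

```python
import re
from datetime import datetime, timedelta

# Format assumed from this thread's logs, e.g.
# "[15:49:02] Completed 2500 out of 250000 steps  (1 percent)"
PROGRESS = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] Completed (\d+) out of (\d+) steps")


def frame_times(log_lines):
    """Return (steps_advanced, seconds_elapsed) between consecutive progress lines."""
    stamps = []
    for line in log_lines:
        match = PROGRESS.match(line.strip())
        if match:
            when = datetime.strptime(match.group(1), "%H:%M:%S")
            stamps.append((when, int(match.group(2))))

    deltas = []
    for (t0, s0), (t1, s1) in zip(stamps, stamps[1:]):
        if s1 <= s0:
            continue  # step count did not advance (e.g. a restart); skip this pair
        elapsed = t1 - t0
        if elapsed < timedelta(0):
            elapsed += timedelta(days=1)  # assume the log crossed midnight once
        deltas.append((s1 - s0, int(elapsed.total_seconds())))
    return deltas


if __name__ == "__main__":
    first_wu = [
        "[15:49:02] Completed 2500 out of 250000 steps  (1 percent)",
        "[16:41:36] Completed 5000 out of 250000 steps  (2 percent)",
        "[17:35:06] Completed 7500 out of 250000 steps  (3 percent)",
    ]
    second_wu = [
        "[21:29:22] Completed 2500 out of 250000 steps  (1 percent)",
        "[22:17:52] Completed 5000 out of 250000 steps  (2 percent)",
    ]
    print(frame_times(first_wu))   # [(2500, 3154), (2500, 3210)]
    print(frame_times(second_wu))  # [(2500, 2910)]
```

On the lines quoted in this thread, that works out to roughly 52 to 54 minutes per percent on the first unit and about 48 and a half minutes on the second, so the two are indeed in the same range.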