
Project: 2665 (Run 2, Clone 276, Gen 180) hangs

Posted: Tue Feb 09, 2010 6:02 pm
by gizmo
I've been having problems with 2665s in general, but a friend told me I should post my next one here so that you guys could take a look at it. Here's the output from my log for several different 2665s, all of which hang with the log entry "Entering M.D.":

Any help would be greatly appreciated, as I'm sure you guys don't really want me deleting WUs and retrying the request ad nauseam in order to get around this.


# SMP Client ##################################################################
###############################################################################

                       Folding@Home Client Version 6.02

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/fah
Executable: ./fah6
Arguments: -smp -verbosity 9 

[17:16:58] - Ask before connecting: No
[17:16:58] - User name: ChrisRichards (Team 45)
[17:16:58] - User ID: 1440AB7A28AA1143
[17:16:58] - Machine ID: 1
[17:16:58] 
[17:16:58] Work directory not found. Creating...
[17:16:58] Could not open work queue, generating new queue...
[17:16:58] - Autosending finished units...
[17:16:58] Trying to send all finished work units
[17:16:58] + No unsent completed units remaining.
[17:16:58] - Autosend completed
[17:16:58] - Preparing to get new work unit...
[17:16:58] + Attempting to get work packet
[17:16:58] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 15, Stepping: 10
[17:16:58] - Connecting to assignment server
[17:16:58] Connecting to http://assign.stanford.edu:8080/
[17:16:59] Posted data.
[17:16:59] Initial: 40AB; - Successful: assigned to (171.64.65.64).
[17:16:59] + News From Folding@Home: Welcome to Folding@Home
[17:17:00] Loaded queue successfully.
[17:17:00] Connecting to http://171.64.65.64:8080/
[17:17:05] Posted data.
[17:17:05] Initial: 0000; - Receiving payload (expected size: 4839252)
[17:18:19] - Downloaded at ~63 kB/s
[17:18:19] - Averaged speed for that direction ~63 kB/s
[17:18:19] + Received work.
[17:18:19] + Closed connections
[17:18:19] 
[17:18:19] + Processing work unit
[17:18:19] Core required: FahCore_a1.exe
[17:18:19] Core found.
[17:18:19] Working on Unit 01 [February 9 17:18:19]
[17:18:19] + Working ...
[17:18:19] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 01 -checkpoint 15 -verbose -lifeline 25232 -version 602'

[17:18:19] 
[17:18:19] *------------------------------*
[17:18:19] Folding@Home Gromacs SMP Core
[17:18:19] Version 1.74 (November 27, 2006)
[17:18:19] 
[17:18:19] Preparing to commence simulation
[17:18:19] - Ensuring status. Please wait- Created dyn
[17:18:19] - Files status OK
[17:18:20] - Expanded 4838740 -> 24810145 (decompressed 512.7 percent)
[17:18:20] - Starting from initial work packet
[17:18:20] 
[17:18:20] Project: 2665 (Run 2, Clone 276, Gen 180)
[17:18:20] 
[17:18:20] Assembly optimizations on if available.
[17:18:20] Entering M.D.
[17:18:37] percent)
[17:18:37] - Starting from initial work packet
[17:18:37] 
[17:18:37] Project: 2665 (Run 2, Clone 276, Gen 180)
[17:18:37] 
[17:18:37] Entering M.D.
[17:19:58] ***** Got a SIGTERM signal (15)
[17:19:58] Killing all core threads

Folding@Home Client Shutdown.


--- Opening Log file [February 9 17:31:10] 


# SMP Client ##################################################################
###############################################################################

                       Folding@Home Client Version 6.02

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/fah
Executable: ./fah6
Arguments: -smp -verbosity 9 

[17:31:10] - Ask before connecting: No
[17:31:10] - User name: ChrisRichards (Team 45)
[17:31:10] - User ID: 1440AB7A28AA1143
[17:31:10] - Machine ID: 1
[17:31:10] 
[17:31:10] Work directory not found. Creating...
[17:31:10] Could not open work queue, generating new queue...
[17:31:10] - Autosending finished units...
[17:31:10] Trying to send all finished work units
[17:31:10] + No unsent completed units remaining.
[17:31:10] - Autosend completed
[17:31:10] - Preparing to get new work unit...
[17:31:10] + Attempting to get work packet
[17:31:10] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 15, Stepping: 10
[17:31:10] - Connecting to assignment server
[17:31:10] Connecting to http://assign.stanford.edu:8080/
[17:31:11] Posted data.
[17:31:11] Initial: 40AB; - Successful: assigned to (171.64.65.64).
[17:31:11] + News From Folding@Home: Welcome to Folding@Home
[17:31:11] Loaded queue successfully.
[17:31:11] Connecting to http://171.64.65.64:8080/
[17:31:17] Posted data.
[17:31:17] Initial: 0000; - Receiving payload (expected size: 4839252)
[17:32:24] - Downloaded at ~70 kB/s
[17:32:24] - Averaged speed for that direction ~70 kB/s
[17:32:24] + Received work.
[17:32:24] + Closed connections
[17:32:24] 
[17:32:24] + Processing work unit
[17:32:24] Core required: FahCore_a1.exe
[17:32:24] Core found.
[17:32:24] Working on Unit 01 [February 9 17:32:24]
[17:32:24] + Working ...
[17:32:24] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 01 -checkpoint 15 -verbose -lifeline 25412 -version 602'

[17:32:24] 
[17:32:24] *------------------------------*
[17:32:24] Folding@Home Gromacs SMP Core
[17:32:24] Version 1.74 (November 27, 2006)
[17:32:24] 
[17:32:24] Preparing to commence simulation
[17:32:24] - Ensuring status. Please wait- Previous termination of core was improper.
[17:32:24] - Going to use standard loops.
[17:32:24] - Files status OK
[17:32:26] Starting from initial work packet
[17:32:26] 
[17:32:26] Project: 2665 (Run 2, Clone 276, Gen 180)
[17:32:26] 
[17:32:26] Assembly optimizations on if available.
[17:32:26] Entering M.D.
[17:32:27] n 180)
[17:32:27] 
[17:32:27] Entering M.D.
[17:32:42] om initial work packet
[17:32:42] 
[17:32:42] Project: 2665 (Run 2, Clone 276, Gen 180)
[17:32:42] 
[17:32:42] Entering M.D.
[17:42:50] ***** Got a SIGTERM signal (15)
[17:42:50] Killing all core threads

Folding@Home Client Shutdown.


--- Opening Log file [February 9 17:43:26] 


# SMP Client ##################################################################
###############################################################################

                       Folding@Home Client Version 6.02

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/fah
Executable: ./fah6
Arguments: -smp -verbosity 9 

[17:43:26] - Ask before connecting: No
[17:43:26] - User name: ChrisRichards (Team 45)
[17:43:26] - User ID: 1440AB7A28AA1143
[17:43:26] - Machine ID: 1
[17:43:26] 
[17:43:26] Work directory not found. Creating...
[17:43:26] Could not open work queue, generating new queue...
[17:43:26] - Preparing to get new work unit...
[17:43:26] - Autosending finished units...
[17:43:26] + Attempting to get work packet
[17:43:26] Trying to send all finished work units
[17:43:26] - Detect CPU.[17:43:26] + No unsent completed units remaining.
[17:43:26] - Autosend completed
 Vendor: GenuineIntel, Family: 6, Model: 15, Stepping: 10


--- Opening Log file [February 9 17:47:22] 


# SMP Client ##################################################################
###############################################################################

                       Folding@Home Client Version 6.02

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/fah
Executable: ./fah6
Arguments: -smp -verbosity 9 

[17:47:22] - Ask before connecting: No
[17:47:22] - User name: ChrisRichards (Team 45)
[17:47:22] - User ID: 1440AB7A28AA1143
[17:47:22] - Machine ID: 1
[17:47:22] 
[17:47:22] Work directory not found. Creating...
[17:47:22] Could not open work queue, generating new queue...
[17:47:22] - Autosending finished units...
[17:47:22] Trying to send all finished work units
[17:47:22] + No unsent completed units remaining.
[17:47:22] - Autosend completed
[17:47:22] - Preparing to get new work unit...
[17:47:22] + Attempting to get work packet
[17:47:22] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 15, Stepping: 10
[17:47:22] - Connecting to assignment server
[17:47:22] Connecting to http://assign.stanford.edu:8080/
[17:47:22] Posted data.
[17:47:22] Initial: 40AB; - Successful: assigned to (171.64.65.64).
[17:47:22] + News From Folding@Home: Welcome to Folding@Home
[17:47:22] Loaded queue successfully.
[17:47:22] Connecting to http://171.64.65.64:8080/
[17:47:27] Posted data.
[17:47:27] Initial: 0000; - Receiving payload (expected size: 4839252)
[17:48:31] - Downloaded at ~73 kB/s
[17:48:31] - Averaged speed for that direction ~73 kB/s
[17:48:31] + Received work.
[17:48:31] + Closed connections
[17:48:31] 
[17:48:31] + Processing work unit
[17:48:31] Core required: FahCore_a1.exe
[17:48:31] Core found.
[17:48:31] Working on Unit 01 [February 9 17:48:31]
[17:48:31] + Working ...
[17:48:31] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 01 -checkpoint 15 -verbose -lifeline 25593 -version 602'

[17:48:31] 
[17:48:31] *------------------------------*
[17:48:31] Folding@Home Gromacs SMP Core
[17:48:31] Version 1.74 (November 27, 2006)
[17:48:31] 
[17:48:31] Preparing to commence simulation
[17:48:31] - Ensuring status. Please wait.
[17:48:32] - Starting from initial work packet
[17:48:32] 
[17:48:32] Project: 2665 (Run 2, Clone 276, Gen 180)
[17:48:32] 
[17:48:32] Assembly optimizations on if available.
[17:48:32] Entering M.D.
[17:48:49] percent)
[17:48:49] - Starting from initial work packet
[17:48:49] 
[17:48:49] Project: 2665 (Run 2, Clone 276, Gen 180)
[17:48:49] 
[17:48:49] Entering M.D.
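
For anyone who wants to check a long log for the same pattern without reading every session by hand, here is a minimal Python sketch. It splits the log into client sessions at the "Folding@Home Client Version" banner and reports each session that logged "Entering M.D." but no progress line afterwards. The function name `find_stalled_sessions` and the `progress_marker` default of "Completed" are assumptions made for illustration; no progress line appears anywhere in the log above, so adjust the marker to whatever a healthy run prints on your setup.

```python
import re

# Matches client log lines such as "[17:18:20] Entering M.D."
TIMESTAMPED = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] ?(.*)$")


def find_stalled_sessions(log_text, progress_marker="Completed"):
    """Return sessions that entered M.D. and never logged progress.

    progress_marker is an assumed substring of a healthy progress line;
    it is not taken from the log in this thread.
    """
    sessions = []
    current = None
    for raw in log_text.splitlines():
        # Each client start prints a version banner; treat it as a new session.
        if "Folding@Home Client Version" in raw:
            current = {"project": None, "entered_md": None,
                       "progress": False, "sigterm": None}
            sessions.append(current)
            continue
        match = TIMESTAMPED.match(raw.strip())
        if current is None or match is None:
            continue
        stamp, message = match.groups()
        if message.startswith("Project:"):
            current["project"] = message
        elif message.startswith("Entering M.D."):
            # Keep only the first occurrence; the core repeats this line.
            if current["entered_md"] is None:
                current["entered_md"] = stamp
        elif progress_marker in message and current["entered_md"]:
            current["progress"] = True
        elif "Got a SIGTERM" in message:
            current["sigterm"] = stamp
    return [s for s in sessions if s["entered_md"] and not s["progress"]]
```

Reading the log file into a string and passing it to `find_stalled_sessions` returns one dictionary per stalled session, with the project line, the time of the first "Entering M.D." entry, and the SIGTERM time if the client was killed.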

Re: Project: 2665 (Run 2, Clone 276, Gen 180) hangs

Posted: Tue Feb 09, 2010 7:19 pm
by toTOW
No one has been able to return this WU yet ...