2665 (Run 2, Clone 159, Gen 87) Stuck @ 19%

Moderators: Site Moderators, FAHC Science Team

parkut
Posts: 365
Joined: Tue Feb 12, 2008 7:33 am
Hardware configuration: Running exclusively Linux headless blades. All are dedicated crunching machines.
Location: SE Michigan, USA

Post by parkut »

Maybe a bad WU?

CPU utilization drops to zero. I restarted the client, and 11
minutes later the process shut down with a "Warning: long 1-4 interactions"
notice. I restarted again, and it shut down again 11 minutes later.
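
To check the 11-minute pattern against a log, here is a minimal Python sketch. It assumes only the `[HH:MM:SS]` prefix the client puts on each line; the names `elapsed` and `run_length` are made up for illustration and are not part of the client. The sample lines are trimmed from the log in this post.

```python
import re
from datetime import datetime, timedelta

# Matches the "[HH:MM:SS]" prefix the client writes on each log line.
STAMP = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]")


def elapsed(start: str, end: str) -> timedelta:
    """Time from start to end ('HH:MM:SS'); assumes they are under 24 h apart."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        # The log carries no date on each line, so a negative gap means
        # the run crossed midnight.
        delta += timedelta(days=1)
    return delta


def run_length(log_lines):
    """Elapsed time between the first and last timestamped line, or None."""
    stamps = [m.group(1) for line in log_lines if (m := STAMP.match(line))]
    if not stamps:
        return None
    return elapsed(stamps[0], stamps[-1])


sample = [
    "[04:31:17] + Processing work unit",
    "[04:31:44] Extra SSE boost OK.",
    "[04:42:25] Warning:  long 1-4 interactions",
    "[04:42:29] Killing all core threads",
]
print(run_length(sample))
```

Lines without a timestamp (the GROMACS banner, for example) are simply skipped.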


model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
cpu MHz : 2400.073
cache size : 4096 KB
Memory: 1.96 GB physical, 1023.99 MB virtual
...
Client Version 6.24beta
Core: FahCore_a1.exe
Core Version 1.74 (November 27, 2006)
Current Work Unit
-----------------
Name: p2665_IBX in water
Tag: P2665R2C159G87
Download time: February 12 21:38:21
Due time: February 18 21:38:21
Progress: 19% [|_________]
...
[0]0:Return code = 102
[0]1:Return code = 0, signaled with Segmentation fault
[0]2:Return code = 0, signaled with Segmentation fault
[0]3:Return code = 0, signaled with Segmentation fault
...

Code:

[03:05:33] Completed 42500 out of 250000 steps  (17 percent)
[03:20:33] Timered checkpoint triggered.
[03:24:43] Writing local files
[03:24:43] Completed 45000 out of 250000 steps  (18 percent)
[03:39:43] Timered checkpoint triggered.
[03:43:56] Writing local files
[03:43:56] Completed 47500 out of 250000 steps  (19 percent)
[03:53:34] 
[03:53:34] Folding@home Core Shutdown: INTERRUPTED
      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2004, The GROMACS development team,
            check out http://www.gromacs.org for more information.

        This inclusion of Gromacs code in the Folding@Home Core is under
        a special license (see http://folding.stanford.edu/gromacs.html)
         specially granted to Stanford by the copyright holders. If you
          are interested in using Gromacs, visit www.gromacs.org where
                you can download a free version of Gromacs under
         the terms of the GNU General Public License (GPL) as published
       by the Free Software Foundation; either version 2 of the License,
                     or (at your option) any later version.

[03:53:38] CoreStatus = 66 (102)
[03:53:38] + Shutdown requested by user. Exiting.***** Got a SIGTERM signal (15)
[03:53:38] Killing all core threads

Folding@Home Client Shutdown.

Note: Please read the license agreement (fah6 -license). Further 
use of this software requires that you have read and accepted this agreement.

2 cores detected


--- Opening Log file [February 13 04:31:17 UTC] 


# Linux SMP Console Edition ###################################################
###############################################################################

                       Folding@Home Client Version 6.24beta

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /root/fah6
Executable: ./fah6
Arguments: -betateam -verbosity 9 -smp 

[04:31:17] - Ask before connecting: No
[04:31:17] - Proxy: 192.168.1.188:3128
[04:31:17] - User name: parkut (Team 4)
[04:31:17] - User ID: 3997F7DC622EE12D
[04:31:17] - Machine ID: 1
[04:31:17] 
[04:31:17] Loaded queue successfully.
[04:31:17] 
[04:31:17] + Processing work unit
[04:31:17] Work type a1 not eligible for variable processors
[04:31:17] Core required: FahCore_a1.exe
[04:31:17] Core found.
[04:31:17] - Autosending finished units... [February 13 04:31:17 UTC]
[04:31:17] Trying to send all finished work units
[04:31:17] + No unsent completed units remaining.
[04:31:17] - Autosend completed
[04:31:17] Working on queue slot 04 [February 13 04:31:17 UTC]
[04:31:17] + Working ...
[04:31:17] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 04 -checkpoint 15 -verbose -lifeline 5881 -version 624'

[04:31:17] 
[04:31:17] *------------------------------*
[04:31:17] Folding@Home Gromacs SMP Core
[04:31:17] Version 1.74 (November 27, 2006)
[04:31:17] 
[04:31:17] Preparing to commence simulation
[04:31:17] - Ensuring status. Please wait.
[04:31:18] 
[04:31:18] Project: 2665 (Run 2, Clone 159, Gen 87)
[04:31:18] 
[04:31:18] Assembly optimizations on if available.
[04:31:18] Entering M.D.
[04:31:35] - Expanded 4817194 -> 24810145 (decompressed 515.0 percent)
[04:31:35] 
[04:31:35] Project: 2665 (Run 2, Clone 159, Gen 87)
[04:31:35] 
[04:31:35] Entering M.D.
[04:31:41] Calling FAH init
[04:31:42] Read topology
[04:31:43]  47500 out of 250000 sCompleted 47500 out of 250000 steps  (19 percent)
[04:31:43] ons
[04:31:43] Writing local files
[04:31:43] Completed 47500 out of 250000 steps  (19 percent)
[04:31:44] Extra SSE boost OK.
[04:42:25] Warning:  long 1-4 interactions
[04:42:25] 
[04:42:25] Folding@home Core Shutdown: INTERRUPTED
      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2004, The GROMACS development team,
            check out http://www.gromacs.org for more information.

        This inclusion of Gromacs code in the Folding@Home Core is under
        a special license (see http://folding.stanford.edu/gromacs.html)
         specially granted to Stanford by the copyright holders. If you
          are interested in using Gromacs, visit www.gromacs.org where
                you can download a free version of Gromacs under
         the terms of the GNU General Public License (GPL) as published
       by the Free Software Foundation; either version 2 of the License,
                     or (at your option) any later version.

[04:42:29] CoreStatus = 66 (102)
[04:42:29] + Shutdown requested by user. Exiting.***** Got a SIGTERM signal (15)
[04:42:29] Killing all core threads

Folding@Home Client Shutdown.

Note: Please read the license agreement (fah6 -license). Further 
use of this software requires that you have read and accepted this agreement.

2 cores detected


--- Opening Log file [February 13 05:31:16 UTC] 


# Linux SMP Console Edition ###################################################
###############################################################################

                       Folding@Home Client Version 6.24beta

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /root/fah6
Executable: ./fah6
Arguments: -betateam -verbosity 9 -smp 

[05:31:16] - Ask before connecting: No
[05:31:16] - Proxy: 192.168.1.188:3128
[05:31:16] - User name: parkut (Team 4)
[05:31:16] - User ID: 3997F7DC622EE12D
[05:31:16] - Machine ID: 1
[05:31:16] 
[05:31:16] Loaded queue successfully.
[05:31:16] 
[05:31:16] + Processing work unit
[05:31:16] Work type a1 not eligible for variable processors
[05:31:16] Core required: FahCore_a1.exe
[05:31:16] Core found.
[05:31:16] - Autosending finished units... [February 13 05:31:16 UTC]
[05:31:16] Trying to send all finished work units
[05:31:16] + No unsent completed units remaining.
[05:31:16] - Autosend completed
[05:31:16] Working on queue slot 04 [February 13 05:31:16 UTC]
[05:31:16] + Working ...
[05:31:16] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 04 -checkpoint 15 -verbose -lifeline 6966 -version 624'

[05:31:16] 
[05:31:16] *------------------------------*
[05:31:16] Folding@Home Gromacs SMP Core
[05:31:16] Version 1.74 (November 27, 2006)
[05:31:16] 
[05:31:16] Preparing to commence simulation
[05:31:16] - Ensuring status. Please wait.
[05:31:17] 
[05:31:17] Project: 2665 (Run 2, Clone 159, Gen 87)
[05:31:17] 
[05:31:17] Assembly optimizations on if available.
[05:31:17] Entering M.D.
[05:31:34] - Expanded 4817194 -> 24810145 (decompressed 515.0 percent)
[05:31:34] 
[05:31:34] Project: 2665 (Run 2, Clone 159, Gen 87)
[05:31:34] 
[05:31:34] Entering M.D.
[05:31:42] cal files
[05:31:42] Completed 47500 out of 250000 steps  (19 peCompleted 47500 out of 250000 steps  (19 percent)
[05:31:42] Extra SSE boost OK.
[05:42:24] me Core Shutdown: INTERRUPTED
      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2004, The GROMACS development team,
            check out http://www.gromacs.org for more information.

        This inclusion of Gromacs code in the Folding@Home Core is under
        a special license (see http://folding.stanford.edu/gromacs.html)
         specially granted to Stanford by the copyright holders. If you
          are interested in using Gromacs, visit www.gromacs.org where
                you can download a free version of Gromacs under
         the terms of the GNU General Public License (GPL) as published
       by the Free Software Foundation; either version 2 of the License,
                     or (at your option) any later version.

[05:42:28] CoreStatus = 66 (102)
[05:42:28] + Shutdown requested by user. Exiting.***** Got a SIGTERM signal (15)
[05:42:28] Killing all core threads

Folding@Home Client Shutdown.

Re: 2665 (Run 2, Clone 159, Gen 87) Stuck @ 19%

Post by parkut »

After getting stuck with no further progress six (6) times, and receiving the CoreStatus = 66 (102) error, I deleted queue.dat and the work folder contents. Restarted, and received the same WU. If it fails at the same point again, I won't wait to delete it. This may be a bad WU.
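
For reference, the cleanup described above can be sketched in Python. This is only a sketch under assumptions: the client is already stopped, and `queue.dat` and the `work/` folder sit directly in the launch directory (the log shows `-dir work/` and a launch directory of `/root/fah6`). The function name `discard_work_unit` and the placeholder file name in the demo are hypothetical. Running it throws away all progress on the current WU, so it only makes sense once the WU is believed to be bad.

```python
import shutil
import tempfile
from pathlib import Path


def discard_work_unit(client_dir):
    """Delete queue.dat and empty work/ under client_dir; return removed paths."""
    client_dir = Path(client_dir)
    removed = []

    queue = client_dir / "queue.dat"
    if queue.is_file():
        queue.unlink()
        removed.append(queue)

    work = client_dir / "work"
    if work.is_dir():
        # Empty the folder but keep the folder itself.
        for entry in work.iterdir():
            if entry.is_dir():
                shutil.rmtree(entry)
            else:
                entry.unlink()
            removed.append(entry)
    return removed


# Demo against a throwaway directory, not a real client install.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp)
    (demo / "work").mkdir()
    (demo / "queue.dat").write_text("placeholder")
    (demo / "work" / "placeholder_04.dat").write_text("placeholder")
    print(sorted(p.name for p in discard_work_unit(demo)))
```

As this thread shows, the server may hand out the same WU again after the cleanup.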
toTOW
Site Moderator
Posts: 6433
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: 2665 (Run 2, Clone 159, Gen 87) Stuck @ 19%

Post by toTOW »

There are 5 reports for partial credit in the DB ... no one has been able to complete it yet.

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.

Re: 2665 (Run 2, Clone 159, Gen 87) Stuck @ 19%

Post by parkut »

Got it again!

This is a bad WU. It runs normally to 19%, then gets stuck
and won't continue despite repeated attempts, exiting with
CoreStatus = 66 (102). I nuked it and processed two 2669s
without issue, then was assigned this WU again, with
the same results.
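
To compare where different attempts stop, a small Python sketch like the one below can pull the last reported progress and the core status out of a log. It assumes the `Completed N out of M steps  (P percent)` and `CoreStatus = 66 (102)` line formats seen in this thread; the function name `summarize` and the dictionary keys are made up for illustration. The two CoreStatus numbers look like the same value written in hex and in decimal (0x66 is 102), so the sketch reads the first as hex.

```python
import re

# Progress lines look like: "Completed 47500 out of 250000 steps  (19 percent)"
PROGRESS = re.compile(r"Completed (\d+) out of (\d+) steps\s+\((\d+) percent\)")
# Status lines look like: "CoreStatus = 66 (102)"
CORE_STATUS = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")


def summarize(log_text):
    """Return the last reported progress and core status found in log_text."""
    summary = {"steps": None, "total": None, "percent": None,
               "status_hex": None, "status_dec": None}

    progress = PROGRESS.findall(log_text)
    if progress:
        steps, total, percent = (int(v) for v in progress[-1])
        summary.update(steps=steps, total=total, percent=percent)

    status = CORE_STATUS.findall(log_text)
    if status:
        hex_text, dec_text = status[-1]
        # First number is read as hexadecimal, second as decimal.
        summary.update(status_hex=int(hex_text, 16), status_dec=int(dec_text))
    return summary


sample = """\
[00:18:32] Completed 45000 out of 250000 steps  (18 percent)
[00:37:45] Completed 47500 out of 250000 steps  (19 percent)
[00:47:23] Folding@home Core Shutdown: INTERRUPTED
[00:47:27] CoreStatus = 66 (102)
"""
print(summarize(sample))
```

Because the pattern is searched anywhere in the text, it still finds the progress value on lines where output from several core processes has run together.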

model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
cpu MHz : 2400.073
cache size : 4096 KB
Memory: 1.96 GB physical, 1023.99 MB virtual
...
Client Version 6.24beta
Core: FahCore_a1.exe
Core Version 1.74 (November 27, 2006)
Current Work Unit
-----------------
Name: p2665_IBX in water
Tag: P2665R2C159G87
Download time: February 17 18:31:39
Due time: February 23 18:31:39
Progress: 19% [|_________]
...
Project: 2665 (Run 2, Clone 159, Gen 87)

Code:

[18:31:39] + Processing work unit
[18:31:39] Work type a1 not eligible for variable processors
[18:31:39] Core required: FahCore_a1.exe
[18:31:39] Core found.
[18:31:39] Working on queue slot 04 [February 17 18:31:39 UTC]
[18:31:39] + Working ...
[18:31:39] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 04 -checkpoint 15 -verbose -lifeline 22614 -version 624'

[18:31:39] 
[18:31:39] *------------------------------*
[18:31:39] Folding@Home Gromacs SMP Core
[18:31:39] Version 1.74 (November 27, 2006)
[18:31:39] 
[18:31:39] Preparing to commence simulation
[18:31:39] - Ensuring status. Please wait.
[18:31:39] Created dyn
[18:31:39] - Files status OK
[18:31:40] - Expanded 4817194 -> 24810145 (decompressed 515.0 percent)
[18:31:40] - Starting from initial work packet
[18:31:40] 
[18:31:40] Project: 2665 (Run 2, Clone 159, Gen 87)
[18:31:40] 
[18:31:40] Assembly optimizations on if available.
[18:31:40] Entering M.D.
[18:31:57] - Starting from initial work packet
[18:31:57] 
[18:31:57] Project: 2665 (Run 2, Clone 159, Gen 87)
[18:31:57] 
[18:31:57] Entering M.D.
[18:32:05] Protein: HGG with glycosylations
[18:32:05] Writing local files
[18:32:06] Extra SSE boost OK.
[18:51:21] Completed 2500 out of 250000 steps  (1 percent)
[19:10:31] Completed 5000 out of 250000 steps  (2 percent)
[19:29:43] Completed 7500 out of 250000 steps  (3 percent)
[19:48:58] Completed 10000 out of 250000 steps  (4 percent)
[20:08:10] Completed 12500 out of 250000 steps  (5 percent)
[20:27:22] Completed 15000 out of 250000 steps  (6 percent)
[20:46:34] Completed 17500 out of 250000 steps  (7 percent)
[21:05:48] Completed 20000 out of 250000 steps  (8 percent)
[21:25:03] Completed 22500 out of 250000 steps  (9 percent)
[21:44:18] Completed 25000 out of 250000 steps  (10 percent)
[22:03:33] Completed 27500 out of 250000 steps  (11 percent)
[22:22:52] Completed 30000 out of 250000 steps  (12 percent)
[22:42:10] Completed 32500 out of 250000 steps  (13 percent)
[23:01:30] Completed 35000 out of 250000 steps  (14 percent)
[23:20:47] Completed 37500 out of 250000 steps  (15 percent)
[23:40:04] Completed 40000 out of 250000 steps  (16 percent)
[23:59:18] Completed 42500 out of 250000 steps  (17 percent)
[00:18:32] Completed 45000 out of 250000 steps  (18 percent)
[00:37:45] Completed 47500 out of 250000 steps  (19 percent)
[00:47:23] 
[00:47:23] Folding@home Core Shutdown: INTERRUPTED
      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2004, The GROMACS development team,
            check out http://www.gromacs.org for more information.

        This inclusion of Gromacs code in the Folding@Home Core is under
        a special license (see http://folding.stanford.edu/gromacs.html)
         specially granted to Stanford by the copyright holders. If you
          are interested in using Gromacs, visit www.gromacs.org where
                you can download a free version of Gromacs under
         the terms of the GNU General Public License (GPL) as published
       by the Free Software Foundation; either version 2 of the License,
                     or (at your option) any later version.

[00:47:27] CoreStatus = 66 (102)
[00:47:27] + Shutdown requested by user. Exiting.***** Got a SIGTERM signal (15)
[00:47:27] Killing all core threads

Folding@Home Client Shutdown.

Re: 2665 (Run 2, Clone 159, Gen 87) Stuck @ 19%

Post by toTOW »

I've marked it as bad.

Re: 2665 (Run 2, Clone 159, Gen 87) Stuck @ 19%

Post by parkut »

I was assigned it again, and again it got stuck at 19%. Repeated
attempts to get beyond that point always ended in the same failure.
I've nuked it, again.

Code:

[07:40:56] + Number of Units Completed: 367

[07:40:59] - Warning: Could not delete all work unit files (1): Core file absent
[07:40:59] Trying to send all finished work units
[07:40:59] + No unsent completed units remaining.
[07:40:59] - Preparing to get new work unit...
[07:40:59] + Attempting to get work packet
[07:40:59] - Will indicate memory of 2011 MB
[07:40:59] - Connecting to assignment server
[07:40:59] Connecting to http://assign.stanford.edu:8080/
[07:40:59] Posted data.
[07:40:59] Initial: 40AB; - Successful: assigned to (171.64.65.64).
[07:40:59] + News From Folding@Home: Welcome to Folding@Home
[07:40:59] Loaded queue successfully.
[07:40:59] Connecting to http://171.64.65.64:8080/
[07:41:05] Posted data.
[07:41:05] Initial: 0000; - Receiving payload (expected size: 4817706)
[07:41:18] - Downloaded at ~361 kB/s
[07:41:18] - Averaged speed for that direction ~377 kB/s
[07:41:18] + Received work.
[07:41:18] Trying to send all finished work units
[07:41:18] + No unsent completed units remaining.
[07:41:18] + Closed connections
[07:41:18] 
[07:41:18] + Processing work unit
[07:41:18] Work type a1 not eligible for variable processors
[07:41:18] Core required: FahCore_a1.exe
[07:41:18] Core found.
[07:41:18] Working on queue slot 02 [February 19 07:41:18 UTC]
[07:41:18] + Working ...
[07:41:18] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 02 -checkpoint 15 -verbose -lifeline 16285 -version 624'

[07:41:18] 
[07:41:18] *------------------------------*
[07:41:18] Folding@Home Gromacs SMP Core
[07:41:18] Version 1.74 (November 27, 2006)
[07:41:18] 
[07:41:18] Preparing to commence simulation
[07:41:18] - Ensuring status. Please wait.
[07:41:19] - Starting from initial work packet
[07:41:19] 
[07:41:19] Project: 2665 (Run 2, Clone 159, Gen 87)
[07:41:19] 
[07:41:19] Assembly optimizations on if available.
[07:41:19] Entering M.D.
[07:41:36]  percent)
[07:41:36] - Starting from initial work packet
[07:41:36] 
[07:41:36] Project: 2665 (Run 2, Clone 159, Gen 87)
[07:41:36] 
[07:41:36] Entering M.D.
[07:41:44] Protein: HGG with glycosylations
[07:41:44] Writing local files
[07:41:44] Extra SSE boost OK.
[07:41:45] cal files
[07:41:46] Completed 0 out of 250000 steps  (0 percent)
[07:56:46] Timered checkpoint triggered.
[08:01:02] Writing local files
[08:01:02] Completed 2500 out of 250000 steps  (1 percent)
Russ_64
Posts: 47
Joined: Wed Dec 05, 2007 4:31 pm
Hardware configuration: Dual Xeon E5645 (12C/24T) / 24Gb DDR3 - VMware ESXi 6.7.0
FAH v7.5.1
Location: London, UK

Re: 2665 (Run 2, Clone 159, Gen 87) Stuck @ 19%

Post by Russ_64 »

@toTOW, can you please check my results for 2665 WUs? I have completed quite a few in recent weeks, but I can't see that 1920 points have been awarded for any single WU.

Is there a way to see the details of the WUs credited? The XOC and other stats sites only show aggregate results.
Thanks.

Re: 2665 (Run 2, Clone 159, Gen 87) Stuck @ 19%

Post by toTOW »

Without the WU references (project, run, clone, gen), I can't do anything.
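
The four values can be read straight out of a client log or status dump. As a rough Python sketch, assuming the two formats that appear earlier in this thread (`Project: 2665 (Run 2, Clone 159, Gen 87)` and the tag `P2665R2C159G87`); the names `WorkUnit` and `parse_work_unit` are invented for the example:

```python
import re
from typing import NamedTuple, Optional


class WorkUnit(NamedTuple):
    project: int
    run: int
    clone: int
    gen: int


# Long form, as in "Project: 2665 (Run 2, Clone 159, Gen 87)".
LONG = re.compile(r"(\d+)\s*\(Run (\d+), Clone (\d+), Gen (\d+)\)")
# Compact tag, as in "Tag: P2665R2C159G87".
TAG = re.compile(r"P(\d+)R(\d+)C(\d+)G(\d+)")


def parse_work_unit(text: str) -> Optional[WorkUnit]:
    """Return the first WU reference found in text, or None if there is none."""
    match = LONG.search(text) or TAG.search(text)
    if match is None:
        return None
    return WorkUnit(*(int(v) for v in match.groups()))


print(parse_work_unit("[04:31:18] Project: 2665 (Run 2, Clone 159, Gen 87)"))
print(parse_work_unit("Tag: P2665R2C159G87"))
```

Both calls give the same project, run, clone and gen, which is the reference to quote when asking about a specific WU.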