Project: 3043 (Run 3, Clone 55, Gen 54) fails at 82% with CoreStatus = 0 (0)
Posted: Mon Mar 23, 2009 12:23 pm
This WU has failed twice at 82% and has restarted on its third
run. I've killed the bad WU, and my machine (q7) has moved on to
something else.
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
cpu MHz : 1596.000
cache size : 4096 KB
Memory: 1.94 GB physical, 1.94 GB virtual
...
Client Version 6.24beta
Core: FahCore_a1.exe
Core Version 1.74 (November 27, 2006)
Current Work Unit
-----------------
Name: p3043_p3029_SMP-emsv-03
Tag: P3043R3C55G54
[10:34:09] Project: 3043 (Run 3, Clone 55, Gen 54)
[10:34:09]
[10:34:09] Assembly optimizations on if available.
[10:34:09] Entering M.D.
[10:34:26] ial work pa- Starting from initial work packet
[10:34:26]
[10:34:26] Project: 3Entering M.D.
[10:34:26] one 55, Gen 54)
[10:34:26]
[10:34:26] Entering M.D.
[10:34:32] cal files
[10:34:32] Completed 0 out of 10000000 steps (0 percent)
[10:34:32] SSE boost OK.
[10:43:26] iles
[10:43:26] Completed 100000 out of 10000000 steps (1 percent)
[10:52:22] Completed 200000 out of 10000000 steps (2 percent)
... snip ...
[22:35:33] Completed 8100000 out of 10000000 steps (81 percent)
[22:44:26] Completed 8200000 out of 10000000 steps (82 percent)
[22:50:11] Warning: long 1-4 interactions
[22:50:15] CoreStatus = 0 (0)
[22:50:15] Sending work to server
[22:50:15] Project: 3043 (Run 3, Clone 55, Gen 54)
[22:50:15] - Error: Could not get length of results file work/wuresults_06.dat
[22:50:15] - Error: Could not read unit 06 file. Removing from queue.
[22:50:15] Trying to send all finished work units
[22:50:15] + No unsent completed units remaining.
[22:50:15] - Preparing to get new work unit...
[22:50:15] + Attempting to get work packet
[22:50:15] - Will indicate memory of 1985 MB
[22:50:15] - Connecting to assignment server
[22:50:15] Connecting to http://assign.stanford.edu:8080/
[22:50:16] Posted data.
[22:50:16] Initial: 40AB; - Successful: assigned to (171.64.65.63).
[22:50:16] + News From Folding@Home: Welcome to Folding@Home
[22:50:16] Loaded queue successfully.
[22:50:16] Connecting to http://171.64.65.63:8080/
[22:50:17] Posted data.
[22:50:17] Initial: 0000; - Receiving payload (expected size: 283317)
[22:50:17] Conversation time very short, giving reduced weight in bandwidth avg
[22:50:17] - Downloaded at ~553 kB/s
[22:50:17] - Averaged speed for that direction ~380 kB/s
[22:50:17] + Received work.
[22:50:17] Trying to send all finished work units
[22:50:17] + No unsent completed units remaining.
[22:50:17] + Closed connections
[22:50:22]
[22:50:22] + Processing work unit
[22:50:22] Work type a1 not eligible for variable processors
[22:50:22] Core required: FahCore_a1.exe
[22:50:22] Core found.
[22:50:22] Working on queue slot 07 [March 22 22:50:22 UTC]
[22:50:22] + Working ...
[22:50:22] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 07 -checkpoint 15 -verbose -lifeline 15441 -version 624'
[22:50:22]
[22:50:22] *------------------------------*
[22:50:22] Folding@Home Gromacs SMP Core
[22:50:22] Version 1.74 (November 27, 2006)
[22:50:22]
[22:50:22] Preparing to commence simulation
[22:50:22] - Ensuring status. Please wait.
[22:50:39] - Looking at optimizations...
[22:50:39] - Working with standard loops on this execution.
[22:50:39] - Previous termination of core was improper.
[22:50:39] - Going to use standard loops.
[22:50:39] - Files status OK
[22:50:39] - Expanded 282805 -> 1508541 (decompressed 533.4 percent)
[22:50:40] - Data doesn't match checksum.
[22:50:40] - Starting from initial work packet
[22:50:40]
[22:50:40] Project: 3043 (Run 3, Clone 55, Gen 54)
[22:50:40]
[22:50:40] Entering M.D.
[22:50:47] Protein: 9684 p3029_SProtein: 9684 p3029_SMP-emsv-03Extra SSE boost OK.
[22:50:47]
[22:50:47] Extra SSE boost OK.
[22:50:47] Completed 0 out of 10000000 steps (0 percent)
[22:59:43] Completed 100000 out of 10000000 steps (1 percent)
[23:08:42] Completed 200000 out of 10000000 steps (2 percent)
... snip ...
[10:51:54] Completed 8100000 out of 10000000 steps (81 percent)
[11:00:47] Completed 8200000 out of 10000000 steps (82 percent)
[11:06:34] Warning: long 1-4 interactions
[11:06:38] CoreStatus = 0 (0)
[11:06:38] Sending work to server
[11:06:38] Project: 3043 (Run 3, Clone 55, Gen 54)
[11:06:38] - Error: Could not get length of results file work/wuresults_07.dat
[11:06:38] - Error: Could not read unit 07 file. Removing from queue.
[11:06:38] Trying to send all finished work units
[11:06:38] + No unsent completed units remaining.
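
For anyone who wants to check their own logs for the same pattern (the same Run/Clone/Gen dying at the same percentage), here is a minimal sketch of a log scanner. It is not part of the client; the function name `summarize_failures` and the regular expressions are my own assumptions, based only on the line formats visible in the log above.

```python
import re

# Patterns are assumptions inferred from the log lines in this post.
TAG = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
PROGRESS = re.compile(r"Completed (\d+) out of (\d+) steps\s+\((\d+) percent\)")
STATUS = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")


def summarize_failures(lines):
    """Return one (tag, last_percent, core_status) tuple per CoreStatus line.

    tag is formatted like the client's own tag, e.g. 'P3043R3C55G54'.
    last_percent is the last progress value seen before the core exited,
    or None if no progress line was seen for that run.
    """
    results = []
    tag = None
    last_percent = None
    for line in lines:
        m = TAG.search(line)
        if m:
            # Garbled "Project:" lines do not match and are skipped.
            tag = "P{}R{}C{}G{}".format(*m.groups())
            continue
        m = PROGRESS.search(line)
        if m:
            last_percent = int(m.group(3))
            continue
        m = STATUS.search(line)
        if m:
            results.append((tag, last_percent, m.group(1)))
            last_percent = None  # reset for the next work unit
    return results
```

Feeding it the lines of a saved log (for example `summarize_failures(open(path))`, where `path` is wherever your client writes its log) gives one entry per core exit. Two entries with the same tag and the same percentage, as in this report, suggest the WU itself is bad and not the machine.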