Not Getting Points on Incomplete WUs

Posted: Sun Aug 09, 2009 6:43 pm
by geokilla
I seem to be missing some points on incomplete WUs. The latest one came from:

 # Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.24R3

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Folding@Home\CPU
Executable: C:\Folding@Home\CPU\Folding@home-Win32-x86.exe
Arguments: -smp -verbosity 9 

[15:52:18] - Ask before connecting: No
[15:52:18] - User name: geokilla (Team 38296)
[15:52:18] - User ID: 2D7F8F3D2B43B143
[15:52:18] - Machine ID: 1
[15:52:18] 
[15:52:18] Loaded queue successfully.
[15:52:18] 
[15:52:18] + Processing work unit
[15:52:18] Work type a1 not eligible for variable processors
[15:52:18] Core required: FahCore_a1.exe
[15:52:18] Core found.
[15:52:18] Using generic mpiexec calls
[15:52:18] - Autosending finished units... [August 8 15:52:18 UTC]
[15:52:18] Trying to send all finished work units
[15:52:18] + No unsent completed units remaining.
[15:52:18] - Autosend completed
[15:52:18] Working on queue slot 06 [August 8 15:52:18 UTC]
[15:52:18] + Working ...
[15:52:18] - Calling 'mpiexec -np 4 -channel auto -host 127.0.0.1 FahCore_a1.exe -dir work/ -suffix 06 -checkpoint 15 -verbose -lifeline 3636 -version 624'

[15:52:27] 
[15:52:27] *------------------------------*
[15:52:27] Folding@Home Gromacs SMP Core
[15:52:27] Version 1.74 (March 10, 2007)
[15:52:27] 
[15:52:27] Preparing to commence simulation
[15:52:27] - Ensuring status. Please wait.
[15:52:44] - Looking at optimizations...
[15:52:44] - Working with standard loops on this execution.
[15:52:44] - Previous termination of core was improper.
[15:52:44] - Going to use standard loops.
[15:52:44] - Files status OK
[15:52:47] - Expanded 2423332 -> 12912821 (decompressed 532.8 percent)
[15:52:54] 
[15:52:54] Project: 2653 (Run 2, Clone 64, Gen 148)
[15:52:54] 
[15:52:55] Entering M.D.
[15:53:33] Calling FAH init
[15:53:34] in POPC
[15:53:34] ology
[15:53:34] (Starting from checkpoint)
[15:53:34] Read checkpoint
[15:53:36] Protein: Protein in POPC
[15:53:36] Writing local files
[15:53:38] Completed 180000 out of 500000 steps  (36 percent)
[15:53:39] Extra SSE boost OK.
[16:08:41] Timered checkpoint triggered.
[16:16:12] Writing local files
[16:16:12] Completed 185000 out of 500000 steps  (37 percent)
[16:31:13] Timered checkpoint triggered.
[16:37:56] Writing local files
XXXXXXXXXXXXXXXXXXXXXXXXXXXXX (Folding fine)
[04:17:23] Timered checkpoint triggered.
[04:19:24] Writing local files
[04:19:24] Completed 355000 out of 500000 steps  (71 percent)
[04:34:24] Timered checkpoint triggered.
[04:38:14] Writing local files
[04:38:15] Completed 360000 out of 500000 steps  (72 percent)
[04:53:14] Timered checkpoint triggered.
[04:57:29] Writing local files
[04:57:29] Completed 365000 out of 500000 steps  (73 percent)
[05:12:32] Timered checkpoint triggered.
[05:17:39] CoreStatus = 40010004 (1073807364)
[05:17:39] Client-core communications error: ERROR 0x40010004
[05:17:39] Deleting current work unit & continuing...
[05:17:44] - Warning: Could not delete all work unit files (6): Core returned invalid code
[05:17:44] Trying to send all finished work units
[05:17:44] + No unsent completed units remaining.
[05:17:44] - Preparing to get new work unit...
[05:17:44] Cleaning up work directory
[05:17:45] + Attempting to get work packet
[05:17:45] - Will indicate memory of 2046 MB
[05:17:45] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 7, Stepping: 6
[05:17:45] - Connecting to assignment server
[05:17:45] Connecting to http://assign.stanford.edu:8080/


--- Opening Log file [August 9 05:19:57 UTC] 


# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.24R3

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Folding@Home\CPU
Executable: C:\Folding@Home\CPU\Folding@home-Win32-x86.exe
Arguments: -smp -verbosity 9 

[05:19:57] - Ask before connecting: No
[05:19:57] - User name: geokilla (Team 38296)
[05:19:57] - User ID: 2D7F8F3D2B43B143
[05:19:57] - Machine ID: 1
[05:19:57]
[05:19:58] Loaded queue successfully.
[05:19:58] - Preparing to get new work unit...
[05:19:58] Cleaning up work directory
[05:19:58] + Attempting to get work packet
[05:19:58] - Will indicate memory of 2046 MB
[05:19:58] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 7, Stepping: 6
[05:19:58] - Connecting to assignment server
[05:19:58] Connecting to http://assign.stanford.edu:8080/
[05:19:58] - Autosending finished units... [August 9 05:19:58 UTC]
[05:19:58] Trying to send all finished work units
[05:19:58] + No unsent completed units remaining.
[05:19:58] - Autosend completed
[05:19:59] Posted data.
[05:19:59] Initial: 40AB; - Successful: assigned to (171.64.65.64).
[05:19:59] + News From Folding@Home: Welcome to Folding@Home
[05:19:59] Loaded queue successfully.
[05:19:59] Connecting to http://171.64.65.64:8080/
[05:20:07] Posted data.
[05:20:07] Initial: 0000; - Receiving payload (expected size: 2423844)
[05:20:09] - Downloaded at ~1183 kB/s
[05:20:09] - Averaged speed for that direction ~841 kB/s
[05:20:09] + Received work.
[05:20:09] + Closed connections
[05:20:09]
[05:20:09] + Processing work unit
[05:20:09] Work type a1 not eligible for variable processors
[05:20:09] Core required: FahCore_a1.exe
[05:20:09] Core found.
[05:20:09] Using generic mpiexec calls
[05:20:09] Working on queue slot 07 [August 9 05:20:09 UTC]
[05:20:09] + Working ...
[05:20:09] - Calling 'mpiexec -np 4 -channel auto -host 127.0.0.1 FahCore_a1.exe -dir work/ -suffix 07 -checkpoint 15 -verbose -lifeline 3500 -version 624'

[05:20:13] 
[05:20:13] *------------------------------*
[05:20:13] Folding@Home Gromacs SMP Core
[05:20:13] Version 1.74 (March 10, 2007)
[05:20:13] 
[05:20:13] Preparing to commence simulation
[05:20:13] - Ensuring status. Please wait.
[05:20:30] - Looking at optimizations...
[05:20:30] - Working with standard loops on this execution.
[05:20:30] - Previous termination of core was improper.
[05:20:30] - Files status OK
[05:20:31] ndard loops.
[05:20:31] - Files status OK
[05:20:34] - Expanded 2423332 -> 12912821 (decompressed 532.8 percent)
[05:20:34] - Starting from initial work packet
[05:20:34] 
[05:20:34] Project: 2653 (Run 2, Clone 64, Gen 148)
[05:20:34] 
[05:20:35] Entering M.D.
[05:20:41] Rejecting checkpoint
[05:20:42] Protein: Protein in POPC
[05:20:43] Writing local files
[05:20:45] Extra SSE boost OK.
[05:20:45] Writing local files
[05:20:45] Completed 0 out of 500000 steps  (0 percent)
[05:25:18] Killing all core threads
[05:25:18] Killing 4 cores
[05:25:18] Killing core 0
[05:25:18] Killing core 1
[05:25:18] Killing core 2
[05:25:18] Killing core 3

Folding@Home Client Shutdown at user request.
[05:25:18] ***** Got a SIGTERM signal (2)
[05:25:18] Killing all core threads
[05:25:18] Killing 4 cores
[05:25:18] Killing core 0
[05:25:18] Killing core 1
[05:25:18] Killing core 2
[05:25:18] Killing core 3

Folding@Home Client Shutdown.
I caused this WU to be incomplete because I was doing something with Windows 7. As for the other incomplete WUs that I didn't get points for, I don't have the FahLog for them anymore.

Re: Not Getting Points on Incomplete WUs

Posted: Sun Aug 09, 2009 8:00 pm
by P5-133XL
To receive any partial points for a WU, something has to actually be returned to the servers. If you look at the log, you will see it crashed, and then was deleted before being sent. I'm not defending that particular behavior: I'm just telling you what happened from the log. To receive any credit, the words "Thank you for your contribution" must be present after the WU finishes and before a new WU is given. Those words are confirmation that the data was received by the collection servers.

You did not include enough of the log to find out your new WU, but I would suspect that it was the same WU that just crashed because the server still thinks that that WU is outstanding.
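One way to check the "same WU" suspicion is to compare the `Project: ... (Run, Clone, Gen)` lines in the log. The sketch below is only an illustration: the regular expression is modelled on the `Project:` line format seen in the log posted in this thread, and the function names `work_units` and `reassigned_same_wu` are hypothetical helpers, not part of any Folding@Home tool.

```python
import re

# Assumed line format, taken from the log in this thread, e.g.
# "[15:52:54] Project: 2653 (Run 2, Clone 64, Gen 148)"
PROJECT_RE = re.compile(
    r"Project:\s*(\d+)\s*\(Run\s*(\d+),\s*Clone\s*(\d+),\s*Gen\s*(\d+)\)"
)


def work_units(log_text):
    """Return (project, run, clone, gen) tuples in the order they appear."""
    return [tuple(int(g) for g in m.groups())
            for m in PROJECT_RE.finditer(log_text)]


def reassigned_same_wu(log_text):
    """True if two consecutive Project lines identify the same work unit."""
    units = work_units(log_text)
    return any(a == b for a, b in zip(units, units[1:]))
```

If two consecutive entries are identical, the client was most likely handed the same work unit again after the crash.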

Re: Not Getting Points on Incomplete WUs

Posted: Sun Aug 09, 2009 10:04 pm
by geokilla
P5-133XL wrote:
To receive any partial points for a WU, something has to actually be returned to the servers. If you look at the log, you will see it crashed, and then was deleted before being sent. I'm not defending that particular behavior: I'm just telling you what happened from the log. To receive any credit, the words "Thank you for your contribution" must be present after the WU finishes and before a new WU is given. Those words are confirmation that the data was received by the collection servers.

You did not include enough of the log to find out your new WU, but I would suspect that it was the same WU that just crashed because the server still thinks that that WU is outstanding.

That could be the case. I'm not sure, but it seems likely if the log has to say "Thank you for your contribution."

I've updated the log in my first post. It should now have everything you need to determine whether I should get partial points or not.

Re: Not Getting Points on Incomplete WUs

Posted: Mon Aug 10, 2009 5:43 am
by bruce
Partial credit is granted ONLY when the core is able to detect that the simulation caused the problem and to send partial results to the server. When the problem is caused by you (experimenting with Windows 7 or whatever) or by faulty hardware (including unstable overclocking), or MIGHT have been caused by something that you could have avoided, no credit is granted. Many Unknown Errors give no credit, at least until the FahCore can be reprogrammed to catch the error.

The "Thank you ...." message indicates that the core was able to upload a result, whether it was partial or complete, and without that message there will be no credit.
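The rule described above (no "Thank you ..." message, no credit) can be checked against a saved log with a few lines of Python. This is a rough sketch under stated assumptions: it treats each "+ Processing work unit" line, as seen in the log in this thread, as the start of a new WU, and it looks for the phrase "Thank you for your contribution" quoted earlier in the thread. The function name `upload_confirmed_per_wu` is hypothetical.

```python
# Marker strings are assumptions based on the log and posts in this thread.
THANKS = "Thank you for your contribution"
WU_START = "+ Processing work unit"


def upload_confirmed_per_wu(lines):
    """For each WU that was followed by another WU, report whether an
    upload confirmation appeared before the next WU started."""
    confirmed = []
    for line in lines:
        if WU_START in line:
            confirmed.append(False)   # a new WU begins, nothing uploaded yet
        elif THANKS in line and confirmed:
            confirmed[-1] = True      # result reached the collection server
    # The last WU in the log may still be in progress, so leave it out.
    return confirmed[:-1]
```

A `False` entry marks a WU that was dropped without anything reaching the server, which is the case where no credit, partial or full, should be expected.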