Project: 6013 (Run 0, Clone 142, Gen 121)

artoar_11
Posts: 652
Joined: Sun Nov 22, 2009 8:42 pm
Hardware configuration: AMD R7 3700X @ 4.0 GHz; ASUS ROG STRIX X470-F GAMING; DDR4 2x8GB @ 3.0 GHz; GByte RTX 3060 Ti @ 1890 MHz; Fortron-550W 80+ bronze; Win10 Pro/64
Location: Bulgaria/Team #224497/artoar11_ALL_....

Post by artoar_11 »

I got this WU: 6013 (Run 0, Clone 142, Gen 121). The TPF for this WU on my CPU (Q9400 @ 3400MHz) is 05:30 min.
This is the log after 11+ minutes.

Code:

[22:44:14] Thank you for your contribution to Folding@Home.
[22:44:14] + Number of Units Completed: 161

[22:44:19] Trying to send all finished work units
[22:44:19] + No unsent completed units remaining.
[22:44:19] - Preparing to get new work unit...
[22:44:19] Cleaning up work directory
[22:44:19] + Attempting to get work packet
[22:44:19] Passkey found
[22:44:19] - Will indicate memory of 2048 MB
[22:44:19] - Connecting to assignment server
[22:44:19] Connecting to http://assign.stanford.edu:8080/
[22:44:23] Posted data.
[22:44:23] Initial: ED82; - Successful: assigned to (130.237.232.140).
[22:44:23] + News From Folding@Home: Welcome to Folding@Home
[22:44:23] Loaded queue successfully.
[22:44:23] Connecting to http://130.237.232.140:8080/
[22:44:25] Posted data.
[22:44:25] Initial: 0000; - Receiving payload (expected size: 978141)
[22:44:29] - Downloaded at ~238 kB/s
[22:44:29] - Averaged speed for that direction ~319 kB/s
[22:44:29] + Received work.
[22:44:29] Trying to send all finished work units
[22:44:29] + No unsent completed units remaining.
[22:44:29] + Closed connections
[22:44:29] 
[22:44:29] + Processing work unit
[22:44:29] Core required: FahCore_a3.exe
[22:44:29] Core found.
[22:44:29] Working on queue slot 09 [June 19 22:44:29 UTC]
[22:44:29] + Working ...
[22:44:29] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 09 -np 4 -checkpoint 6 -forceasm -verbose -lifeline 3384 -version 629'

[22:44:30] 
[22:44:30] *------------------------------*
[22:44:30] Folding@Home Gromacs SMP Core
[22:44:30] Version 2.19 (Mar 12, 2010)
[22:44:30] 
[22:44:30] Preparing to commence simulation
[22:44:30] - Assembly optimizations manually forced on.
[22:44:30] - Not checking prior termination.
[22:44:31] - Expanded 977629 -> 10427873 (decompressed 1066.6 percent)
[22:44:31] Called DecompressByteArray: compressed_data_size=977629 data_size=10427873, decompressed_data_size=10427873 diff=0
[22:44:31] - Digital signature verified
[22:44:31] 
[22:44:31] Project: 6013 (Run 0, Clone 142, Gen 121)
[22:44:31] 
[22:44:31] Assembly optimizations on if available.
[22:44:31] Entering M.D.
[22:44:51] Completed 0 out of 250000 steps  (0%)
[22:56:17] Killing all core threads
[22:56:17] Killing 3 cores
[22:56:17] Killing core 0
[22:56:17] Killing core 1
[22:56:17] Killing core 2

Folding@Home Client Shutdown at user request.
[22:56:17] ***** Got a SIGTERM signal (2)
[22:56:17] Killing all core threads
[22:56:17] Killing 3 cores
[22:56:17] Killing core 0
[22:56:17] Killing core 1
[22:56:17] Killing core 2

Folding@Home Client Shutdown.
I restarted the client.

Code:


--- Opening Log file [June 19 22:56:24 UTC] 


# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.29

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: D:\1_SMP2_FAH
Executable: D:\1_SMP2_FAH\1-F@H-6.29Beta32-x86.exe
Arguments: -smp -deino -forceasm -verbosity 9 -advmethods -local 

[22:56:24] - Ask before connecting: No
[22:56:24] - User name: artoar_home (Team 32435)
[22:56:24] - User ID: 714971F37559BC17
[22:56:24] - Machine ID: 1
[22:56:24] 
[22:56:24] Loaded queue successfully.
[22:56:24] 
[22:56:24] - Autosending finished units... [June 19 22:56:24 UTC]
[22:56:24] + Processing work unit
[22:56:24] Trying to send all finished work units
[22:56:24] Core required: FahCore_a3.exe
[22:56:24] + No unsent completed units remaining.
[22:56:24] - Autosend completed
[22:56:24] Core found.
[22:56:24] Working on queue slot 09 [June 19 22:56:24 UTC]
[22:56:24] + Working ...
[22:56:24] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 09 -np 4 -checkpoint 6 -forceasm -verbose -lifeline 3704 -version 629'

[22:56:24] 
[22:56:24] *------------------------------*
[22:56:24] Folding@Home Gromacs SMP Core
[22:56:24] Version 2.19 (Mar 12, 2010)
[22:56:24] 
[22:56:24] Preparing to commence simulation
[22:56:24] - Ensuring status. Please wait.
[22:56:33] - Assembly optimizations manually forced on.
[22:56:33] - Not checking prior termination.
[22:56:34] - Expanded 977629 -> 10427873 (decompressed 1066.6 percent)
[22:56:34] Called DecompressByteArray: compressed_data_size=977629 data_size=10427873, decompressed_data_size=10427873 diff=0
[22:56:34] - Digital signature verified
[22:56:34] 
[22:56:34] Project: 6013 (Run 0, Clone 142, Gen 121)
[22:56:34] 
[22:56:34] Assembly optimizations on if available.
[22:56:34] Entering M.D.
[22:56:40] Using Gromacs checkpoints
[22:56:42] Resuming from checkpoint
[22:56:42] Verified work/wudata_09.log
[22:56:42] Verified work/wudata_09.trr
[22:56:42] Verified work/wudata_09.xtc
[22:56:42] Verified work/wudata_09.edr
[22:56:55] Completed 278 out of 250000 steps  (0%)
[23:04:19] Killing all core threads
[23:04:19] Killing 3 cores
[23:04:19] Killing core 0
[23:04:19] Killing core 1
[23:04:19] Killing core 2

Folding@Home Client Shutdown at user request.
[23:04:19] ***** Got a SIGTERM signal (2)
[23:04:19] Killing all core threads
[23:04:19] Killing 3 cores
[23:04:19] Killing core 0
[23:04:19] Killing core 1
[23:04:19] Killing core 2

Folding@Home Client Shutdown.
Completed 278 out of 250000 steps (0%). That is slightly more than 10% of one frame (2500 steps) in 11+ minutes.
I restarted again.

Code:

--- Opening Log file [June 19 23:04:26 UTC] 


# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.29

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: D:\1_SMP2_FAH
Executable: D:\1_SMP2_FAH\1-F@H-6.29Beta32-x86.exe
Arguments: -smp -deino -forceasm -verbosity 9 -advmethods -local 

[23:04:26] - Ask before connecting: No
[23:04:26] - User name: artoar_home (Team 32435)
[23:04:26] - User ID: 714971F37559BC17
[23:04:26] - Machine ID: 1
[23:04:26] 
[23:04:26] Loaded queue successfully.
[23:04:26] 
[23:04:26] - Autosending finished units... [June 19 23:04:26 UTC]
[23:04:26] + Processing work unit
[23:04:26] Trying to send all finished work units
[23:04:26] Core required: FahCore_a3.exe
[23:04:26] + No unsent completed units remaining.
[23:04:26] - Autosend completed
[23:04:26] Core found.
[23:04:26] Working on queue slot 09 [June 19 23:04:26 UTC]
[23:04:26] + Working ...
[23:04:26] - Calling '.\FahCore_a3.exe -dir work/ -nice 19 -suffix 09 -np 4 -checkpoint 6 -forceasm -verbose -lifeline 1484 -version 629'

[23:04:26] 
[23:04:26] *------------------------------*
[23:04:26] Folding@Home Gromacs SMP Core
[23:04:26] Version 2.19 (Mar 12, 2010)
[23:04:26] 
[23:04:26] Preparing to commence simulation
[23:04:26] - Ensuring status. Please wait.
[23:04:36] - Assembly optimizations manually forced on.
[23:04:36] - Not checking prior termination.
[23:04:37] - Expanded 977629 -> 10427873 (decompressed 1066.6 percent)
[23:04:37] Called DecompressByteArray: compressed_data_size=977629 data_size=10427873, decompressed_data_size=10427873 diff=0
[23:04:37] - Digital signature verified
[23:04:37] 
[23:04:37] Project: 6013 (Run 0, Clone 142, Gen 121)
[23:04:37] 
[23:04:37] Assembly optimizations on if available.
[23:04:37] Entering M.D.
[23:04:43] Using Gromacs checkpoints
[23:04:45] Resuming from checkpoint
[23:04:45] Verified work/wudata_09.log
[23:04:45] Verified work/wudata_09.trr
[23:04:45] Verified work/wudata_09.xtc
[23:04:45] Verified work/wudata_09.edr
[23:04:57] Completed 558 out of 250000 steps  (0%)
[23:12:48] Killing all core threads
[23:12:48] Killing 3 cores
[23:12:48] Killing core 0
[23:12:48] Killing core 1
[23:12:48] Killing core 2

Folding@Home Client Shutdown at user request.
[23:12:48] ***** Got a SIGTERM signal (2)
[23:12:48] Killing all core threads
[23:12:48] Killing 3 cores
[23:12:48] Killing core 0
[23:12:48] Killing core 1
[23:12:48] Killing core 2

Folding@Home Client Shutdown.
What should I do with this WU? Should I delete it? (In Task Manager, the red graph is about 2/3 of the full height.)
Sorry for my bad English.
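
The step counts and timestamps in the logs above are enough for a rough TPF estimate. Below is a minimal Python sketch of that arithmetic. The function names are made up for illustration, and it assumes the step count reported on resume (278, then 558) reflects progress up to the moment of shutdown. If the checkpoint was written earlier than that, the true TPF would be somewhat lower than this estimate.

```python
from datetime import datetime, timedelta


def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two HH:MM:SS log stamps (handles a midnight wrap)."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)
    return delta.total_seconds()


def estimate_tpf_minutes(steps_start: int, t_start: str,
                         steps_end: int, t_end: str,
                         total_steps: int = 250000) -> float:
    """Estimate time per frame (1% of the WU) in minutes from two progress points."""
    steps_per_frame = total_steps / 100          # 2500 steps for this WU
    seconds_per_step = seconds_between(t_start, t_end) / (steps_end - steps_start)
    return seconds_per_step * steps_per_frame / 60


# First run: 0 steps at 22:44:51, 278 steps by the shutdown at 22:56:17.
print(round(estimate_tpf_minutes(0, "22:44:51", 278, "22:56:17"), 1))    # 102.8
# Second run: 278 steps at 22:56:55, 558 steps by the shutdown at 23:04:19.
print(round(estimate_tpf_minutes(278, "22:56:55", 558, "23:04:19"), 1))  # 66.1
```

Against the 05:30 TPF quoted above, both estimates come out at roughly 12 to 19 times slower.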
Bob8421
Posts: 53
Joined: Tue Dec 22, 2009 5:16 pm

Re: Project: 6013 (Run 0, Clone 142, Gen 121)

Post by Bob8421 »

There are a number of work units in project 6013 that, for reasons unknown, run 10 times longer than most of the others. I've had a few of them and other users have reported the exact same thing. In my case the time per 1% step went from 7.5 minutes to 75 minutes, meaning that they could not possibly finish in time. I deleted them because they were just wasting computing time that could be used for work units that would complete properly.

Despite numerous reports, as far as I know PG is not looking into these work units to find out what is "wrong" with them and to delete them all so they don't keep getting sent out.
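
To put the 7.5 versus 75 minute figures in perspective, here is a small Python sketch that projects the total run time from the time per 1%. The helper names are hypothetical. The deadline for p6013 is not given in this thread, so `hours_until_deadline` is a value you would have to supply yourself.

```python
def projected_hours(minutes_per_percent: float, percent_done: float = 0.0) -> float:
    """Hours still needed to finish a WU at the given time per 1% step."""
    return (100.0 - percent_done) * minutes_per_percent / 60.0


def can_finish(minutes_per_percent: float, hours_until_deadline: float,
               percent_done: float = 0.0) -> bool:
    """True if the WU would complete before the deadline at the current pace."""
    return projected_hours(minutes_per_percent, percent_done) <= hours_until_deadline


print(projected_hours(7.5))   # 12.5 hours at the normal pace
print(projected_hours(75))    # 125.0 hours at the slow pace
```

At 75 minutes per 1%, a full WU needs a little over five days of continuous folding, ten times the 12.5 hours it takes at the normal pace.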
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 6013 (Run 0, Clone 142, Gen 121)

Post by bruce »

I've seen the reports that you mention. What I do NOT know is whether we're talking about all WUs from p6013 or just a few. Nobody makes a report if all they're getting is good WUs from that project.

I've reported (P6013 R0 C142 G121) as a bad WU which will suspend it from being reissued, but I'd really like to get to the bottom of the problem.

artoar: Is that computer dedicated to FAH or are other things running on it? (especially another FAH client or BOINC or something else that processes continuously) Have you configured FAH to disable affinity locking? We do know that with the default settings, the SMP client will make almost no progress if some other heavy task is running and there are ways to deal with that issue.

What hardware do you have?
shdbcamping
Posts: 81
Joined: Mon Nov 10, 2008 7:57 am
Hardware configuration: XPS 720 Q6600 9800GX2 3gig RAM
750W primary PSU 650W Aux VGA PSU

Re: Project: 6013 (Run 0, Clone 142, Gen 121)

Post by shdbcamping »

bruce wrote:I've seen the reports that you mention. What I do NOT know is whether we're talking about all WUs from p6013 or just a few. Nobody makes a report if all they're getting is good WUs from that project.

I've reported (P6013 R0 C142 G121) as a bad WU which will suspend it from being reissued, but I'd really like to get to the bottom of the problem.

artoar: Is that computer dedicated to FAH or are other things running on it? (especially another FAH client or BOINC or something else that processes continuously) Have you configured FAH to disable affinity locking? We do know that with the default settings, the SMP client will make almost no progress if some other heavy task is running and there are ways to deal with that issue.

What hardware do you have?
Based on the complaints, it is apparently not the majority; this much I can agree with. The problem comes with the 'project' in general. It is extremely problematic for the full spectrum of OSes and HW.

It would be a good idea, JIMO ...YMMV, if the whole project was "PULLED" until the actual problem was identified and fixed. Again JIMO, it is not a 'core' issue as these complaints come from Donors that are NOT having problems with any other WUs.

We have better things to do than babysit badly written WUs.
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: Project: 6013 (Run 0, Clone 142, Gen 121)

Post by PantherX »

bruce wrote:I've seen the reports that you mention. What I do NOT know is whether we're talking about all WUs from p6013 or just a few. Nobody makes a report if all they're getting is good WUs from that project.
I have made a thread here and will be including all the good and bad WUs: (http://foldingforum.org/viewtopic.php?f ... 90#p148390)
artoar_11
Posts: 652
Joined: Sun Nov 22, 2009 8:42 pm
Hardware configuration: AMD R7 3700X @ 4.0 GHz; ASUS ROG STRIX X470-F GAMING; DDR4 2x8GB @ 3.0 GHz; GByte RTX 3060 Ti @ 1890 MHz; Fortron-550W 80+ bronze; Win10 Pro/64
Location: Bulgaria/Team #224497/artoar11_ALL_....

Re: Project: 6013 (Run 0, Clone 142, Gen 121)

Post by artoar_11 »

Hi bruce.
For six months this computer has done nothing except FAH and web surfing. It does not even fold on the video card, only on the CPU (A3 WUs).
My hardware:
CPU - Q9400 @ 3400MHz (Tmax. - 66*C)
MB - Asus P5Q SE PLUS
MEM - 2x1 GB Kingston Hyper X-8500 @ 850MHz/2-4-4-4-12 / DDR2
VC - GF 7300GT
HDD - Hitachi 80GB SATA/8MB
PSU-Fortron-350W PNF
OS - Win XP & SP3
For more than two months I had no problems with WUs. Only with 6013 does the CPU usually reach about 70*C, but with this WU (Run 0, Clone 142, Gen 121) the temperature was only 61-62*C. I also watched the red graph in Task Manager. With this WU it was very high, about 2/3 of the maximum. Normal is 2-3 replicates. I have noticed that the lower the red graph, the faster the task is calculated.
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 6013 (Run 0, Clone 142, Gen 121)

Post by bruce »

I believe you've just identified the problem (or, if there is more than one problem, one very significant one). The red graph represents overhead, which is what Windows has to waste to get the various programs to run. You're saying that clearly 6013 is running inefficiently.

Why should your OS overhead go up? The first thing to check is whether your system is thrashing -- which, putting it another way, means your system doesn't have enough RAM to do everything it needs to.

OK, I'm going to assume that you're only running FAH and it's only running p6013. If that's true, then I suspect that your 2GB isn't enough for p6013 by itself.

OK, now back to my earlier question about who is having problems and who isn't. Those of you who have complained about 6013 running exceptionally slowly -- I'll bet you have 2GB or less. Those of you who have said it runs OK on your machine, I'll bet you have 3GB or more. Check the overhead graph in TaskMan and let us know what you see when you reply.

If I'm right, the problem can be fixed. If I'm wrong, I'm sure we can find another theory. In either case, your reports are important to fixing this problem.
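
For anyone who would rather report a number than eyeball the red graph, here is a hedged Python sketch of the same measurement. It computes the share of CPU time spent in kernel mode between two samples, which is what the red line shows when "Show Kernel Times" is enabled in Task Manager. The names `CpuTimes`, `kernel_fraction` and `sample_system` are made up for this example. `sample_system` assumes the third-party psutil package is installed, and it has not been tried on the machines discussed in this thread.

```python
import time
from typing import NamedTuple


class CpuTimes(NamedTuple):
    """Cumulative CPU seconds, summed over all cores."""
    user: float
    system: float
    idle: float


def kernel_fraction(before: CpuTimes, after: CpuTimes) -> float:
    """Share of total CPU time spent in kernel mode between two samples (0.0 to 1.0)."""
    d_user = after.user - before.user
    d_system = after.system - before.system
    d_idle = after.idle - before.idle
    total = d_user + d_system + d_idle
    return d_system / total if total > 0 else 0.0


def sample_system(interval: float = 5.0) -> tuple:
    """Return (kernel fraction, available RAM in MB) measured over `interval` seconds.

    Assumption: psutil is installed; it is imported here so the rest of the
    sketch works without it.
    """
    import psutil

    first = psutil.cpu_times()
    time.sleep(interval)
    second = psutil.cpu_times()
    fraction = kernel_fraction(
        CpuTimes(first.user, first.system, first.idle),
        CpuTimes(second.user, second.system, second.idle),
    )
    available_mb = psutil.virtual_memory().available / (1024 * 1024)
    return fraction, available_mb
```

A kernel fraction near 2/3 together with very little available RAM would fit the thrashing theory. A high kernel fraction with plenty of free RAM would point to some other cause.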
bruce
Posts: 20824
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 6013 (Run 0, Clone 142, Gen 121)

Post by bruce »

shdbcamping wrote:
bruce wrote:What hardware do you have?
Based on the complaints, it is apparently not the majority; this much I can agree with. The problem comes with the 'project' in general. It is extremely problematic for the full spectrum of OSes and HW.

It would be a good idea, JIMO ...YMMV, if the whole project was "PULLED" until the actual problem was identified and fixed. Again JIMO, it is not a 'core' issue as these complaints come from Donors that are NOT having problems with any other WUs.

We have better things to do than babysit badly written WUs.
The fact that the WU runs successfully in the lab doesn't prove that it will run successfully on your machine, so it's not fair to designate it as a "badly written WU." Things like this are supposed to be discovered during beta testing and most of them are, but apparently the beta test folks didn't have whatever combinations of HW/OS it takes to create the problem. In any case, somebody has to be smart enough to figure out the difference between machines which have problems and machines which do not. Just sending it back to the lab or to the beta testers for more testing will not be successful because (apparently) they didn't run into a problem that needed to be fixed.

(I can pull individual WUs out of circulation -- they don't let me decide to pull an entire project and send it back to testing.)

Feel free to help figure out which machines are successful and which are not and help identify the critical difference(s). You probably know as much or more about that process as anybody else. My theory (above) about RAM is just a theory until we gather more data and we can gather data to support your theory at the same time.