Workaround for WU starvation on Manjaro Linux

Posted: Wed May 06, 2020 9:49 pm
by pcwolf
This is my first post; I could not find similar behavior described when searching the F@H forum.

F@H Client 7.6.9, GPU NVidia RTX 2070, kernel 5.6.8

I fold 24x7 and when I wake in the morning the F@H client is still chugging away and overnight stats show credits.

When I am awake at the machine, I keep an idle eye on progress and check in when a WU reaches 99% and the client requests a new one.
I am well aware that popular press has hugely increased the number of active Folders, and understandably demand for WUs is immense these days, so considerable patience is called for.

QUESTION:
When the log shows "No WU available for this configuration" I check the GPU for time of "Next attempt." It seems that the interval between requests *increases* as the number of Unavailable attempts continues. 1 minute, 2 minutes, 5 minutes, 10 minutes ... etc etc. Is this working as designed to protect the F@H servers by spacing out unfillable WU requests?

OBSERVATION:
The "No WU available for this configuration" message pings back and forth between F@H servers 18.218.x.x and 65.254.x.x; then, when the client finally connects, it downloads from Work Unit Server 3.188.x.x. I am at a loss to understand why this hierarchy is chosen, but it doesn't really matter whether I understand or not.

WORKAROUND:
When I become impatient waiting for WU downloads (i.e. considerable minutes/hours passing not Folding) I have found if I go to Manjaro System Settings and go to the SystemD tab, I can restart the "foldingathome.service" and when both the service and F@H Client return ... *BOOM* I immediately receive a new WU. :D This behavior is consistent and repeatable. I have two GPUs Folding and the previously engaged slot goes immediately back to a checkpoint and resumes flawlessly.
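
For anyone who prefers a script to the System Settings GUI, the restart described above can be sketched roughly as follows. This is only an illustration: the unit name "foldingathome.service" is the one quoted in this post, the helper names are made up, and actually running the restart needs root privileges. As the replies in this thread point out, restarting repeatedly adds load to the servers, so the sketch defaults to a dry run.

```python
import subprocess

# Unit name as quoted in the post; it may differ on other installations.
UNIT = "foldingathome.service"


def restart_command(unit=UNIT):
    """Build the systemctl command that restarts the given unit."""
    return ["systemctl", "restart", unit]


def restart_client(unit=UNIT, dry_run=True):
    """Restart the F@H service, or only return the command when dry_run is True."""
    cmd = restart_command(unit)
    if not dry_run:
        # Needs root; raises CalledProcessError if systemctl reports a failure.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Dry run: show the command without executing it.
    print(" ".join(restart_client()))
```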

I do not understand what is happening, but I do know it is happening. By the way, on Manjaro, the general installation is working flawlessly without a hiccup, and it took me about half an hour to install and bring up when I started Folding last month. The Arch wiki has detailed step-by-step instructions, and the AUR updates foldingathome regularly.

Re: Workaround for WU starvation on Manjaro Linux

Posted: Thu May 07, 2020 10:15 am
by PantherX
The next attempt uses an exponential timer. It was originally meant to deal with the situation where the Server had physical issues and needed manpower and lots of hours to fix. However, the latest version, 7.6.13, caps the interval at 1 hour, so you can give that a go if you want.
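
To illustrate what a capped exponential timer looks like, here is a minimal sketch. It is not the client's actual code: the 1-minute starting delay and the doubling factor are assumptions (the intervals reported above, 1, 2, 5, 10 minutes, suggest the real schedule is not a pure doubling), and only the 1-hour cap comes from the 7.6.13 behavior described in this thread.

```python
def retry_delays(attempts, base_s=60, factor=2, cap_s=3600):
    """Return the wait in seconds before each retry.

    base_s and factor are illustrative assumptions; cap_s=3600 mirrors
    the 1-hour upper limit mentioned for 7.6.13.
    """
    delays = []
    delay = base_s
    for _ in range(attempts):
        # Never wait longer than the cap, however many failures came before.
        delays.append(min(delay, cap_s))
        delay *= factor
    return delays


if __name__ == "__main__":
    # Prints [1, 2, 4, 8, 16, 32, 60, 60] (minutes)
    print([d // 60 for d in retry_delays(8)])
```

Without a cap the wait keeps growing with every failed request, which is why a slot can sit idle for a long time after a run of "No WU available" replies.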

The 18.218.x.x and 65.254.x.x are the Assignment Servers and will always be the first point of contact for the clients. The AS then directs your client to the best possible Work Server to get a WU.

Restarting the client resets the timer, which means more attempts to get a WU and therefore a higher probability of getting one. However, if the Servers are under load, this adds to the issue, which is why 7.6.13 caps the interval at 1 hour: it removes the need to reset the timer manually.
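
The "more attempts, higher probability" point can be made concrete with a small calculation. This is a sketch under a strong simplifying assumption: every request succeeds independently with the same chance, and the 10% figure is made up purely for illustration, not measured from the servers.

```python
def p_at_least_one(p_per_attempt, attempts):
    """Chance that at least one of `attempts` independent requests gets a WU."""
    return 1.0 - (1.0 - p_per_attempt) ** attempts


if __name__ == "__main__":
    p = 0.10  # hypothetical per-request success chance
    for n in (1, 3, 6, 12):
        print(f"{n:2d} attempts: {p_at_least_one(p, n):.2f}")
```

Under this assumption each individual request is no more likely to succeed after a restart; the restart only fits more requests into the same stretch of time.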

Re: Workaround for WU starvation on Manjaro Linux

Posted: Thu May 07, 2020 10:29 am
by Neil-B
... and the current FAHClient 7.6.13 (possibly still beta, though I thought it had been released) https://foldingathome.org/beta/ has, I believe, a changed profile for the retry timer, with the maximum wait now being one hour rather than six. I may be wrong on that, but I seem to recall the release notes saying something to this effect.

Re: Workaround for WU starvation on Manjaro Linux

Posted: Sun May 10, 2020 1:40 am
by pcwolf
Thank you once again for the very sound advice, PantherX!

As I mentioned, the AUR updates F@HClient as soon as it is released; I got 7.6.13 as a regular update a few days ago.

Re: Workaround for WU starvation on Manjaro Linux

Posted: Sun May 10, 2020 2:08 am
by bruce
pcwolf wrote:When I become impatient waiting for WU downloads (i.e. considerable minutes/hours passing not Folding) I have found if I go to Manjaro System Settings and go to the SystemD tab, I can restart the "foldingathome.service" and when both the service and F@H Client return ... *BOOM* I immediately receive a new WU. :D This behavior is consistent and repeatable. I have two GPUs Folding and the previously engaged slot goes immediately back to a checkpoint and resumes flawlessly.
You may (or may not) be guilty of biased perception. Restarting the service does initiate a fresh attempt to get work rather than waiting up to an hour for the next automatic attempt, but I know of no reason why the restart would be any more likely to succeed than if the next attempt were initiated by the timer. It would seem most likely that the client simply says to the server "I'm asking for a new work unit for my hardware ( ... description)" rather than the request being equivalent to "I'm asking again for a new work unit for my hardware ( ... description)". Why would the "again" message (if it's there) actually reduce your chances of getting a new assignment?