18224 will not make the timeout (Radeon 780M)

Posted: Sun Apr 27, 2025 3:51 am
by arisu
For the first time on this device, I've received an 0x23 project on my Radeon 780M (species 7). I am folding P18224 R368 C12 G18. The performance is extremely low. PPD is 79k on what usually hits around 300k. ETA is 2d 13h, timeout is 1d 22h. The project is not beta.

The WU is also using 95.7% of available VRAM.
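To put the gap in numbers, here is a minimal sketch that converts the ETA and timeout quoted above into hours and works out the speedup needed. It assumes both figures are measured from the same moment, which the post does not state; the helper names `to_hours` and `required_speedup_pct` are made up for illustration.

```shell
#!/bin/bash

# Convert a "Dd Hh" duration (the format used in the post) into whole hours.
to_hours() {
        local days=${1%d} hours=${2%h}
        echo $(( days * 24 + hours ))
}

# Minimum percentage speedup needed to finish in `timeout` hours instead of
# `eta` hours, rounded up to the next whole percent.
required_speedup_pct() {
        local eta=$1 timeout=$2
        echo $(( (eta * 100 + timeout - 1) / timeout - 100 ))
}

eta=$(to_hours 2d 13h)        # 61 hours
timeout=$(to_hours 1d 22h)    # 46 hours
echo "ETA ${eta}h, timeout ${timeout}h, speedup needed: $(required_speedup_pct "$eta" "$timeout")%"
```

Under that assumption the WU would need to run roughly a third faster than its current pace to finish in time.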

Re: 18224 will not make the timeout (Radeon 780M)

Posted: Sun Apr 27, 2025 6:55 am
by arisu
I may be able to get it to finish in time but only barely. I locked my GPU clocks to the maximum (800 MHz memory and 2.8 GHz shader) and locked CPU to 544 MHz to reduce the thermal impact it has on the iGPU. I also set the GPU folding thread's scheduler to a real-time priority (SCHED_FIFO) and turned off my desktop environment to limit graphics use. I wrote a quick script to toggle between high-CPU low-GPU when core23 does self-tests and low-CPU high-GPU when folding:

Code:

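#!/bin/bash

# PID of the running GPU core; -s returns a single PID.
pid=$(pidof -s FahCore_23) || { echo "FahCore_23 is not running" >&2; exit 1; }
# amdgpu sysfs file that accepts auto/low/high (writing to it needs root).
dpm="/sys/class/drm/card0/device/power_dpm_force_performance_level"

# Give every thread of the core SCHED_FIFO priority 1, then force high clocks.
chrt -afp 1 "$pid"
echo high > "$dpm"

# Follow the log until the core exits, switching clocks around its self-tests.
tail --pid "$pid" -n 0 -f "/proc/$pid/cwd/science.log" | while read -r line; do
        if [ "$line" = "Running tests" ]; then
                echo low > "$dpm"
        elif [ "$line" = "All tests passed." ]; then
                echo high > "$dpm"
        fi
done

# Hand clock control back to the driver.
echo auto > "$dpm"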
All of this together might, just might, let this WU finish before the timeout. Either way, it probably should not be getting assigned to a 780M (and the 760M and 740M would have no hope of completing this).

Re: 18224 will not make the timeout (Radeon 780M)

Posted: Sun Apr 27, 2025 7:17 am
by BobWilliams757
Considering that many people don't fold 24/7, and they don't want to constantly tweak their systems in hopes of meeting timeouts, it seems that either the project assignment was too broad or that the species bump created the issue.

Being towards the bottom of a species almost ensures that you will often get projects that run for very long times. Been there, done that.

Re: 18224 will not make the timeout (Radeon 780M)

Posted: Sun Apr 27, 2025 8:33 am
by muziqaz
Arisu, now I hope you understand why the 780M was species 3 in the beginning ;)
Finish the WU and accept occasional projects which might run horribly on OpenCL in Linux on your GPU.
The alternative is: I can always put it back to species 3 ;)

Re: 18224 will not make the timeout (Radeon 780M)

Posted: Sun Apr 27, 2025 10:23 am
by arisu
Species 4 apparently (I was mistaken about it being species 3). How about dropping it from 7 to 6 or 5? I was surprised that it was bumped from 4 all the way to 7. That may have been a bit much. Or perhaps the project should just have its constraints for AMD increased to 8?

Does 18224 get sent to species 5 and 6?

I heard that the v7 client is able to blacklist certain projects. So that means the servers support that. How is that done? Is it as simple as not contacting the work server to accept the work after the assignment server has given the client an assignment?

Re: 18224 will not make the timeout (Radeon 780M)

Posted: Sun Apr 27, 2025 10:35 am
by muziqaz
6 and 5 are reserved for GCN architecture only. If GCN arch GPUs crap out with certain projects I want to be able to exclude them and them only from that project ;)

v7 cannot blacklist certain projects ;)

P.S. once HIP goes live, your GPU will be truly worthy of species 7 designation :D :D

Re: 18224 will not make the timeout (Radeon 780M)

Posted: Sun Apr 27, 2025 10:38 am
by arisu
Got it.

Could the species constraints for 18224 be adjusted for AMD, then?

I'm currently folding two WUs that are unlikely to make the timeout, one on the Radeon 780M and one on the GTX 770M (the same one as last time).

Re: 18224 will not make the timeout (Radeon 780M)

Posted: Sun Apr 27, 2025 10:41 am
by muziqaz
Just to be clear, the GTX 770M is on another project?

Re: 18224 will not make the timeout (Radeon 780M)

Posted: Sun Apr 27, 2025 10:45 am
by arisu
muziqaz wrote: Sun Apr 27, 2025 10:41 am Just to be clear, the GTX 770M is on another project?
Yes. It is on 12705. I had thought you said the species constraints had been adjusted, but maybe something slipped through the cracks.

Re: 18224 will not make the timeout (Radeon 780M)

Posted: Sun Apr 27, 2025 10:47 am
by muziqaz
arisu wrote: Sun Apr 27, 2025 10:45 am
muziqaz wrote: Sun Apr 27, 2025 10:41 am Just to be clear, the GTX 770M is on another project?
Yes. It is on 12705. I had thought you said the species constraints had been adjusted, but maybe something slipped through the cracks.
Yeah, constraints for 12705 have issues.

Re: 18224 will not make the timeout (Radeon 780M)

Posted: Sun Apr 27, 2025 10:48 am
by arisu
Alright. I will disable beta on the GTX 770M to avoid it.

Because 18224 is not beta, I can't avoid it that way. Can the species constraints for it be adjusted? This is a project that takes a long time even on an RTX 4090 with CUDA. All the other projects my Radeon 780M gets are met well within the timeout, even when running with the shader clock halved.

Re: 18224 will not make the timeout (Radeon 780M)

Posted: Sun Apr 27, 2025 11:44 am
by muziqaz
I made a request for 18224

Re: 18224 will not make the timeout (Radeon 780M)

Posted: Sun Apr 27, 2025 12:03 pm
by arisu
Thank you.

So I can stop this from happening in the future, I figured out how to blacklist specific projects. In Unit::response(), if I check req.getInputJSON()->get("assignment")->get("data")->get("project") and trigger the same routine that is run when the work server returns HTTP_SERVICE_UNAVAILABLE (basically "retry(); return;"), it should back off and attempt to get another WU. Because it won't download the WU or even send the request to the work server, the WU won't expire and will just get sent to someone else, exactly as if there were a network error at that stage. It should be just a few lines of code and then I can add <project-blacklist v='18224,12705'/> to my config.xml!

And to clarify, I'm not bothered at all because of losing bonus points. I just don't want to slow science down by making someone else fold a duplicate WU.
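The check described above boils down to testing whether the assigned project number appears in a comma-separated list. Here is a minimal sketch of that logic in bash, for illustration only: it is not the client's C++ code, the `is_blacklisted` helper is made up, and 18225 is just a placeholder for some project that is not on the list.

```shell
#!/bin/bash

# Return success (0) if project number $1 appears in the comma-separated
# list $2, and failure (1) otherwise. Entries must match exactly.
is_blacklisted() {
        local project=$1 entry
        local -a entries
        IFS=',' read -r -a entries <<< "$2"
        for entry in "${entries[@]}"; do
                [ "$entry" = "$project" ] && return 0
        done
        return 1
}

# Same list format as the proposed config.xml option.
blacklist="18224,12705"
for project in 18224 18225; do
        if is_blacklisted "$project" "$blacklist"; then
                echo "project $project: blacklisted, back off and retry"
        else
                echo "project $project: accept assignment"
        fi
done
```

Comparing whole entries rather than substrings matters here, since a substring match would also reject any project whose number happens to be contained in a listed one.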

Re: 18224 will not make the timeout (Radeon 780M)

Posted: Sun Apr 27, 2025 12:11 pm
by arisu
Actually, how long after the timeout expires does a duplicate WU get sent out? Is it immediate during the next credit check or is there a delay? If there is a delay then I'll just let the timeout expire naturally, because in both cases it will finish within half an hour of the timeout expiring.

Re: 18224 will not make the timeout (Radeon 780M)

Posted: Sun Apr 27, 2025 12:29 pm
by muziqaz
No one knows how long.