A8 efficiency - possible unexpected improvement

Posted: Sat Sep 26, 2020 12:52 pm
by BobWilliams757
Just putting out feelers to see if this is unique to my setup or if it is happening on other systems.

In the past I didn't often run CPU and GPU folding at the same time. With the integrated graphics on my Ryzen 2400G, overall PPD doesn't go up much, because CPU folding just reduces GPU throughput. Both lose PPD compared to running just one or the other.

But with the new A8 core, the impact on overall throughput seems much smaller. My rise in overall PPD is much greater than before, even after I factor in that the A8 core seems to be delivering higher PPD than the A7 core.

With shared memory for CPU and GPU in this system, this seems to be linked to the new core and how it uses memory. I'm not sure if it was an intended improvement, but it sure is welcome either way.



Are others with conventional (non-APU) systems seeing a similar trend?

Re: A8 efficiency - possible unexpected improvement

Posted: Sat Sep 26, 2020 9:58 pm
by bruce
FAHCore_a8 is using a much newer version of GROMACS than FAHCore_a7, and it contains a number of enhancements, so it is expected to run faster.

The methods of allocating and sharing memory depend on who designed the hardware so seemingly simple statements about small changes to performance may not apply to somebody with different hardware.

Re: A8 efficiency - possible unexpected improvement

Posted: Sat Oct 31, 2020 11:25 am
by AMDEPYC
I am seeing significantly worse performance with core 0xa8. A 128 physical core 2nd gen EPYC machine that gives 4M - 5M PPD on core 0xa7 with TPF in the 10-20 second range is instead delivering 100K PPD with TPF in the 4-5 minute range. That machine has two 64c folding slots. Since FAH does not have any sort of CPU affinity capabilities, each slot ends up being spread across the two sockets (each socket is one NUMA node), so if there is significantly more data sharing in the new 0xa8 core, that's not good for two socket machines.

Re: A8 efficiency - possible unexpected improvement

Posted: Sat Oct 31, 2020 6:26 pm
by Neil-B
Check that the AS/WS isn't assigning lower thread count WUs to your slots ... current low availability of CPU WUs means my slots (a 32 and a 24) are seeing 10 thread WUs assigned occasionally - this would impact PPD and only shows up if you look at logs - web and advanced controls just look as if the slot is running very slowly ... not an issue with the core - more a lack of CPU WUs for larger slots (at least one project has a max 10 thread assignment rule at the moment).

Re: A8 efficiency - possible unexpected improvement

Posted: Sun Nov 01, 2020 7:05 am
by PantherX
Welcome to the F@H Forum AMDEPYC,

Can you please post the log file? Ensure you include the first 100 lines which will inform us of what the system configuration is and what the client settings are. If you require guidance, please view this topic: viewtopic.php?f=24&t=26036

In all our testing, FahCore_a8 has always provided more performance than FahCore_a7. However, we haven't tested it with 128 physical CPUs so the log file would provide some additional information to help us troubleshoot :)

Re: A8 efficiency - possible unexpected improvement

Posted: Sun Nov 01, 2020 5:05 pm
by AMDEPYC
My normal slot config is two CPU slots, each with 64 cores. Currently one of those is paused and I have a mix of sizes 4/8/12/16 CPU running to characterize this a bit more. Below is a grab from core 0xa8 on the 12 CPU slot. If not obvious from the log, SMT is disabled so this is a dual proc machine with total 128 physical cores.

Code:

15:43:02:WU05:FS05:Connecting to assign1.foldingathome.org:80
15:43:02:WU05:FS05:Assigned to work server 178.174.196.138
15:43:02:WU05:FS05:Requesting new work unit for slot 05: READY cpu:12 from 178.174.196.138
15:43:02:WU05:FS05:Connecting to 178.174.196.138:8080
15:43:02:WU05:FS05:Downloading 2.05MiB
15:43:04:WU05:FS05:Download complete
15:43:04:WU05:FS05:Received Unit: id:05 state:DOWNLOAD error:NO_ERROR project:16812 run:2 clone:960 gen:39 core:0xa8 unit:0x0000002cb2aec48a5f74f1468924fbf1
15:43:04:WU05:FS05:Starting
15:43:04:WU05:FS05:Running FahCore: /usr/bin/FAHCoreWrapper /home/amd/FAH/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.9/Core_a8.fah/FahCore_a8 -dir 05 -suffix 01 -version 706 -lifeline 2490 -checkpoint 5 -np 12
15:43:04:WU05:FS05:Started FahCore on PID 238066
15:43:04:WU05:FS05:Core PID:238070
15:43:04:WU05:FS05:FahCore 0xa8 started
15:43:04:WU05:FS05:0xa8:*********************** Log Started 2020-10-31T15:43:04Z ***********************
15:43:04:WU05:FS05:0xa8:************************** Gromacs Folding@home Core ***************************
15:43:04:WU05:FS05:0xa8:       Core: Gromacs
15:43:04:WU05:FS05:0xa8:       Type: 0xa8
15:43:04:WU05:FS05:0xa8:    Version: 0.0.9
15:43:04:WU05:FS05:0xa8:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:43:04:WU05:FS05:0xa8:  Copyright: 2020 foldingathome.org
15:43:04:WU05:FS05:0xa8:   Homepage: https://foldingathome.org/
15:43:04:WU05:FS05:0xa8:       Date: Oct 28 2020
15:43:04:WU05:FS05:0xa8:       Time: 22:15:07
15:43:04:WU05:FS05:0xa8:   Compiler: GNU 8.3.0
15:43:04:WU05:FS05:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
15:43:04:WU05:FS05:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie
15:43:04:WU05:FS05:0xa8:   Platform: linux2 4.15.0-108-generic
15:43:04:WU05:FS05:0xa8:       Bits: 64
15:43:04:WU05:FS05:0xa8:       Mode: Release
15:43:04:WU05:FS05:0xa8:       SIMD: avx2_256
15:43:04:WU05:FS05:0xa8:     OpenMP: ON
15:43:04:WU05:FS05:0xa8:       CUDA: OFF
15:43:04:WU05:FS05:0xa8:       Args: -dir 05 -suffix 01 -version 706 -lifeline 238066 -checkpoint 5 -np
15:43:04:WU05:FS05:0xa8:             12
15:43:04:WU05:FS05:0xa8:************************************ libFAH ************************************
15:43:04:WU05:FS05:0xa8:       Date: Oct 28 2020
15:43:04:WU05:FS05:0xa8:       Time: 22:12:00
15:43:04:WU05:FS05:0xa8:   Compiler: GNU 8.3.0
15:43:04:WU05:FS05:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
15:43:04:WU05:FS05:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie
15:43:04:WU05:FS05:0xa8:   Platform: linux2 4.15.0-108-generic
15:43:04:WU05:FS05:0xa8:       Bits: 64
15:43:04:WU05:FS05:0xa8:       Mode: Release
15:43:04:WU05:FS05:0xa8:************************************ CBang *************************************
15:43:04:WU05:FS05:0xa8:       Date: Oct 28 2020
15:43:04:WU05:FS05:0xa8:       Time: 22:11:46
15:43:04:WU05:FS05:0xa8:   Compiler: GNU 8.3.0
15:43:04:WU05:FS05:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
15:43:04:WU05:FS05:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
15:43:04:WU05:FS05:0xa8:   Platform: linux2 4.15.0-108-generic
15:43:04:WU05:FS05:0xa8:       Bits: 64
15:43:04:WU05:FS05:0xa8:       Mode: Release
15:43:04:WU05:FS05:0xa8:************************************ System ************************************
15:43:04:WU05:FS05:0xa8:        CPU: AMD EPYC 7H12 64-Core Processor
15:43:04:WU05:FS05:0xa8:     CPU ID: AuthenticAMD Family 23 Model 49 Stepping 0
15:43:04:WU05:FS05:0xa8:       CPUs: 128
15:43:04:WU05:FS05:0xa8:     Memory: 503.74GiB
15:43:04:WU05:FS05:0xa8:Free Memory: 498.97GiB
15:43:04:WU05:FS05:0xa8:    Threads: POSIX_THREADS
15:43:04:WU05:FS05:0xa8: OS Version: 5.8
15:43:04:WU05:FS05:0xa8:Has Battery: false
15:43:04:WU05:FS05:0xa8: On Battery: false
15:43:04:WU05:FS05:0xa8: UTC Offset: -4
15:43:04:WU05:FS05:0xa8:        PID: 238070
15:43:04:WU05:FS05:0xa8:        CWD: /home/amd/FAH/work
15:43:04:WU05:FS05:0xa8:********************************************************************************
15:43:04:WU05:FS05:0xa8:Project: 16812 (Run 2, Clone 960, Gen 39)
15:43:04:WU05:FS05:0xa8:Unit: 0x0000002cb2aec48a5f74f1468924fbf1
15:43:04:WU05:FS05:0xa8:Reading tar file core.xml
15:43:04:WU05:FS05:0xa8:Reading tar file frame39.tpr
15:43:04:WU05:FS05:0xa8:Digital signatures verified
15:43:04:WU05:FS05:0xa8:Calling: mdrun -c frame39.gro -s frame39.tpr -x frame39.xtc -cpt 5 -nt 12 -ntmpi 1
15:43:04:WU05:FS05:0xa8:Steps: first=19500000 total=20000000
15:43:06:WU05:FS05:0xa8:Completed 1 out of 500000 steps (0%)
15:44:01:WU05:FS05:0xa8:Completed 5000 out of 500000 steps (1%)
15:44:56:WU05:FS05:0xa8:Completed 10000 out of 500000 steps (2%)

Re: A8 efficiency - possible unexpected improvement

Posted: Sun Nov 01, 2020 5:11 pm
by AMDEPYC
A really quick fix (for me) would be to enable the GROMACS mdrun -pin -pinoffset -pinstride and -ntomp (and allow overriding the -ntmpi) - these could be submitted via slot extra-core-args. On EPYC pinning is a must (to take advantage of the huge L3) and if there is shared memory, shared data, sync barriers (OpenMP has a lot), etc. then it's also handy to co-locate threads by L3 cache (if small), or by NUMA node (if larger). Most hybrid HPC applications at a scale of 64+ cores tend to perform best with 4 OpenMP threads per MPI rank (matching the number of physical cores per L3 cache) and mapping each rank to an L3.
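To make the suggested layout concrete, here is a rough sketch of the mdrun flags I have in mind for each slot. It assumes slots sit back to back on consecutively numbered physical cores (SMT off), 4 cores per L3, and that consecutive core numbers share an L3, which should be checked with lscpu or hwloc on the actual box. The helper name `mdrun_pin_args` is made up for illustration; the flags themselves are standard GROMACS mdrun options, but the current core does not accept them.

```python
# Hypothetical helper: not part of FAH or GROMACS, illustration only.
def mdrun_pin_args(slot_index, slot_cores, cores_per_l3=4):
    """Build mdrun thread-count and pinning flags for one folding slot.

    Assumptions: slots occupy consecutive physical cores (SMT off) and
    slot_cores is a multiple of cores_per_l3.
    """
    if slot_cores % cores_per_l3 != 0:
        raise ValueError("slot_cores must be a multiple of cores_per_l3")
    ranks = slot_cores // cores_per_l3  # one thread-MPI rank per L3
    return [
        "-ntmpi", str(ranks),
        "-ntomp", str(cores_per_l3),    # OpenMP threads per rank
        "-pin", "on",
        "-pinoffset", str(slot_index * slot_cores),
        "-pinstride", "1",
    ]

# Two 64-core slots on a 128-core machine
for slot in range(2):
    print(f"slot {slot}:", " ".join(mdrun_pin_args(slot, 64)))
```

With two 64-core slots this gives 16 ranks of 4 threads each, with the second slot offset by 64 cores so it should land on the second socket, assuming the OS numbers socket 0 as cores 0-63.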

For comparison, the 64 core slot just landed an 0xa7 WU:

Code:

17:06:28:WU02:FS00:Connecting to assign1.foldingathome.org:80
17:06:28:WU02:FS00:Assigned to work server 128.252.203.9
17:06:28:WU02:FS00:Requesting new work unit for slot 00: READY cpu:64 from 128.252.203.9
17:06:28:WU02:FS00:Connecting to 128.252.203.9:8080
17:06:29:WU02:FS00:Downloading 8.14MiB
17:06:30:WU02:FS00:Download complete
17:06:30:WU02:FS00:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:13821 run:225 clone:5 gen:18 core:0xa7 unit:0x0000001480fccb095e73d2f04caa75eb
17:06:30:WU02:FS00:Starting
17:06:30:WU02:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /home/amd/FAH/cores/cores.foldingathome.org/lin/64bit-avx-256/a7-0.0.19/Core_a7.fah/FahCore_a7 -dir 02 -suffix 01 -version 706 -lifeline 2490 -checkpoint 5 -np 64
17:06:30:WU02:FS00:Started FahCore on PID 243411
17:06:30:WU02:FS00:Core PID:243415
17:06:30:WU02:FS00:FahCore 0xa7 started
17:06:30:WU02:FS00:0xa7:*********************** Log Started 2020-11-01T17:06:30Z ***********************
17:06:30:WU02:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
17:06:30:WU02:FS00:0xa7:       Type: 0xa7
17:06:30:WU02:FS00:0xa7:       Core: Gromacs
17:06:30:WU02:FS00:0xa7:       Args: -dir 02 -suffix 01 -version 706 -lifeline 243411 -checkpoint 5 -np
17:06:30:WU02:FS00:0xa7:             64
17:06:30:WU02:FS00:0xa7:************************************ CBang *************************************
17:06:30:WU02:FS00:0xa7:       Date: Nov 27 2019
17:06:30:WU02:FS00:0xa7:       Time: 11:26:54
17:06:30:WU02:FS00:0xa7:   Revision: d25803215b59272441049dfa05a0a9bf7a6e3c48
17:06:30:WU02:FS00:0xa7:     Branch: master
17:06:30:WU02:FS00:0xa7:   Compiler: GNU 8.3.0
17:06:30:WU02:FS00:0xa7:    Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
17:06:30:WU02:FS00:0xa7:             -fno-pie -fPIC
17:06:30:WU02:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
17:06:30:WU02:FS00:0xa7:       Bits: 64
17:06:30:WU02:FS00:0xa7:       Mode: Release
17:06:30:WU02:FS00:0xa7:************************************ System ************************************
17:06:30:WU02:FS00:0xa7:        CPU: AMD EPYC 7H12 64-Core Processor
17:06:30:WU02:FS00:0xa7:     CPU ID: AuthenticAMD Family 23 Model 49 Stepping 0
17:06:30:WU02:FS00:0xa7:       CPUs: 128
17:06:30:WU02:FS00:0xa7:     Memory: 503.74GiB
17:06:30:WU02:FS00:0xa7:Free Memory: 499.83GiB
17:06:30:WU02:FS00:0xa7:    Threads: POSIX_THREADS
17:06:30:WU02:FS00:0xa7: OS Version: 5.8
17:06:30:WU02:FS00:0xa7:Has Battery: false
17:06:30:WU02:FS00:0xa7: On Battery: false
17:06:30:WU02:FS00:0xa7: UTC Offset: -5
17:06:30:WU02:FS00:0xa7:        PID: 243415
17:06:30:WU02:FS00:0xa7:        CWD: /home/amd/FAH/work
17:06:30:WU02:FS00:0xa7:******************************** Build - libFAH ********************************
17:06:30:WU02:FS00:0xa7:    Version: 0.0.19
17:06:30:WU02:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
17:06:30:WU02:FS00:0xa7:  Copyright: 2019 foldingathome.org
17:06:30:WU02:FS00:0xa7:   Homepage: https://foldingathome.org/
17:06:30:WU02:FS00:0xa7:       Date: Nov 26 2019
17:06:30:WU02:FS00:0xa7:       Time: 00:41:42
17:06:30:WU02:FS00:0xa7:   Revision: d5b5c747532224f986b7cd02c968ed9a20c16d6e
17:06:30:WU02:FS00:0xa7:     Branch: master
17:06:30:WU02:FS00:0xa7:   Compiler: GNU 8.3.0
17:06:30:WU02:FS00:0xa7:    Options: -std=c++11 -ffunction-sections -fdata-sections -O3 -funroll-loops
17:06:30:WU02:FS00:0xa7:             -fno-pie
17:06:30:WU02:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
17:06:30:WU02:FS00:0xa7:       Bits: 64
17:06:30:WU02:FS00:0xa7:       Mode: Release
17:06:30:WU02:FS00:0xa7:************************************ Build *************************************
17:06:30:WU02:FS00:0xa7:       SIMD: avx_256
17:06:30:WU02:FS00:0xa7:********************************************************************************
17:06:30:WU02:FS00:0xa7:Project: 13821 (Run 225, Clone 5, Gen 18)
17:06:30:WU02:FS00:0xa7:Unit: 0x0000001480fccb095e73d2f04caa75eb
17:06:30:WU02:FS00:0xa7:Reading tar file core.xml
17:06:30:WU02:FS00:0xa7:Reading tar file frame18.tpr
17:06:30:WU02:FS00:0xa7:Digital signatures verified
17:06:30:WU02:FS00:0xa7:Calling: mdrun -s frame18.tpr -o frame18.trr -x frame18.xtc -cpt 5 -nt 64
17:06:30:WU02:FS00:0xa7:Steps: first=2250000 total=125000
17:06:33:WU02:FS00:0xa7:Completed 1 out of 125000 steps (0%)
17:06:45:WU02:FS00:0xa7:Completed 1250 out of 125000 steps (1%)
17:06:54:WU02:FS00:0xa7:Completed 2500 out of 125000 steps (2%)
17:07:03:WU02:FS00:0xa7:Completed 3750 out of 125000 steps (3%)

Re: A8 efficiency - possible unexpected improvement

Posted: Sun Nov 01, 2020 5:39 pm
by Joe_H
They ran into some issues with the A8 core compiled with a different '-ntmpi' setting than is currently released. For now they have released it with '-ntmpi 1', as that works on all systems, though perhaps not efficiently on large server systems. Testing is ongoing with a different ntmpi setting. Once they have worked through the issues, a version including that change should be released.

One issue you may run into with the A7 core is domain decomposition problems. Not all thread counts work, so that can complicate assignments. One of the goals for the A8 core is to handle domain decomposition a bit more gracefully.

Re: A8 efficiency - possible unexpected improvement

Posted: Sun Nov 01, 2020 7:25 pm
by AMDEPYC
Finally got a core 0xa8 WU on the 64 CPU folding slot. TPF varies significantly over the run. This is most likely the result of memory and CPUs being scattered around due to no affinity, then the scheduler or automatic NUMA balancing trying to make things better. Only 10 threads were launched, though, even though the slot has 64 CPUs.

Code:

17:22:09:WU01:FS00:Connecting to assign1.foldingathome.org:80
17:22:09:WU01:FS00:Assigned to work server 129.32.209.204
17:22:09:WU01:FS00:Requesting new work unit for slot 00: READY cpu:64 from 129.32.209.204
17:22:09:WU01:FS00:Connecting to 129.32.209.204:8080
17:22:09:WU01:FS00:Downloading 54.50KiB
17:22:09:WU01:FS00:Download complete
17:22:09:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:16926 run:0 clone:30 gen:15 core:0xa8 unit:0x0000000f8120d1cc5f7dfda0755570c3
17:22:09:WU01:FS00:Starting
17:22:09:WARNING:WU01:FS00:AS lowered CPUs from 64 to 10
17:22:09:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /home/amd/FAH/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.9/Core_a8.fah/FahCore_a8 -dir 01 -suffix 01 -version 706 -lifeline 2490 -checkpoint 5 -np 10
17:22:09:WU01:FS00:Started FahCore on PID 243493
17:22:09:WU01:FS00:Core PID:243497
17:22:09:WU01:FS00:FahCore 0xa8 started
17:22:10:WU01:FS00:0xa8:*********************** Log Started 2020-11-01T17:22:09Z ***********************
17:22:10:WU01:FS00:0xa8:************************** Gromacs Folding@home Core ***************************
17:22:10:WU01:FS00:0xa8:       Core: Gromacs
17:22:10:WU01:FS00:0xa8:       Type: 0xa8
17:22:10:WU01:FS00:0xa8:    Version: 0.0.9
17:22:10:WU01:FS00:0xa8:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
17:22:10:WU01:FS00:0xa8:  Copyright: 2020 foldingathome.org
17:22:10:WU01:FS00:0xa8:   Homepage: https://foldingathome.org/
17:22:10:WU01:FS00:0xa8:       Date: Oct 28 2020
17:22:10:WU01:FS00:0xa8:       Time: 22:15:07
17:22:10:WU01:FS00:0xa8:   Compiler: GNU 8.3.0
17:22:10:WU01:FS00:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
17:22:10:WU01:FS00:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie
17:22:10:WU01:FS00:0xa8:   Platform: linux2 4.15.0-108-generic
17:22:10:WU01:FS00:0xa8:       Bits: 64
17:22:10:WU01:FS00:0xa8:       Mode: Release
17:22:10:WU01:FS00:0xa8:       SIMD: avx2_256
17:22:10:WU01:FS00:0xa8:     OpenMP: ON
17:22:10:WU01:FS00:0xa8:       CUDA: OFF
17:22:10:WU01:FS00:0xa8:       Args: -dir 01 -suffix 01 -version 706 -lifeline 243493 -checkpoint 5 -np
17:22:10:WU01:FS00:0xa8:             10
17:22:10:WU01:FS00:0xa8:************************************ libFAH ************************************
17:22:10:WU01:FS00:0xa8:       Date: Oct 28 2020
17:22:10:WU01:FS00:0xa8:       Time: 22:12:00
17:22:10:WU01:FS00:0xa8:   Compiler: GNU 8.3.0
17:22:10:WU01:FS00:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
17:22:10:WU01:FS00:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie
17:22:10:WU01:FS00:0xa8:   Platform: linux2 4.15.0-108-generic
17:22:10:WU01:FS00:0xa8:       Bits: 64
17:22:10:WU01:FS00:0xa8:       Mode: Release
17:22:10:WU01:FS00:0xa8:************************************ CBang *************************************
17:22:10:WU01:FS00:0xa8:       Date: Oct 28 2020
17:22:10:WU01:FS00:0xa8:       Time: 22:11:46
17:22:10:WU01:FS00:0xa8:   Compiler: GNU 8.3.0
17:22:10:WU01:FS00:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
17:22:10:WU01:FS00:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
17:22:10:WU01:FS00:0xa8:   Platform: linux2 4.15.0-108-generic
17:22:10:WU01:FS00:0xa8:       Bits: 64
17:22:10:WU01:FS00:0xa8:       Mode: Release
17:22:10:WU01:FS00:0xa8:************************************ System ************************************
17:22:10:WU01:FS00:0xa8:        CPU: AMD EPYC 7H12 64-Core Processor
17:22:10:WU01:FS00:0xa8:     CPU ID: AuthenticAMD Family 23 Model 49 Stepping 0
17:22:10:WU01:FS00:0xa8:       CPUs: 128
17:22:10:WU01:FS00:0xa8:     Memory: 503.74GiB
17:22:10:WU01:FS00:0xa8:Free Memory: 499.83GiB
17:22:10:WU01:FS00:0xa8:    Threads: POSIX_THREADS
17:22:10:WU01:FS00:0xa8: OS Version: 5.8
17:22:10:WU01:FS00:0xa8:Has Battery: false
17:22:10:WU01:FS00:0xa8: On Battery: false
17:22:10:WU01:FS00:0xa8: UTC Offset: -5
17:22:10:WU01:FS00:0xa8:        PID: 243497
17:22:10:WU01:FS00:0xa8:        CWD: /home/amd/FAH/work
17:22:10:WU01:FS00:0xa8:********************************************************************************
17:22:10:WU01:FS00:0xa8:Project: 16926 (Run 0, Clone 30, Gen 15)
17:22:10:WU01:FS00:0xa8:Unit: 0x0000000f8120d1cc5f7dfda0755570c3
17:22:10:WU01:FS00:0xa8:Reading tar file core.xml
17:22:10:WU01:FS00:0xa8:Reading tar file frame15.tpr
17:22:10:WU01:FS00:0xa8:Digital signatures verified
17:22:10:WU01:FS00:0xa8:Calling: mdrun -c frame15.gro -s frame15.tpr -x frame15.xtc -cpt 5 -nt 10 -ntmpi 1
17:22:10:WU01:FS00:0xa8:Steps: first=750000000 total=800000000
17:22:10:WU01:FS00:0xa8:Completed 1 out of 50000000 steps (0%)
17:25:45:WU01:FS00:0xa8:Completed 500000 out of 50000000 steps (1%)
17:30:20:WU01:FS00:0xa8:Completed 1000000 out of 50000000 steps (2%)
17:35:16:WU01:FS00:0xa8:Completed 1500000 out of 50000000 steps (3%)
17:40:12:WU01:FS00:0xa8:Completed 2000000 out of 50000000 steps (4%)
17:43:56:WU01:FS00:0xa8:Completed 2500000 out of 50000000 steps (5%)
17:46:20:WU01:FS00:0xa8:Completed 3000000 out of 50000000 steps (6%)
17:48:31:WU01:FS00:0xa8:Completed 3500000 out of 50000000 steps (7%)
17:50:42:WU01:FS00:0xa8:Completed 4000000 out of 50000000 steps (8%)
17:52:53:WU01:FS00:0xa8:Completed 4500000 out of 50000000 steps (9%)
17:55:07:WU01:FS00:0xa8:Completed 5000000 out of 50000000 steps (10%)
17:57:20:WU01:FS00:0xa8:Completed 5500000 out of 50000000 steps (11%)
17:59:33:WU01:FS00:0xa8:Completed 6000000 out of 50000000 steps (12%)
18:01:47:WU01:FS00:0xa8:Completed 6500000 out of 50000000 steps (13%)
18:04:00:WU01:FS00:0xa8:Completed 7000000 out of 50000000 steps (14%)
18:06:13:WU01:FS00:0xa8:Completed 7500000 out of 50000000 steps (15%)
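For anyone wanting to pull the TPF variation out of a log like this, a minimal sketch is below. It assumes the progress line format shown above and simply takes the difference between consecutive timestamps; the function name `frame_times` is my own, not anything from the client.

```python
import re
from datetime import datetime, timedelta

# Matches progress lines such as
# "17:25:45:WU01:FS00:0xa8:Completed 500000 out of 50000000 steps (1%)"
PROGRESS = re.compile(
    r"^(\d\d:\d\d:\d\d):WU\d+:FS\d+:0x\w+:"
    r"Completed \d+ out of \d+ steps \((\d+)%\)"
)

def frame_times(lines):
    """Return (percent, seconds since the previous progress line) pairs."""
    out = []
    prev = None
    for line in lines:
        m = PROGRESS.match(line.strip())
        if not m:
            continue  # skip anything that is not a progress line
        stamp = datetime.strptime(m.group(1), "%H:%M:%S")
        if prev is not None:
            delta = stamp - prev
            if delta < timedelta(0):  # log crossed midnight
                delta += timedelta(days=1)
            out.append((int(m.group(2)), int(delta.total_seconds())))
        prev = stamp
    return out

sample = [
    "17:25:45:WU01:FS00:0xa8:Completed 500000 out of 50000000 steps (1%)",
    "17:30:20:WU01:FS00:0xa8:Completed 1000000 out of 50000000 steps (2%)",
    "17:35:16:WU01:FS00:0xa8:Completed 1500000 out of 50000000 steps (3%)",
]
for percent, seconds in frame_times(sample):
    print(f"{percent}%: {seconds} s")
```

Note it only handles a single WU per input. Lines from several slots interleaved in one log would need to be split by the WU/FS fields first.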

Re: A8 efficiency - possible unexpected improvement

Posted: Sun Nov 01, 2020 7:38 pm
by Joe_H

Code:

17:22:09:WARNING:WU01:FS00:AS lowered CPUs from 64 to 10

The AS did not have WUs that could use 64 threads and assigned you an available WU capped at 10 threads.
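Since this only shows up in the log, a quick way to spot it is to scan for that warning. The sketch below assumes the exact wording of the line quoted above; the function name `lowered_assignments` is just illustrative.

```python
import re

# Matches e.g. "17:22:09:WARNING:WU01:FS00:AS lowered CPUs from 64 to 10"
LOWERED = re.compile(
    r"WARNING:WU(\d+):FS(\d+):AS lowered CPUs from (\d+) to (\d+)"
)

def lowered_assignments(lines):
    """Yield (slot, requested_cpus, granted_cpus) for each lowered WU."""
    for line in lines:
        m = LOWERED.search(line)
        if m:
            yield m.group(2), int(m.group(3)), int(m.group(4))

sample = [
    "17:22:09:WU01:FS00:Starting",
    "17:22:09:WARNING:WU01:FS00:AS lowered CPUs from 64 to 10",
]
for slot, requested, granted in lowered_assignments(sample):
    print(f"slot {slot}: asked for {requested} CPUs, got {granted}")
```

Running it over the whole log gives a rough idea of how often a large slot is being handed a smaller WU.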

Re: A8 efficiency - possible unexpected improvement

Posted: Sun Nov 01, 2020 10:39 pm
by Neil-B
Yup ... as mentioned earlier ... it will improve as more cpu projects get released but for now expect some of these slot count reductions to happen.

Re: A8 efficiency - possible unexpected improvement

Posted: Mon Nov 02, 2020 12:23 am
by AMDEPYC
What do you guys need - more 10-12 CPU slots or fewer bigger slots? Or mix of both?

Re: A8 efficiency - possible unexpected improvement

Posted: Mon Nov 02, 2020 6:47 am
by PantherX
It's a tough question to answer but you can make an informed choice:

Ideally
Few large CPU Slots (you can turn on HT too in this case), maybe combination of 32/64/128 etc.
Currently, there's a shortage of CPU Projects, and large projects that scale to 128+ CPUs are very limited.

Set-&-forget
Multiple CPU Slots (using physical Cores without HT), maybe a combination of 8/12/16 etc.
Apart from the CPU WU shortage, there's more likelihood of CPU Projects in the "low" CPU range than in the high CPU range.

Related question, is this system a dedicated folding system or does it run other applications too? Is it on 24/7? By having a bit more information about how this system of yours is going to be used, we can potentially find an optimum configuration that simply works the best for you :)

Re: A8 efficiency - possible unexpected improvement

Posted: Mon Nov 02, 2020 6:51 am
by Neil-B
It depends how you look at it tbh .. this is hopefully a short term issue .. you could create more lower count slots but the pauses in assignment of even these indicate more resource than work so that still might not fully use your kit .. what is really needed are more cpu projects but these take time to generate .. strangely this is a good issue to have in some ways as it means the donated resources are able to fold projects as quickly as the researchers can get them out :)

Re: A8 efficiency - possible unexpected improvement

Posted: Mon Nov 02, 2020 12:29 pm
by AMDEPYC
Replying to @PantherX - I have two systems dedicated for folding:
Ryzen 9 3950X + RX560 GPU - one CPU slot, 12 cores (SMT off), one GPU slot
2P EPYC 7601 - two CPU slots, 32 cores (SMT off)

The other two systems are used throughout the day:
Ryzen 9 3950X + RX5500XT - one CPU slot, 12 cores (SMT off), one GPU slot
2P EPYC 7H12 - two CPU slots, 64 cores (SMT off)

All are configured as client-type beta. I usually have enough runway to finish in-process jobs on the above two systems when I need them so I leave them folding 24x7 as well. Plus as it's starting to get into winter, the ~2kW of total power from the above means less running the heater, though it wasn't too friendly on the electric bill this summer.