Search found 69 matches

by jjmiller
Mon Jul 12, 2021 1:31 pm
Forum: Issues with a specific server
Topic: 128.252.203.11 is not giving work units
Replies: 24
Views: 25470

Re: 128.252.203.11 is not giving work units

Hi all, thanks for the updates on this. Just getting back into the office after a week out. Checking on this now and will report back when I know more.
by jjmiller
Mon Jul 12, 2021 1:30 pm
Forum: Issues with a specific server
Topic: 128.252.203.10 and 128.252.203.12 struggling
Replies: 9
Views: 9654

Re: 128.252.203.10 and 128.252.203.12 struggling

Thanks, all. Just getting back into the office after a week out. Checking on these now.
by jjmiller
Wed Jul 07, 2021 11:51 am
Forum: Issues with a specific server
Topic: 128.252.203.11 is not giving work units
Replies: 24
Views: 25470

Re: 128.252.203.11 is not giving work units

Hi autogrog, Thank you for posting about this and apologies for the delayed response as I'm currently OOO. At the moment, as far as we can tell, this is an issue with the amount of load going to 128.252.203.11(12) as these two servers currently house the bulk of the available FAH jobs. To prevent server...
by jjmiller
Wed Jul 07, 2021 11:37 am
Forum: Issues with a specific WU
Topic: 18202 massive download
Replies: 2
Views: 5138

Re: 18202 massive download

Hi aetech, Thanks for the note and apologies for the slow reply, as I was OOO. Compression is re-enabled for this project and should apply to any new jobs that go out. In an effort to increase throughput on individual servers we were testing whether compression was needed for this job. Compression...
by jjmiller
Thu Jul 01, 2021 6:36 pm
Forum: Issues with a specific server
Topic: 128.252.203.11 is not giving work units
Replies: 24
Views: 25470

Re: 128.252.203.11 is not giving work units

Hi all, Thanks for the notes on this. As gunnarre highlights, the bulk of GPU jobs available on FAH are currently housed on this server, meaning that a lot of the FAH GPU folders are currently bearing down on this server. To manage this load, we have set a relatively strict assignment cap on this se...
by jjmiller
Wed Jun 30, 2021 8:09 pm
Forum: Issues with a specific server
Topic: 128.252.203.10 and 128.252.203.12 struggling
Replies: 9
Views: 9654

Re: 128.252.203.10 and 128.252.203.12 struggling

Thank you for the notice. I went in and dialed back the assignment rates for the jobs uploading to these two servers.
by jjmiller
Mon Jun 28, 2021 4:53 pm
Forum: Announcements - Folding Consortium
Topic: CPU (A8) projects 18210-18212 on FAH
Replies: 0
Views: 6794

CPU (A8) projects 18210-18212 on FAH

Projects 18210-18212 are now on Folding@home! https://stats.foldingathome.org/project/18210 https://stats.foldingathome.org/project/18211 https://stats.foldingathome.org/project/18212 Stats: <atoms v="302700"/> <timeout v="2"/> <deadline v="5"/> <stats-credit v="...
by jjmiller
Mon Jun 28, 2021 2:00 pm
Forum: Issues with a specific WU
Topic: Project: 18202 (Run 1345, Clone 3, Gen 9) dumped
Replies: 16
Views: 19363

Re: Project: 18202 (Run 1345, Clone 3, Gen 9) dumped

Ah, I see, thanks for the clarification. I'll check in with our collection server again and will report back! EDIT: Since we have many WUs coming back successfully when uploaded to the work server and all of these reports seem to follow a consistent trend (WS busy -> upload to CS -> CS dumps) we've ...
by jjmiller
Sun Jun 27, 2021 6:41 pm
Forum: Issues with a specific WU
Topic: Project: 18202 (Run 1345, Clone 3, Gen 9) dumped
Replies: 16
Views: 19363

Re: Project: 18202 (Run 1345, Clone 3, Gen 9) dumped

Hi Aetch, The log you posted is for Project 17804, Run 83, Clone 289, Gen 77. Would you be willing to post the log for project:18202 run:1927 clone:2 gen:5? Some of the gap that you're seeing on the WU status app is because we actually took this project down for ~8 days to resolve some server issues...
by jjmiller
Sat Jun 26, 2021 6:22 pm
Forum: Announcements - Folding Consortium
Topic: [GPU] Project 18201 now on FAH (GPU, OpenMM Core22)
Replies: 2
Views: 5706

Re: [GPU] Project 18201 now on FAH (GPU, OpenMM Core22)

Hi all, Apologies as I forgot to update this post last week. Approximately one and a half weeks ago highland1, the work server which this project is housed on, was overloaded and then suffered several performance issues as a result. We temporarily halted jobs last week in an attempt to resolve these se...
by jjmiller
Sat Jun 26, 2021 6:19 pm
Forum: Announcements - Folding Consortium
Topic: CPU (GRO_A8) project 18206 on FAH
Replies: 1
Views: 4928

Re: CPU (GRO_A8) project 18206 on FAH

Hi all, Apologies as I forgot to update this post last week. Approximately one and a half weeks ago highland1, the work server which this project is housed on, was overloaded and then suffered several performance issues as a result. We temporarily halted jobs last week in an attempt to resolve these se...
by jjmiller
Sat Jun 26, 2021 6:17 pm
Forum: Announcements - Folding Consortium
Topic: [GPU] Project 18202 now on FAH (GPU, OpenMM Core22)
Replies: 4
Views: 7405

Re: [GPU] Project 18202 now on FAH (GPU, OpenMM Core22)

Hi all, Apologies as I forgot to update this post last week. Approximately one and a half weeks ago highland1, the work server which this project is housed on, was overloaded and then suffered several performance issues as a result. We temporarily halted jobs last week in an attempt to resolve these se...
by jjmiller
Fri Jun 25, 2021 1:50 pm
Forum: V7.6.x Public Release Windows/Linux/MacOS X
Topic: No GPU WU, shortage or other cause?
Replies: 14
Views: 17140

Re: No GPU WU, shortage or other cause?

Hi all, Sorry for the silence from the FAH science team. There is indeed both a GPU and CPU work unit shortage. A couple of reasons drive this: first and foremost, we do our best to make sure that every WU we release is scientifically valuable so that you are not wasting your time or electricity on meani...
by jjmiller
Thu Jun 24, 2021 8:43 pm
Forum: Issues with a specific WU
Topic: Project: 18202 (Run 1345, Clone 3, Gen 9) dumped
Replies: 16
Views: 19363

Re: Project: 18202 (Run 1345, Clone 3, Gen 9) dumped

The configuration of 128.252.203.11 is really strange in that it functions as both a WS and a CS. One thing that we've noticed so far is that, in the WUs being dumped, our trajectory file hasn't been updated with the newest generations that have come back. For example, in R1245:C3:G9 the trajectory fi...
by jjmiller
Thu Jun 24, 2021 8:28 pm
Forum: Issues with a specific WU
Topic: Project: 18202 (Run 1345, Clone 3, Gen 9) dumped
Replies: 16
Views: 19363

Re: Project: 18202 (Run 1345, Clone 3, Gen 9) dumped

Hi mgetz, Yes, 18202 went through internal and beta testing and performed well in both. One issue that we seem to see (especially for bigger systems, as we have in 18201/18202) is that differences in the GPU being used can severely change the stability of a project. Based on the current stats, ...