Thanks for the heads-up. I'm investigating.
Is this the only server you're having trouble with? I'm showing lots of WUs moving in and out of the server (plfah1-1.mskcc.org) just fine.
Search found 339 matches
- Sun Jan 08, 2017 7:31 am
- Forum: Issues with a specific server
- Topic: 140.163.4.231 not actually downloading work
- Replies: 8
- Views: 2054
- Thu Dec 22, 2016 7:03 pm
- Forum: Issues with a specific WU
- Topic: Project 10496 impossible to finish
- Replies: 17
- Views: 4342
Re: Project 10496 impossible to finish
Thanks for all the feedback, everyone. This is a HUGE system---much bigger than we normally simulate---and we seem to have underestimated the deadline and timeout as a result. I've temporarily set the assignment weight for this project to 0 until we can figure out (1) what an appropriate deadline/ti...
- Mon Nov 21, 2016 3:38 am
- Forum: Scientific discussions (Non-FAH)
- Topic: XFEL - Folding could be "filmed" in the near future ?
- Replies: 3
- Views: 3391
Re: XFEL - Folding could be "filmed" in the near future ?
Lots of structural biologists are excited for this to come online, but I believe the first X-ray free-electron laser (XFEL) for use in biomolecular crystallography is actually at Stanford and went into service in 2010: https://portal.slac.stanford.edu/sites/ ... fault.aspx https://www6.slac.stanford...
- Tue Oct 25, 2016 10:55 pm
- Forum: GPU Projects and FahCores
- Topic: When the core 22 will be tested?
- Replies: 45
- Views: 15088
Re: When the core 22 will be tested?
Thanks for your interest! We're in the very early stages of building test builds for a new GPU core 22 based on OpenMM 7.0.1, but we don't yet have a timeline other than "soon!". There are a few technical and infrastructure hurdles to straighten out first.
- Sat Mar 26, 2016 12:28 am
- Forum: Issues with a specific server
- Topic: Cant upload the WU in the time allowed.
- Replies: 52
- Views: 10790
Re: Cant upload the WU in the time allowed.
I've just doubled the timeout (to 1 hour). Please report any other failed uploads and how long they took to time out. There may be a bug causing the early timeout.
- Sat Mar 26, 2016 12:27 am
- Forum: Issues with a specific server
- Topic: Cant upload the WU in the time allowed.
- Replies: 52
- Views: 10790
Re: Cant upload the WU in the time allowed.
That's odd. This server is set to use a 30 minute timeout, but it looks like the upload timed out after ~13 minutes. That's really strange. Note that these servers don't answer to pings from the outside world because of firewall restrictions. We're working to try to get these lifted to make debuggin...
- Sat Mar 26, 2016 12:23 am
- Forum: Issues with a specific server
- Topic: Cant upload the WU in the time allowed.
- Replies: 52
- Views: 10790
Re: Cant upload the WU in the time allowed.
Thanks everybody for alerting us to issues with the MSK servers (140.163.4.24x). Looking into this now.
- Tue Feb 16, 2016 6:27 pm
- Forum: Issues with a specific server
- Topic: 140.163.4.242 slow WU dl
- Replies: 1
- Views: 1237
Re: 140.163.4.242 slow WU dl
Wow, that's *really* slow. Let me check with our networking folks to see if there is something weird going on.
John
- Tue Jan 19, 2016 4:51 pm
- Forum: Issues with a specific WU
- Topic: No credit P10468 and P10484
- Replies: 3
- Views: 1251
Re: No credit P10468 and P10484
Server fixed. Points should be credited soon.
Thanks again for your help!
J
- Tue Jan 19, 2016 4:27 pm
- Forum: Issues with a specific WU
- Topic: No credit P10468 and P10484
- Replies: 3
- Views: 1251
Re: No credit P10468 and P10484
So sorry about this! Many thanks for the report---I see these servers are suddenly not allowing inbound ssh connections, which is why credit hasn't been applied yet.
Working on fixing things now.
J
- Sun Jan 17, 2016 10:45 pm
- Forum: GPU Projects and FahCores
- Topic: Core 21 Bad Work Unit issues
- Replies: 20
- Views: 6969
Re: Core 21 Bad Work Unit issues
Sorry you're having trouble here! (And thanks to toTOW for helping debug---we've been communicating about this.) Unfortunately, the current core 21 maps a number of errors to the same "BAD_WORK_UNIT (114 = 0x72)" return value. We're working on fixing this to provide more useful information...
- Wed Dec 30, 2015 6:17 pm
- Forum: Announcements - Folding Consortium
- Topic: Core21 v0.0.17 has been released!
- Replies: 0
- Views: 2604
Core21 v0.0.17 has been released!
We rolled out Core21 v0.0.17 earlier this week. Summary of changes: * Checkpoint sanity checks are now multithreaded (when supported by hardware). When possible, the multithreaded OpenMM CPU platform is now used for sanity checks prior to checkpointing instead of the much slower single-threaded Refe...
- Mon Dec 07, 2015 6:41 pm
- Forum: Announcements - Folding Consortium
- Topic: Core21 v0.0.14 has been released!
- Replies: 0
- Views: 2093
Core21 v0.0.14 has been released!
We rolled out Core21 v0.0.14 last week. Main features are: * Bad State errors fixed: Improvements that drastically reduce the high rate of Bad State errors we were seeing with earlier versions of the core, especially with NVIDIA cards. These Bad State errors were ironically caused by a couple of bug...
- Mon Dec 07, 2015 4:53 pm
- Forum: Issues with a specific server
- Topic: Unable to download new WU
- Replies: 63
- Views: 16379
Re: Unable to download new WU
I think we may be running low on full FAH jobs since all core 21 projects were rolled back to ADV or earlier. My lab is advancing a few projects to FAH to help make sure sufficient jobs are available.
Will check in with Stanford a bit later today to make sure there are no AS/WS issues.
- Sat Dec 05, 2015 4:39 am
- Forum: Issues with a specific WU
- Topic: Core 21 failures on GTX970
- Replies: 23
- Views: 9349
Re: Core 21 failures on GTX970
> This is the TDR issue ... now, is triggered by faulty hardware or by software ?
Are you sure this is the same as the TDR? Project 11411 is not very large, so it would be surprising if one of the GPU kernels was exceeding the Windows timeout on a GTX 970.