Search found 339 matches

by JohnChodera
Sun Jan 08, 2017 7:31 am
Forum: Issues with a specific server
Topic: 140.163.4.231 not actually downloading work
Replies: 8
Views: 2054

Re: 140.163.4.231 not actually downloading work

Thanks for the heads-up. I'm investigating.

Is this the only server you're having trouble with? I'm showing lots of WUs moving in and out of the server (plfah1-1.mskcc.org) just fine.
by JohnChodera
Thu Dec 22, 2016 7:03 pm
Forum: Issues with a specific WU
Topic: Project 10496 impossible to finish
Replies: 17
Views: 4342

Re: Project 10496 impossible to finish

Thanks for all the feedback, everyone. This is a HUGE system---much bigger than we normally simulate---and we seem to have underestimated the deadline and timeout as a result. I've temporarily set the assignment weight for this project to 0 until we can figure out (1) what an appropriate deadline/ti...
by JohnChodera
Mon Nov 21, 2016 3:38 am
Forum: Scientific discussions (Non-FAH)
Topic: XFEL - Folding could be "filmed" in the near future ?
Replies: 3
Views: 3391

Re: XFEL - Folding could be "filmed" in the near future ?

Lots of structural biologists are excited for this to come online, but I believe the first X-ray free-electron laser (XFEL) for use in biomolecular crystallography is actually at Stanford and went into service in 2010: https://portal.slac.stanford.edu/sites/ ... fault.aspx https://www6.slac.stanford...
by JohnChodera
Tue Oct 25, 2016 10:55 pm
Forum: GPU Projects and FahCores
Topic: When the core 22 will be tested?
Replies: 45
Views: 15088

Re: When the core 22 will be tested?

Thanks for your interest! We're in the very early stages of building test builds for a new GPU core 22 based on OpenMM 7.0.1, but we don't yet have a timeline other than "soon!". There are a few technical and infrastructure hurdles to straighten out first.
by JohnChodera
Sat Mar 26, 2016 12:28 am
Forum: Issues with a specific server
Topic: Cant upload the WU in the time allowed.
Replies: 52
Views: 10790

Re: Cant upload the WU in the time allowed.

I've just doubled the timeout (to 1 hour). Please report any other failed uploads and how long they took to time out. There may be a bug causing the early timeout.
by JohnChodera
Sat Mar 26, 2016 12:27 am
Forum: Issues with a specific server
Topic: Cant upload the WU in the time allowed.
Replies: 52
Views: 10790

Re: Cant upload the WU in the time allowed.

That's odd. This server is set to use a 30 minute timeout, but it looks like the upload timed out after ~13 minutes. That's really strange. Note that these servers don't answer to pings from the outside world because of firewall restrictions. We're working to try to get these lifted to make debuggin...
by JohnChodera
Sat Mar 26, 2016 12:23 am
Forum: Issues with a specific server
Topic: Cant upload the WU in the time allowed.
Replies: 52
Views: 10790

Re: Cant upload the WU in the time allowed.

Thanks everybody for alerting us to issues with the MSK servers (140.163.4.24x). Looking into this now.
by JohnChodera
Tue Feb 16, 2016 6:27 pm
Forum: Issues with a specific server
Topic: 140.163.4.242 slow WU dl
Replies: 1
Views: 1237

Re: 140.163.4.242 slow WU dl

Wow, that's *really* slow. Let me check with our networking folks to see if there is something weird going on.

John
by JohnChodera
Tue Jan 19, 2016 4:51 pm
Forum: Issues with a specific WU
Topic: No credit P10468 and P10484
Replies: 3
Views: 1251

Re: No credit P10468 and P10484

Server fixed. Points should be credited soon.

Thanks again for your help!

J
by JohnChodera
Tue Jan 19, 2016 4:27 pm
Forum: Issues with a specific WU
Topic: No credit P10468 and P10484
Replies: 3
Views: 1251

Re: No credit P10468 and P10484

So sorry about this! Many thanks for the report---I see these servers are suddenly not allowing inbound ssh connections, which is why credit hasn't been applied yet.

Working on fixing things now.

J
by JohnChodera
Sun Jan 17, 2016 10:45 pm
Forum: GPU Projects and FahCores
Topic: Core 21 Bad Work Unit issues
Replies: 20
Views: 6969

Re: Core 21 Bad Work Unit issues

Sorry you're having trouble here! (And thanks to toTOW for helping debug---we've been communicating about this.) Unfortunately, the current core 21 maps a number of errors to the same "BAD_WORK_UNIT (114 = 0x72)" return value. We're working on fixing this to provide more useful information...
by JohnChodera
Wed Dec 30, 2015 6:17 pm
Forum: Announcements - Folding Consortium
Topic: Core21 v0.0.17 has been released!
Replies: 0
Views: 2604

Core21 v0.0.17 has been released!

We rolled out Core21 v0.0.17 earlier this week. Summary of changes: * Checkpoint sanity checks are now multithreaded (when supported by hardware). When possible, the multithreaded OpenMM CPU platform is now used for sanity checks prior to checkpointing instead of the much slower single-threaded Refe...
by JohnChodera
Mon Dec 07, 2015 6:41 pm
Forum: Announcements - Folding Consortium
Topic: Core21 v0.0.14 has been released!
Replies: 0
Views: 2093

Core21 v0.0.14 has been released!

We rolled out Core21 v0.0.14 last week. Main features are: * Bad State errors fixed: Improvements that drastically reduce the high rate of Bad State errors we were seeing with earlier versions of the core, especially with NVIDIA cards. These Bad State errors were ironically caused by a couple of bug...
by JohnChodera
Mon Dec 07, 2015 4:53 pm
Forum: Issues with a specific server
Topic: Unable to download new WU
Replies: 63
Views: 16379

Re: Unable to download new WU

I think we may be running low on full FAH jobs since all core 21 projects were rolled back to ADV or earlier. My lab is advancing a few projects to FAH to help make sure sufficient jobs are available.

Will check in with Stanford a bit later today to make sure there are no AS/WS issues.
by JohnChodera
Sat Dec 05, 2015 4:39 am
Forum: Issues with a specific WU
Topic: Core 21 failures on GTX970
Replies: 23
Views: 9349

Re: Core 21 failures on GTX970

> This is the TDR issue ... now, is triggered by faulty hardware or by software ?

Are you sure this is the same as the TDR? Project 11411 is not very large, so it would be surprising if one of the GPU kernels was exceeding the Windows timeout on a GTX 970.