Thank you for your beta test report regarding this persistent bug. Having three A1 cores still running isn't right, but so far nobody knows what causes it or how to prevent it. Anything else that you discover may be the clue that allows the cause to be determined and a remedy to be programmed.
My theory is that after a WU reaches 100% and the work unit is finished, MPI is supposed to shut down all four copies of the core and return control to the client. For some reason not all cores shut down. (In your case, one probably terminated normally and the other three continued to run in spite of being told to quit.) I do not believe this has anything to do with the specific WU being processed, so the title of this thread may be inappropriate since the generic problem is that FAH hung after reaching 100%. Either way it's difficult to know.
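If it helps anyone check this theory on their own machine, here is a rough sketch that counts how many copies of the core are still in the process list. It is only an illustration, not part of the client: the name `FahCore_a1` is my assumption about how the A1 core shows up in the process list, the function names are made up for this post, and the `ps` call assumes a POSIX system (Linux or OS X).

```python
import subprocess

# Assumption: the core's process name contains this string. Adjust it to
# whatever your own process list actually shows.
CORE_NAME = "FahCore_a1"


def count_cores(process_names, core_name=CORE_NAME):
    """Count process names that contain core_name (case-insensitive)."""
    needle = core_name.lower()
    return sum(1 for name in process_names if needle in name.lower())


def running_process_names():
    """Return the command names of all processes, using POSIX `ps`."""
    result = subprocess.run(
        ["ps", "-A", "-o", "comm="],  # "comm=" prints names with no header
        capture_output=True, text=True, check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

Calling `count_cores(running_process_names())` a minute or so after the log shows 100% should give 0 if every copy shut down. A leftover count such as 3 would match what was reported here.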
Aardvark:
* Have you tried downloading the newest client (6.22 R3) from viewtopic.php?f=46&t=4913? This contains a fix that might help with this problem.
* I notice you've used the -pause parameter. I don't know whether anybody has tested that parameter thoroughly yet. Does the same problem occur if you remove it?
Project: 2669 (R9, C106, G0) [Hang before sending]
Moderators: Site Moderators, FAHC Science Team
Re: Project: 2669 (R9, C106, G0) [Hang before sending]
I'm going to lock this thread and merge the essential discussion into the thread referenced above. The posts may no longer be close to each other, but they'll retain their titles and still be in chronological order. Please continue the discussion there.
Posting FAH's log:
How to provide enough info to get helpful support.