Project: 2665 (Run 2, Clone 464, Gen 60) [ERROR 0x7b]

Posted: Tue Nov 04, 2008 5:19 pm
by ei57
This WU EUEs after reaching 40%. It has done so 3 times.

Code:

[15:24:29] Project: 2665 (Run 2, Clone 464, Gen 60)
[15:24:29] 
[15:24:33] Entering M.D.
[15:24:40] Calling FAH init
[15:24:42] Read topology
[15:24:42] s
[15:24:42] Writing local files
[15:24:42] Completed 100000 out of 250000 steps  (40 percent)
[15:24:42] ns
[15:24:42] Writing local files
[15:24:42] Completed 100000 out of 250000 steps  (40 percent)
[15:24:51] Extra SSE boost OK.
[15:38:37] Gromacs cannot- Writing 94027 bytes of core data to disk...
[15:38:37]   ... Done.
[15:38:37] - Failed to delete work/wudata_05.sas
[15:38:37] - Failed to delete work/wudata_05.goe
[15:38:37] - Failed to delete work/wudata_05.pdo
[15:38:37] Warning:  check for stray files
[15:38:37] goe
[15:38:37] Warning:  check for stray files
[15:38:37] NIT_END
[15:38:37] Finalizing output
[15:40:37] n: EARLY_UNIT_END
[15:40:37] 
[15:40:37] Folding@home Core Shutdown: EARLY_UNIT_END
[15:40:41] CoreStatus = 7B (123)
[15:40:41] Client-core communications error: ERROR 0x7b
[15:40:41] Deleting current work unit & continuing...
[15:40:41] Using generic mpiexec calls
[15:42:45] - Warning: Could not delete all work unit files (5): Core returned invalid code

Re: 2665 (Run 2, Clone 464, Gen 60)

Posted: Tue Nov 04, 2008 7:06 pm
by toTOW
Upgrade your client to 6.23 ...

There is no data for this WU in the DB yet.

Re: Project: 2665 (Run 2, Clone 464, Gen 60) [ERROR 0x7b]

Posted: Wed Nov 05, 2008 8:28 am
by 7im
That will handle the EUE better, but it won't stop the problem of a networking change from causing the ERROR 0x7b error.

Re: Project: 2665 (Run 2, Clone 464, Gen 60) [ERROR 0x7b]

Posted: Wed Nov 05, 2008 8:52 pm
by ei57
7im wrote:That will handle the EUE better, but it won't stop the problem of a networking change from causing the ERROR 0x7b error.
What is the probability that a network change occurs after the WU has reached 40% completion and before it is 41% complete? Once OK, but three times? And shouldn't the loopback solve the network issues?
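
To put a rough number on that question, here is a minimal sketch. It assumes (hypothetically) that an unrelated event such as a network change is equally likely to strike at any point in the WU, and that the three runs are independent. The function name and the uniform model are illustrative only, not anything taken from the client or the project.

```python
def chance_all_in_window(window_fraction: float, failures: int) -> float:
    """Probability that `failures` independent, uniformly timed events
    all land inside the same window covering `window_fraction` of the WU.

    Assumption: uniform timing and independence between runs.
    """
    return window_fraction ** failures


# One failure landing between 40% and 41% (a 1% window):
once = chance_all_in_window(0.01, 1)

# Three separate runs all failing inside that same 1% window:
three_times = chance_all_in_window(0.01, 3)

print(f"once: {once:.2%}, three times: {three_times:.6%}")
```

Under these assumptions three hits in the same 1% window would be about a one-in-a-million coincidence, which is why a failure that repeats at the same step looks tied to the WU rather than to random network timing.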

The 2665 WUs have led me to develop a habit of backing up the work directory and some additional files. They have a tendency to EUE, a behaviour I have not observed in other SMP WUs, so the backup has been useful: instead of starting from scratch, I restore, restart, and usually the WU completes. This one got to 40%, period. I should add that my experience with "other SMP WUs" may be too limited to be statistically valid.
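
The backup-and-restore habit described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names `backup_work` and `restore_work` are hypothetical, the "additional files" are left as a parameter because the post does not name them, and it assumes the client is stopped while files are copied.

```python
import shutil
import time
from pathlib import Path


def backup_work(client_dir, backup_root, extra_files=()):
    """Copy client_dir/work plus any named extra files into a new
    timestamped folder under backup_root, and return that folder."""
    client_dir = Path(client_dir)
    dest = Path(backup_root) / time.strftime("backup-%Y%m%d-%H%M%S")
    dest.mkdir(parents=True)
    shutil.copytree(client_dir / "work", dest / "work")
    for name in extra_files:  # assumption: caller decides which files matter
        shutil.copy2(client_dir / name, dest / name)
    return dest


def restore_work(backup_dir, client_dir, extra_files=()):
    """Replace client_dir/work (and the extra files) with the backed-up copies."""
    backup_dir, client_dir = Path(backup_dir), Path(client_dir)
    work = client_dir / "work"
    if work.exists():
        shutil.rmtree(work)  # discard the failed state before restoring
    shutil.copytree(backup_dir / "work", work)
    for name in extra_files:
        shutil.copy2(backup_dir / name, client_dir / name)
```

A typical cycle would be to call `backup_work(...)` at some checkpoint, and after an EUE call `restore_work(...)` with the returned folder before restarting the client.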

I have updated the client, so in the future, I should expect to move on to a new WU after an EUE?