Project: 5745 (Run 0, Clone 59, Gen 228)

Moderators: Site Moderators, FAHC Science Team

SidVicious
Posts: 30
Joined: Sun Jan 13, 2008 10:14 pm

Project: 5745 (Run 0, Clone 59, Gen 228)

Post by SidVicious »

Another broken WU: it EUE'd five times and then triggered the dreaded "Pausing 24 hours."

Code:

[21:26:37] Project: 5745 (Run 0, Clone 59, Gen 228)
[21:26:37] 
[21:26:37] Assembly optimizations on if available.
[21:26:37] Entering M.D.
[21:26:43] Tpr hash work/wudata_00.tpr:  57746033 946412100 2250985240 2869848819 1138190429
[21:26:43] Working on Protein
[21:26:44] Client config found, loading data.
[21:26:44] Starting GUI Server
[21:26:47] mdrun_gpu returned 
[21:26:47] Nonzero force sum on GPU
[21:26:47] 
[21:26:47] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:26:49] CoreStatus = 7A (122)
[21:26:49] Sending work to server
[21:26:49] Project: 5745 (Run 0, Clone 59, Gen 228)
[21:26:49] - Read packet limit of 540015616... Set to 524286976.
[21:26:49] - Error: Could not get length of results file work/wuresults_00.dat
[21:26:49] - Error: Could not read unit 00 file. Removing from queue.
[21:26:49] EUE limit exceeded. Pausing 24 hours.
I'm doing science and I'm still folding
I feel FANTASTIC and I'm still folding
While you are dying I'll still be folding
and when you're dead I'll still be folding
STILL FOLDING, still folding
Tynat
Posts: 89
Joined: Wed Feb 11, 2009 1:37 am

Re: Project: 5745 (Run 0, Clone 59, Gen 228)

Post by Tynat »

This WU had already failed four times with UNSTABLE_MACHINE before this attempt. While the job stack is being rebuilt, it's unknown whether restarting the GPU client would result in receiving the same WU again.

Code:

[16:39:31] + Received work.
[16:39:31] Trying to send all finished work units
[16:39:31] + No unsent completed units remaining.
[16:39:31] + Closed connections
[16:39:36] 
[16:39:36] + Processing work unit
[16:39:36] Core required: FahCore_11.exe
[16:39:36] Core found.
[16:39:36] Working on queue slot 01 [July 4 16:39:36 UTC]
[16:39:36] + Working ...
[16:39:36] - Calling '.\FahCore_11.exe -dir work/ -suffix 01 -checkpoint 15 -verbose -lifeline 680 -version 623'

[16:39:37] 
[16:39:37] *------------------------------*
[16:39:37] Folding@Home GPU Core - Beta
[16:39:37] Version 1.24 (Mon Feb 9 11:00:12 PST 2009)
[16:39:37] 
[16:39:37] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[16:39:37] Build host: amoeba
[16:39:37] Board Type: AMD
[16:39:37] Core      : 
[16:39:37] Preparing to commence simulation
[16:39:37] - Looking at optimizations...
[16:39:37] - Created dyn
[16:39:37] - Files status OK
[16:39:37] - Expanded 68539 -> 357580 (decompressed 521.7 percent)
[16:39:37] Called DecompressByteArray: compressed_data_size=68539 data_size=357580, decompressed_data_size=357580 diff=0
[16:39:37] - Digital signature verified
[16:39:37] 
[16:39:37] Project: 5745 (Run 0, Clone 59, Gen 228)
[16:39:37] 
[16:39:37] Assembly optimizations on if available.
[16:39:37] Entering M.D.
[16:39:43] Tpr hash work/wudata_01.tpr:  57746033 946412100 2250985240 2869848819 1138190429
[16:39:43] Working on Protein
[16:39:44] Client config found, loading data.
[16:39:44] Starting GUI Server
[16:39:54] mdrun_gpu returned 
[16:39:54] Nonzero force sum on GPU
[16:39:54] 
[16:39:54] Folding@home Core Shutdown: UNSTABLE_MACHINE
[16:39:57] CoreStatus = 7A (122)
[16:39:57] Sending work to server
[16:39:57] Project: 5745 (Run 0, Clone 59, Gen 228)
[16:39:57] - Read packet limit of 540015616... Set to 524286976.
[16:39:57] - Error: Could not get length of results file work/wuresults_01.dat
[16:39:57] - Error: Could not read unit 01 file. Removing from queue.
[16:39:57] EUE limit exceeded. Pausing 24 hours.
All clients stopped due to Stanford's upcoming September 2011 decision
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 5745 (Run 0, Clone 59, Gen 228)

Post by bruce »

The WU (P5745,R0,C59,G228) has been reported as a bad WU.
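For anyone who wants to pull the same details out of their own log before reporting, here is a minimal Python sketch. It assumes the log lines look like the excerpts posted in this thread; the names `short_id` and `summarize` are made up for illustration and are not part of the Folding@home client. It turns a "Project: ..." line into the short (P,R,C,G) form used above, counts UNSTABLE_MACHINE shutdowns per WU, and notes whether the EUE pause was reached.

```python
import re
from collections import Counter

# Pattern assumed from the log excerpts in this thread, e.g.
# "[21:26:37] Project: 5745 (Run 0, Clone 59, Gen 228)"
WU_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")


def short_id(line):
    """Return the short form 'P5745,R0,C59,G228', or None if the line names no WU."""
    m = WU_RE.search(line)
    if m is None:
        return None
    return "P{},R{},C{},G{}".format(*m.groups())


def summarize(log_lines):
    """Count UNSTABLE_MACHINE shutdowns per WU and flag the EUE pause message."""
    failures = Counter()
    current = None   # most recent WU named in the log
    paused = False
    for line in log_lines:
        wu = short_id(line)
        if wu is not None:
            current = wu
        elif "Core Shutdown: UNSTABLE_MACHINE" in line and current is not None:
            failures[current] += 1
        elif "EUE limit exceeded" in line:
            paused = True
    return failures, paused


# A few lines copied from the first log in this thread.
sample = [
    "[21:26:37] Project: 5745 (Run 0, Clone 59, Gen 228)",
    "[21:26:47] Nonzero force sum on GPU",
    "[21:26:47] Folding@home Core Shutdown: UNSTABLE_MACHINE",
    "[21:26:49] Project: 5745 (Run 0, Clone 59, Gen 228)",
    "[21:26:49] EUE limit exceeded. Pausing 24 hours.",
]
failures, paused = summarize(sample)
print(dict(failures), paused)
```

Feeding it a whole log file (for example `summarize(open(path))`, where `path` is wherever your client writes its log) should give one count per WU, which makes it easier to tell whether a single WU keeps failing or the card is unstable across several WUs.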