Project: 5755 (Run 10, Clone 195, Gen 12)

Moderators: Site Moderators, FAHC Science Team

Xilikon
Posts: 155
Joined: Sun Dec 02, 2007 1:34 pm

Post by Xilikon »

Got a unit which EUEd non-stop until the client stopped. Since it cannot upload the partial results, the server is never notified:

Code:

[00:55:46] Board Type: Nvidia
[00:55:46] Core      : 
[00:55:46] Preparing to commence simulation
[00:55:46] - Looking at optimizations...
[00:55:46] - Created dyn
[00:55:46] - Files status OK
[00:55:46] - Expanded 96525 -> 489240 (decompressed 506.8 percent)
[00:55:46] Called DecompressByteArray: compressed_data_size=96525 data_size=489240, decompressed_data_size=489240 diff=0
[00:55:46] - Digital signature verified
[00:55:46] 
[00:55:46] Project: 5755 (Run 10, Clone 195, Gen 12)
[00:55:46] 
[00:55:46] Assembly optimizations on if available.
[00:55:46] Entering M.D.
[00:55:52] Working on Protein
[00:55:56] Client config found, loading data.
[00:55:56] Starting GUI Server
[00:57:46] Completed 1%
[00:57:46] mdrun_gpu returned 
[00:57:46] NANs detected on GPU
[00:57:46] 
[00:57:46] Folding@home Core Shutdown: UNSTABLE_MACHINE
[00:57:50] CoreStatus = 7A (122)
[00:57:50] Sending work to server
[00:57:50] Project: 5755 (Run 10, Clone 195, Gen 12)
[00:57:50] - Read packet limit of 540015616... Set to 524286976.
[00:57:50] - Error: Could not get length of results file work/wuresults_07.dat
[00:57:50] - Error: Could not read unit 07 file. Removing from queue.
[00:57:50] Trying to send all finished work units
[00:57:50] + No unsent completed units remaining.
[00:57:50] - Preparing to get new work unit...
[00:57:50] + Attempting to get work packet
[00:57:50] - Will indicate memory of 2046 MB
[00:57:50] - Connecting to assignment server
[00:57:50] Connecting to http://assign-GPU.stanford.edu:8080/
[00:57:50] Posted data.
[00:57:50] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[00:57:50] + News From Folding@Home: GPU folding beta
[00:57:51] Loaded queue successfully.
[00:57:51] Connecting to http://171.67.108.11:8080/
[00:57:51] Posted data.
[00:57:51] Initial: 0000; - Receiving payload (expected size: 97037)
[00:57:52] - Downloaded at ~94 kB/s
[00:57:52] - Averaged speed for that direction ~89 kB/s
[00:57:52] + Received work.
[00:57:52] Trying to send all finished work units
[00:57:52] + No unsent completed units remaining.
[00:57:52] + Closed connections
[00:57:57] 
[00:57:57] + Processing work unit
[00:57:57] Core required: FahCore_11.exe
[00:57:57] Core found.
[00:57:57] Working on queue slot 08 [January 14 00:57:57 UTC]
[00:57:57] + Working ...
[00:57:57] - Calling '.\FahCore_11.exe -dir work/ -suffix 08 -priority 96 -nocpulock -checkpoint 15 -verbose -lifeline 828 -version 623'

[00:57:57] 
[00:57:57] *------------------------------*
[00:57:57] Folding@Home GPU Core - Beta
[00:57:57] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[00:57:57] 
[00:57:57] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[00:57:57] Build host: amoeba
[00:57:57] Board Type: Nvidia
[00:57:57] Core      : 
[00:57:57] Preparing to commence simulation
[00:57:57] - Looking at optimizations...
[00:57:57] - Created dyn
[00:57:57] - Files status OK
[00:57:57] - Expanded 96525 -> 489240 (decompressed 506.8 percent)
[00:57:57] Called DecompressByteArray: compressed_data_size=96525 data_size=489240, decompressed_data_size=489240 diff=0
[00:57:57] - Digital signature verified
[00:57:57] 
[00:57:57] Project: 5755 (Run 10, Clone 195, Gen 12)
[00:57:57] 
[00:57:57] Assembly optimizations on if available.
[00:57:57] Entering M.D.
[00:58:03] Working on Protein
[00:58:07] Client config found, loading data.
[00:58:07] Starting GUI Server
[00:59:57] Completed 1%
[00:59:57] mdrun_gpu returned 
[00:59:57] NANs detected on GPU
[00:59:57] 
[00:59:57] Folding@home Core Shutdown: UNSTABLE_MACHINE
[01:00:01] CoreStatus = 7A (122)
[01:00:01] Sending work to server
[01:00:01] Project: 5755 (Run 10, Clone 195, Gen 12)
[01:00:01] - Read packet limit of 540015616... Set to 524286976.
[01:00:01] - Error: Could not get length of results file work/wuresults_08.dat
[01:00:01] - Error: Could not read unit 08 file. Removing from queue.
[01:00:01] EUE limit exceeded. Pausing 24 hours.
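
Because the failed unit is never reported back, the repeat failures are only visible in the local log. A minimal sketch of how one might count them per work unit, assuming the log line formats shown in the excerpt above; the function name `count_unstable_shutdowns` and the regular expressions are illustrative and not part of the Folding@home client:

```python
import re
from collections import Counter

# Patterns assumed from the log excerpt above (hypothetical helpers, not client code).
PRCG_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
SHUTDOWN_RE = re.compile(r"Folding@home Core Shutdown: (\w+)")


def count_unstable_shutdowns(log_text):
    """Count UNSTABLE_MACHINE shutdowns per (project, run, clone, gen)."""
    counts = Counter()
    current = None  # most recently announced work unit
    for line in log_text.splitlines():
        prcg = PRCG_RE.search(line)
        if prcg:
            current = tuple(int(g) for g in prcg.groups())
            continue
        shutdown = SHUTDOWN_RE.search(line)
        if shutdown and shutdown.group(1) == "UNSTABLE_MACHINE" and current is not None:
            counts[current] += 1
    return counts


sample = """\
[00:55:46] Project: 5755 (Run 10, Clone 195, Gen 12)
[00:57:46] NANs detected on GPU
[00:57:46] Folding@home Core Shutdown: UNSTABLE_MACHINE
[00:57:50] Project: 5755 (Run 10, Clone 195, Gen 12)
[00:57:57] Project: 5755 (Run 10, Clone 195, Gen 12)
[00:59:57] Folding@home Core Shutdown: UNSTABLE_MACHINE
[01:00:01] EUE limit exceeded. Pausing 24 hours.
"""

counts = count_unstable_shutdowns(sample)
for prcg, n in counts.items():
    print("Project %d (Run %d, Clone %d, Gen %d):" % prcg, n, "unstable shutdowns")
```

A work unit that shows up several times in a row here, as this one does, is a reasonable candidate for reporting on the forum.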
toTOW
Site Moderator
Posts: 6433
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Project: 5755 (Run 10, Clone 195, Gen 12)

Post by toTOW »

There are 9 reports of immediate failure in the DB ... I've marked the WU as bad.

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
Birdman86
Posts: 7
Joined: Sun Oct 05, 2008 10:13 pm

Re: Project: 5755 (Run 10, Clone 195, Gen 12)

Post by Birdman86 »

This WU is stopping clients again, and it hit my GeForce 8800 GT while I had left it folding unattended over the weekend. :(
So this WU hasn't been removed yet, but my error message seems to be different and I can't get 1% done. So maybe someone has tried to repair the WU, but not correctly.

Maybe I just have bad luck, since I've caught two bad WUs with my two GPUs in a week.
This time it took 21 hours before I got home and could restart the client. Since the restart, the GPU has been folding other WUs without problems.
Could this be caused by my old 6.20 clients?

Code:

[21:28:54] Completed 96%
[21:30:11] Completed 97%
[21:31:27] Completed 98%
[21:32:45] Completed 99%
[21:34:01] Completed 100%
[21:34:02] Successful run
[21:34:02] DynamicWrapper: Finished Work Unit: sleep=10000
[21:34:12] Reserved 78792 bytes for xtc file; Cosm status=0
[21:34:12] Allocated 78792 bytes for xtc file
[21:34:12] - Reading up to 78792 from "work/wudata_01.xtc": Read 78792
[21:34:12] Read 78792 bytes from xtc file; available packet space=786351672
[21:34:12] xtc file hash check passed.
[21:34:12] Reserved 23472 23472 786351672 bytes for arc file=<work/wudata_01.trr> Cosm status=0
[21:34:12] Allocated 23472 bytes for arc file
[21:34:12] - Reading up to 23472 from "work/wudata_01.trr": Read 23472
[21:34:12] Read 23472 bytes from arc file; available packet space=786328200
[21:34:12] trr file hash check passed.
[21:34:12] Allocated 560 bytes for edr file
[21:34:12] Read bedfile
[21:34:12] edr file hash check passed.
[21:34:12] Allocated 10712 bytes for logfile
[21:34:12] Read logfile
[21:34:12] GuardedRun: success in DynamicWrapper
[21:34:12] GuardedRun: done
[21:34:12] Run: GuardedRun completed.
[21:34:14] - Writing 114048 bytes of core data to disk...
[21:34:14] Done: 113536 -> 107458 (compressed to 94.6 percent)
[21:34:14]   ... Done.
[21:34:14] - Shutting down core 
[21:34:14] 
[21:34:14] Folding@home Core Shutdown: FINISHED_UNIT
[21:34:17] CoreStatus = 64 (100)
[21:34:17] Sending work to server
[21:34:17] Project: 5762 (Run 12, Clone 7, Gen 151)
[21:34:17] - Read packet limit of 540015616... Set to 524286976.


[21:34:17] + Attempting to send results [January 17 21:34:17 UTC]
[21:34:21] + Results successfully sent
[21:34:21] Thank you for your contribution to Folding@Home.
[21:34:21] + Number of Units Completed: 916

[21:34:25] - Preparing to get new work unit...
[21:34:25] + Attempting to get work packet
[21:34:25] - Connecting to assignment server
[21:34:26] - Successful: assigned to (171.67.108.11).
[21:34:26] + News From Folding@Home: GPU folding beta
[21:34:26] Loaded queue successfully.
[21:34:29] + Closed connections
[21:34:29] 
[21:34:29] + Processing work unit
[21:34:29] Core required: FahCore_11.exe
[21:34:29] Core found.
[21:34:29] Working on queue slot 02 [January 17 21:34:29 UTC]
[21:34:29] + Working ...
[21:34:29] 
[21:34:29] *------------------------------*
[21:34:29] Folding@Home GPU Core - Beta
[21:34:29] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[21:34:29] 
[21:34:29] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[21:34:29] Build host: amoeba
[21:34:29] Board Type: Nvidia
[21:34:29] Core      : 
[21:34:29] Preparing to commence simulation
[21:34:29] - Looking at optimizations...
[21:34:29] - Created dyn
[21:34:29] - Files status OK
[21:34:29] - Expanded 96525 -> 489240 (decompressed 506.8 percent)
[21:34:29] Called DecompressByteArray: compressed_data_size=96525 data_size=489240, decompressed_data_size=489240 diff=0
[21:34:29] - Digital signature verified
[21:34:29] 
[21:34:29] Project: 5755 (Run 10, Clone 195, Gen 12)
[21:34:29] 
[21:34:29] Assembly optimizations on if available.
[21:34:29] Entering M.D.
[21:34:36] Working on Protein
[21:34:39] Client config found, loading data.
[21:34:40] Starting GUI Server
[21:34:40] mdrun_gpu returned 
[21:34:40] SHAKE violations on GPU
[21:34:40] 
[21:34:40] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:34:43] CoreStatus = 7A (122)
[21:34:43] Sending work to server
[21:34:43] Project: 5755 (Run 10, Clone 195, Gen 12)
[21:34:43] - Read packet limit of 540015616... Set to 524286976.
[21:34:43] - Error: Could not get length of results file work/wuresults_02.dat
[21:34:43] - Error: Could not read unit 02 file. Removing from queue.
[21:34:43] - Preparing to get new work unit...
[21:34:43] + Attempting to get work packet
[21:34:43] - Connecting to assignment server
[21:34:44] - Successful: assigned to (171.67.108.11).
[21:34:44] + News From Folding@Home: GPU folding beta
[21:34:44] Loaded queue successfully.
[21:34:46] + Closed connections
[21:34:51] 
[21:34:51] + Processing work unit
[21:34:51] Core required: FahCore_11.exe
[21:34:51] Core found.
[21:34:51] Working on queue slot 03 [January 17 21:34:51 UTC]
[21:34:51] + Working ...
[21:34:52] 
[21:34:52] *------------------------------*
[21:34:52] Folding@Home GPU Core - Beta
[21:34:52] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[21:34:52] 
[21:34:52] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[21:34:52] Build host: amoeba
[21:34:52] Board Type: Nvidia
[21:34:52] Core      : 
[21:34:52] Preparing to commence simulation
[21:34:52] - Looking at optimizations...
[21:34:52] - Created dyn
[21:34:52] - Files status OK
[21:34:52] - Expanded 96525 -> 489240 (decompressed 506.8 percent)
[21:34:52] Called DecompressByteArray: compressed_data_size=96525 data_size=489240, decompressed_data_size=489240 diff=0
[21:34:52] - Digital signature verified
[21:34:52] 
[21:34:52] Project: 5755 (Run 10, Clone 195, Gen 12)
[21:34:52] 
[21:34:52] Assembly optimizations on if available.
[21:34:52] Entering M.D.
[21:34:59] Working on Protein
[21:35:03] Client config found, loading data.
[21:35:03] Starting GUI Server
[21:35:03] mdrun_gpu returned 
[21:35:03] SHAKE violations on GPU
[21:35:03] 
[21:35:03] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:35:06] CoreStatus = 7A (122)
[21:35:06] Sending work to server
[21:35:06] Project: 5755 (Run 10, Clone 195, Gen 12)
[21:35:06] - Read packet limit of 540015616... Set to 524286976.
[21:35:06] - Error: Could not get length of results file work/wuresults_03.dat
[21:35:06] - Error: Could not read unit 03 file. Removing from queue.
[21:35:06] - Preparing to get new work unit...
[21:35:06] + Attempting to get work packet
[21:35:06] - Connecting to assignment server
[21:35:07] - Successful: assigned to (171.67.108.11).
[21:35:07] + News From Folding@Home: GPU folding beta
[21:35:07] Loaded queue successfully.
[21:35:10] + Closed connections
[21:35:15] 
[21:35:15] + Processing work unit
[21:35:15] Core required: FahCore_11.exe
[21:35:15] Core found.
[21:35:15] Working on queue slot 04 [January 17 21:35:15 UTC]
[21:35:15] + Working ...
[21:35:15] 
[21:35:15] *------------------------------*
[21:35:15] Folding@Home GPU Core - Beta
[21:35:15] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[21:35:15] 
[21:35:15] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[21:35:15] Build host: amoeba
[21:35:15] Board Type: Nvidia
[21:35:15] Core      : 
[21:35:15] Preparing to commence simulation
[21:35:15] - Looking at optimizations...
[21:35:15] - Created dyn
[21:35:15] - Files status OK
[21:35:15] - Expanded 96525 -> 489240 (decompressed 506.8 percent)
[21:35:15] Called DecompressByteArray: compressed_data_size=96525 data_size=489240, decompressed_data_size=489240 diff=0
[21:35:15] - Digital signature verified
[21:35:15] 
[21:35:15] Project: 5755 (Run 10, Clone 195, Gen 12)
[21:35:15] 
[21:35:15] Assembly optimizations on if available.
[21:35:15] Entering M.D.
[21:35:22] Working on Protein
[21:35:25] Client config found, loading data.
[21:35:25] Starting GUI Server
[21:35:26] mdrun_gpu returned 
[21:35:26] SHAKE violations on GPU
[21:35:26] 
[21:35:26] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:35:29] CoreStatus = 7A (122)
[21:35:29] Sending work to server
[21:35:29] Project: 5755 (Run 10, Clone 195, Gen 12)
[21:35:29] - Read packet limit of 540015616... Set to 524286976.
[21:35:29] - Error: Could not get length of results file work/wuresults_04.dat
[21:35:29] - Error: Could not read unit 04 file. Removing from queue.
[21:35:29] - Preparing to get new work unit...
[21:35:29] + Attempting to get work packet
[21:35:29] - Connecting to assignment server
[21:35:30] - Successful: assigned to (171.67.108.11).
[21:35:30] + News From Folding@Home: GPU folding beta
[21:35:30] Loaded queue successfully.
[21:35:33] + Closed connections
[21:35:38] 
[21:35:38] + Processing work unit
[21:35:38] Core required: FahCore_11.exe
[21:35:38] Core found.
[21:35:38] Working on queue slot 05 [January 17 21:35:38 UTC]
[21:35:38] + Working ...
[21:35:38] 
[21:35:38] *------------------------------*
[21:35:38] Folding@Home GPU Core - Beta
[21:35:38] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[21:35:38] 
[21:35:38] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[21:35:38] Build host: amoeba
[21:35:38] Board Type: Nvidia
[21:35:38] Core      : 
[21:35:38] Preparing to commence simulation
[21:35:38] - Looking at optimizations...
[21:35:38] - Created dyn
[21:35:38] - Files status OK
[21:35:38] - Expanded 96525 -> 489240 (decompressed 506.8 percent)
[21:35:38] Called DecompressByteArray: compressed_data_size=96525 data_size=489240, decompressed_data_size=489240 diff=0
[21:35:38] - Digital signature verified
[21:35:38] 
[21:35:38] Project: 5755 (Run 10, Clone 195, Gen 12)
[21:35:38] 
[21:35:38] Assembly optimizations on if available.
[21:35:38] Entering M.D.
[21:35:46] Working on Protein
[21:35:49] Client config found, loading data.
[21:35:49] Starting GUI Server
[21:35:49] mdrun_gpu returned 
[21:35:49] SHAKE violations on GPU
[21:35:49] 
[21:35:49] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:35:52] CoreStatus = 7A (122)
[21:35:52] Sending work to server
[21:35:52] Project: 5755 (Run 10, Clone 195, Gen 12)
[21:35:52] - Read packet limit of 540015616... Set to 524286976.
[21:35:52] - Error: Could not get length of results file work/wuresults_05.dat
[21:35:52] - Error: Could not read unit 05 file. Removing from queue.
[21:35:52] - Preparing to get new work unit...
[21:35:52] + Attempting to get work packet
[21:35:52] - Connecting to assignment server
[21:35:53] - Successful: assigned to (171.67.108.11).
[21:35:53] + News From Folding@Home: GPU folding beta
[21:35:53] Loaded queue successfully.
[21:35:56] + Closed connections
[21:36:01] 
[21:36:01] + Processing work unit
[21:36:01] Core required: FahCore_11.exe
[21:36:01] Core found.
[21:36:01] Working on queue slot 06 [January 17 21:36:01 UTC]
[21:36:01] + Working ...
[21:36:01] 
[21:36:01] *------------------------------*
[21:36:01] Folding@Home GPU Core - Beta
[21:36:01] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[21:36:01] 
[21:36:01] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[21:36:01] Build host: amoeba
[21:36:01] Board Type: Nvidia
[21:36:01] Core      : 
[21:36:01] Preparing to commence simulation
[21:36:01] - Looking at optimizations...
[21:36:01] - Created dyn
[21:36:01] - Files status OK
[21:36:01] - Expanded 96525 -> 489240 (decompressed 506.8 percent)
[21:36:01] Called DecompressByteArray: compressed_data_size=96525 data_size=489240, decompressed_data_size=489240 diff=0
[21:36:01] - Digital signature verified
[21:36:01] 
[21:36:01] Project: 5755 (Run 10, Clone 195, Gen 12)
[21:36:01] 
[21:36:01] Assembly optimizations on if available.
[21:36:01] Entering M.D.
[21:36:09] Working on Protein
[21:36:12] Client config found, loading data.
[21:36:12] Starting GUI Server
[21:36:12] mdrun_gpu returned 
[21:36:12] SHAKE violations on GPU
[21:36:12] 
[21:36:12] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:36:15] CoreStatus = 7A (122)
[21:36:15] Sending work to server
[21:36:15] Project: 5755 (Run 10, Clone 195, Gen 12)
[21:36:15] - Read packet limit of 540015616... Set to 524286976.
[21:36:15] - Error: Could not get length of results file work/wuresults_06.dat
[21:36:15] - Error: Could not read unit 06 file. Removing from queue.
[21:36:15] EUE limit exceeded. Pausing 24 hours.
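
The two logs in this thread fail differently: one reports NANs after about two minutes, the other reports SHAKE violations within seconds. A rough sketch of how the failure reason and the time from "Entering M.D." to the core shutdown could be pulled out of such a log, assuming the line layout shown above; `failure_summaries` is a hypothetical helper, not part of the client:

```python
import re
from datetime import datetime, timedelta

# Timestamped line layout assumed from the excerpts: "[HH:MM:SS] message"
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] ?(.*)$")


def failure_summaries(log_text):
    """Return (reason, seconds since 'Entering M.D.') for each unstable shutdown."""
    results = []
    started = None         # timestamp of the last "Entering M.D." line
    reason = None          # message printed right after "mdrun_gpu returned"
    expect_reason = False
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        msg = match.group(2).strip()
        if msg == "Entering M.D.":
            started, reason, expect_reason = stamp, None, False
        elif msg.startswith("mdrun_gpu returned"):
            expect_reason = True
        elif expect_reason:
            reason, expect_reason = msg, False
        elif msg == "Folding@home Core Shutdown: UNSTABLE_MACHINE" and started is not None:
            elapsed = stamp - started
            if elapsed < timedelta(0):
                # The log has no date, so assume the run crossed midnight once.
                elapsed += timedelta(days=1)
            results.append((reason, int(elapsed.total_seconds())))
            started = None
    return results


sample = """\
[00:55:46] Entering M.D.
[00:57:46] Completed 1%
[00:57:46] mdrun_gpu returned
[00:57:46] NANs detected on GPU
[00:57:46]
[00:57:46] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:34:29] Entering M.D.
[21:34:40] mdrun_gpu returned
[21:34:40] SHAKE violations on GPU
[21:34:40]
[21:34:40] Folding@home Core Shutdown: UNSTABLE_MACHINE
"""

summaries = failure_summaries(sample)
for why, seconds in summaries:
    print(why, "after", seconds, "s")
```

Including the reason and the elapsed time in a bad-WU report makes it easier to see whether different cards are failing in the same way.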
klasseng
Posts: 126
Joined: Thu Dec 27, 2007 6:08 am
Hardware configuration: System 1: Mac Studio, M1 Max,
System 2: Mac Mini, M2
Location: Canada

Re: Project: 5755 (Run 10, Clone 195, Gen 12)

Post by klasseng »

Project 5755 (Run 10, Clone 195, Gen 12) also crashes out on my 9600GSO, running the v6.32 client.

It's STILL BAD . . . 6 days later.
peace,
Grant