Project: 2665 (Run 3, Clone 792, Gen 37)

Moderators: Site Moderators, FAHC Science Team

ElectricVehicle
Posts: 157
Joined: Fri Feb 01, 2008 6:41 pm

Post by ElectricVehicle »

I just got the "Warning: Long 1-4 interactions" error on Project: 2665 (Run 3, Clone 792, Gen 37) at 6 percent. Then it just sat there for hours chewing CPU, making no progress but not crashing either.

Makes me wonder if there is any value in cataloging which Project: 2665 WUs fail, by (Run, Clone, Gen), along with the percent at which they fail?

Or some other way to know when to dump these units or let them run their course in case they actually are still doing science.
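To make the cataloging idea concrete, here is a rough Python sketch. It only assumes the WU identifier format seen in this thread's title; the function names and the catalog layout (a dict mapping each WU to the list of percents where it failed) are made up for illustration and are not anything the client or Stanford actually provides.

```python
import re
from collections import defaultdict

# Matches identifiers written like "Project: 2665 (Run 3, Clone 792, Gen 37)".
WU_PATTERN = re.compile(
    r"Project:\s*(\d+)\s*\(Run\s*(\d+),\s*Clone\s*(\d+),\s*Gen\s*(\d+)\)"
)


def parse_wu(text):
    """Return (project, run, clone, gen) as ints, or None if no identifier is found."""
    match = WU_PATTERN.search(text)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def record_failure(catalog, text, percent):
    """Add one failure report to a hypothetical catalog: WU tuple -> list of percents."""
    wu = parse_wu(text)
    if wu is None:
        raise ValueError("no WU identifier found in: %r" % text)
    catalog[wu].append(percent)
    return wu


# Example: the failure reported in this post (stalled at 6 percent).
catalog = defaultdict(list)
record_failure(catalog, "Project: 2665 (Run 3, Clone 792, Gen 37)", 6)
```

With enough reports, a catalog like this would show whether the same (Run, Clone, Gen) keeps failing at the same percent or whether the failures are scattered.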

I guess for me, if the unit normally does 20 min/% and then I see the "Warning: Long 1-4 interactions" error, and it has been working for over an hour without progressing another percent, that's when I declare it dead and purge the presumed bad WU to get another one. The catch is that purging (for example, deleting the work folder and queue.dat) may mean Stanford never gets any results from the WU.

Maybe when this warning pops up, the client should start a dead man's timer for the unit, shoot the unit dead if the timer expires before any new progress, and then send a report back to Stanford automatically. Purging units manually solves our immediate folding problem if we're watching the client, but it may be valuable for Stanford to see that the WU was bad so they can learn how to improve the science simulation.
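Here is a minimal Python sketch of what I mean by a dead man's timer. Everything in it is an assumption for illustration: the class name, the method names, and the stall factor of 3 (three times the normal time per percent, which for a 20 min/% unit gives the one-hour cutoff I use by hand). It does not touch any real client API; actually stopping the unit and reporting to Stanford is left out.

```python
import time


class DeadMansTimer:
    """Hypothetical watchdog: armed by the warning, disarmed by new progress."""

    def __init__(self, normal_seconds_per_percent, stall_factor=3.0,
                 clock=time.monotonic):
        # Allow stall_factor times the normal per-percent time before giving up.
        self.timeout = normal_seconds_per_percent * stall_factor
        self.clock = clock
        self.armed_at = None
        self.armed_percent = None

    def on_warning(self, percent):
        """Arm the timer the first time the warning is seen."""
        if self.armed_at is None:
            self.armed_at = self.clock()
            self.armed_percent = percent

    def on_progress(self, percent):
        """Disarm once the unit advances past the percent where it was armed."""
        if self.armed_at is not None and percent > self.armed_percent:
            self.armed_at = None
            self.armed_percent = None

    def expired(self):
        """True if the timer is armed and the timeout passed with no progress."""
        if self.armed_at is None:
            return False
        return self.clock() - self.armed_at > self.timeout


# Example: a unit that normally does 20 min/% and warned at 6 percent.
timer = DeadMansTimer(normal_seconds_per_percent=20 * 60)
timer.on_warning(6)
```

The client would call `on_progress` whenever the percent changes and check `expired()` periodically; only when it returns True would the unit be declared dead and reported.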

I'm just thinking that with no feedback to Stanford on bad WUs, the issue could be underreported and underestimated, delaying any necessary fixes or changes. Also, the client should deal with these automatically in a reasonable way and not sit burning CPU for hours on end on a WU that is detectably bad. If by no other means, a bad unit can be detected by it taking too long, checked with a dead man's timer.
Fold On! (with 100% Renewable, 0 Carbon electricity) ElectricVehicle EV1, RAV4 EV, LEAF, Bolt EV, Volt, M3, s4 Simulator
MarkAGr
Posts: 1
Joined: Mon Nov 16, 2009 6:03 am

Re: Long 1-4 interactions error on 2665

Post by MarkAGr »

Looks like an old thread is re-starting...

A year on ... and quantum theory strikes again :)

49% in and "ping" - bubble burst.
No overclocking going on here. Wonderfully ventilated case and CPU < 50C. Gotta be the physics.

:roll:
Mark