
Project: 10633 (Run 96, Clone 5, Gen 12)

Posted: Mon Jul 19, 2010 1:03 pm
by JimF
This may not be a problem with the work unit at all, but I am getting NAN and EUE errors, which I have never gotten on GPU3 before. Before I start mucking around with my hardware (dedicated GT240 which runs cool, below 60 C in a cool basement), I am wondering if others are seeing this. The only time I have ever gotten EUEs before was when I was setting up the PC and had not put dummy plugs in yet. I am running WinXP on a quad-core with a P45 chipset and everything has been solid for months.

Code:

[03:58:18] Folding@Home GPU Core -- Beta
[03:58:18] Version 2.09 (Thu May 20 11:51:02 PDT 2010)
[03:58:18] 
[03:58:18] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 for 80x86 
[03:58:18] Build host: amoeba
[03:58:18] Board Type: Nvidia
[03:58:18] Core      : 
[03:58:18] Preparing to commence simulation
[03:58:18] - Looking at optimizations...
[03:58:18] DeleteFrameFiles: successfully deleted file=work/wudata_07.ckp
[03:58:18] - Created dyn
[03:58:18] - Files status OK
[03:58:18] sizeof(CORE_PACKET_HDR) = 512 file=<>
[03:58:18] - Expanded 28909 -> 163067 (decompressed 564.0 percent)
[03:58:18] Called DecompressByteArray: compressed_data_size=28909 data_size=163067, decompressed_data_size=163067 diff=0
[03:58:18] - Digital signature verified
[03:58:18] 
[03:58:18] Project: 10633 (Run 96, Clone 5, Gen 12)
[03:58:18] 
[03:58:18] Assembly optimizations on if available.
[03:58:18] Entering M.D.
[03:58:24] Tpr hash work/wudata_07.tpr:  1824575913 1505624335 238161398 2940370698 3583626082
[03:58:24] Working on 582 p2750_N68H_AM03
[03:58:24] Client config found, loading data.
[03:58:25] Starting GUI Server
[03:58:28] mdrun_gpu returned 
[03:58:28] NANs detected on GPU
[03:58:28] 
[03:58:28] Folding@home Core Shutdown: UNSTABLE_MACHINE
[03:58:30] CoreStatus = 7A (122)
[03:58:30] Sending work to server
[03:58:30] Project: 10633 (Run 96, Clone 5, Gen 12)
[03:58:30] - Read packet limit of 540015616... Set to 524286976.
[03:58:30] - Error: Could not get length of results file work/wuresults_07.dat
[03:58:30] - Error: Could not read unit 07 file. Removing from queue.
[03:58:30] EUE limit exceeded. Pausing 24 hours.
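
For anyone triaging a similar log, here is a minimal sketch (not part of the Folding@home client) that scans log lines for the markers seen above. The function name `summarize_fah_log` and the dictionary keys are hypothetical, and the patterns are taken only from the lines in this excerpt, so other core versions may word things differently.

```python
import re

# Patterns copied from the log excerpt above; other cores may differ.
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
STATUS_RE = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")
SHUTDOWN_RE = re.compile(r"Core Shutdown: (\w+)")


def summarize_fah_log(text):
    """Collect the project, shutdown reason, and core status from a log excerpt."""
    summary = {"project": None, "nan_detected": False,
               "shutdown": None, "core_status": None, "eue_pause": False}
    for line in text.splitlines():
        m = PROJECT_RE.search(line)
        if m and summary["project"] is None:
            # (project, run, clone, gen) as integers
            summary["project"] = tuple(int(g) for g in m.groups())
        if "NANs detected" in line:
            summary["nan_detected"] = True
        m = SHUTDOWN_RE.search(line)
        if m:
            summary["shutdown"] = m.group(1)
        m = STATUS_RE.search(line)
        if m:
            # The first field is hexadecimal; 7A is the same value as the 122 in parentheses.
            summary["core_status"] = int(m.group(1), 16)
        if "EUE limit exceeded" in line:
            summary["eue_pause"] = True
    return summary


sample = """\
[03:58:18] Project: 10633 (Run 96, Clone 5, Gen 12)
[03:58:28] NANs detected on GPU
[03:58:28] Folding@home Core Shutdown: UNSTABLE_MACHINE
[03:58:30] CoreStatus = 7A (122)
[03:58:30] EUE limit exceeded. Pausing 24 hours.
"""
print(summarize_fah_log(sample))
```

Running this over the logs of all the cards in one machine makes it quick to see whether a NaN shutdown follows one card or one work unit.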

Re: Project: 10633 (Run 96, Clone 5, Gen 12)

Posted: Mon Jul 19, 2010 3:36 pm
by 7im
EUEs tend to be WU problems; NaNs tend to be hardware problems. And yes, a very few others have seen issues after upgrading to GPU3, but those are typically driver related, not client related, like that whole ATI vs. AMD .dll naming game a while back.

More info might help. What driver version? Console or Systray? What PSU? Is the CPU overclocked also?

v289.xx beta drivers are reported to not need any dummy plugs, etc., to work so it may help.

Re: Project: 10633 (Run 96, Clone 5, Gen 12)

Posted: Mon Jul 19, 2010 4:01 pm
by JimF
Drivers: 197.45
Systray version (Folding@home-systray-632)
Power Supply: SeaSonic S12II 330 Bronze 330W (I have measured the maximum power at the plug at about 220 watts, so no problems there.)
Zotac GT240 DDR3 (passive cooling with a case fan blowing on it; never gets above 60C): Not overclocked
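
The power-supply margin above can be checked with a quick calculation. This is only a sketch using the two figures quoted in this post; the function name `psu_load_fraction` is hypothetical. A reading taken at the plug is AC input, which includes the supply's own conversion losses, so the load on the PSU's rated output is somewhat lower than this figure suggests.

```python
def psu_load_fraction(measured_watts, rated_watts):
    """Return measured draw as a fraction of the PSU's rated capacity."""
    if rated_watts <= 0:
        raise ValueError("rated_watts must be positive")
    return measured_watts / rated_watts


# 220 W measured at the plug against a 330 W rated supply (figures from the post).
load = psu_load_fraction(220, 330)
print(f"{load:.0%} of rated capacity, {330 - 220} W nominal headroom")
```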

This dedicated folding PC has worked stably for several months and has a total of three GT240s. The other two are not having any problems.
I have rebooted and deleted the work folder and queue.dat for the problem card, but it picked up the same project and gave the same error.

I am inclined to humor it and wait 24 hours, and then try the problem card in another PC if necessary unless someone can think of a better idea.

Re: Project: 10633 (Run 96, Clone 5, Gen 12)

Posted: Mon Jul 19, 2010 4:27 pm
by toTOW
Someone else completed this WU successfully.

Re: Project: 10633 (Run 96, Clone 5, Gen 12)

Posted: Mon Jul 19, 2010 4:47 pm
by JimF
Thanks for checking, I should have done that first.
But I saw that the new 258.96 drivers just came out, so I tried those, but no luck; same error.
It is looking more and more like a hardware problem. Maybe a GTX 460 is in my future.

Re: Project: 10633 (Run 96, Clone 5, Gen 12)

Posted: Mon Jul 19, 2010 4:48 pm
by PantherX
Can you run the GPU2 WUs without any problems? Just add the -advmethods flag to the GPU3 BETA Client.

Re: Project: 10633 (Run 96, Clone 5, Gen 12)

Posted: Mon Jul 19, 2010 4:53 pm
by JimF
PantherX wrote: Can you run the GPU2 WUs without any problems? Just add the -advmethods flag to the GPU3 BETA Client.
I have had the "Allow receipt of work assignment greater than 10MB" option enabled all along. Is that the same?
I used to pick up some GPU2 WUs once in a while shortly after GPU3 BETA came out, but haven't seen them since. They all ran OK then.

But I will try the flag on that card in any case, just to check it out.

Re: Project: 10633 (Run 96, Clone 5, Gen 12)

Posted: Mon Jul 19, 2010 5:02 pm
by JimF
PantherX,

That did it! With -advmethods I picked up Project: 10513 (Run 4, Clone 999, Gen 26), and that is running fine. Why the other one didn't like my card is a mystery.

Thanks.

Re: Project: 10633 (Run 96, Clone 5, Gen 12)

Posted: Mon Jul 19, 2010 5:06 pm
by PantherX
JimF wrote: ... Why the other one didn't like my card is a mystery...
I guess it is because it is still in BETA :wink: The good news is that you can use this GPU until the GPU2 WUs run out. If it can run the GPU3 WUs by then, good news for you. If it can't, you will hopefully already have the GTX 460. It may also be fixed in FahCore_16, but there isn't any ETA.

Re: Project: 10633 (Run 96, Clone 5, Gen 12)

Posted: Mon Jul 19, 2010 5:15 pm
by JimF
Very good. I had never had any problem with GPU3 before, and had sort of taken it for granted. And the GTX 460 will look better when I need some more heat down here.

Re: Project: 10633 (Run 96, Clone 5, Gen 12)

Posted: Mon Jul 19, 2010 6:01 pm
by 7im
If you run into more problems, you might consider running the memtestG80 tester to see if the GPU has developed any issues over time. http://folding.stanford.edu/English/DownloadUtils#ntoc2

Re: Project: 10633 (Run 96, Clone 5, Gen 12)

Posted: Mon Jul 19, 2010 6:14 pm
by JimF
Yes, I was beginning to think of something like that but had forgotten the name. Thanks for all your help.