EUE, Unstable Machine, Faulty Work Unit

Posted: Mon Feb 11, 2013 10:10 am
by hiigaran
So I get this a little too frequently:

Code:

10:05:15:WU00:FS00:0x15:Run: exception thrown in GuardedRun -- cannot continue further. 
10:05:15:WU00:FS00:0x15:Going to send back what have done -- stepsTotalG=40000000 
10:05:15:WU00:FS00:0x15:Work fraction=0.1632 steps=40000000. 
10:05:19:WU00:FS00:0x15:logfile size=15131 infoLength=15131 edr=0 trr=23 
10:05:19:WU00:FS00:0x15:+ Opened results file 
10:05:19:WU00:FS00:0x15:- Writing 15667 bytes of core data to disk... 
10:05:20:WU00:FS00:0x15:Done: 15155 -> 4835 (compressed to 31.9 percent) 
10:05:20:WU00:FS00:0x15:  ... Done. 
10:05:20:WU00:FS00:0x15:DeleteFrameFiles: successfully deleted file=00/wudata_01.ckp 
10:05:20:WU00:FS00:0x15: 
10:05:20:WU00:FS00:0x15:Folding@home Core Shutdown: UNSTABLE_MACHINE 
10:05:20:WARNING:WU00:FS00:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a) 
10:05:20:WU00:FS00:Sending unit results: id:00 state:SEND error:FAULTY project:7626 run:260 clone:0 gen:130 core:0x15 unit:0x000000a7664f2dd14fe61b3d5d802508
GTX 670, no overclock, no issues in games, and temperatures are never over 75. Yet I always get work units dumped after a driver crash, both before, and after upgrading drivers (currently 310.90).

Ideas?
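
For anyone who wants to pull these core exits out of a longer log, here is a minimal Python sketch. It assumes only the line format visible in the log above; the name `find_core_returns` and the dictionary keys are made up for illustration and are not part of the F@H client.

```python
import re

# Matches lines such as:
#   10:05:20:WARNING:WU00:FS00:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a)
# The pattern is inferred from the log excerpt above, not from any official spec.
CORE_RETURN = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):(?:WARNING:)?"
    r"WU(?P<wu>\d+):FS(?P<slot>\d+):"
    r"FahCore returned: (?P<status>\w+) "
    r"\((?P<dec>\d+) = (?P<hex>0x[0-9a-fA-F]+)\)"
)


def find_core_returns(log_text):
    """Return one dict per 'FahCore returned' line in log_text.

    Hypothetical helper, for illustration only.
    """
    results = []
    for line in log_text.splitlines():
        match = CORE_RETURN.match(line.strip())
        if match is None:
            continue
        results.append({
            "time": match.group("time"),
            "wu": match.group("wu"),
            "slot": match.group("slot"),
            "status": match.group("status"),
            "code": int(match.group("dec")),
        })
    return results


if __name__ == "__main__":
    sample = (
        "10:05:20:WU00:FS00:0x15:Folding@home Core Shutdown: UNSTABLE_MACHINE \n"
        "10:05:20:WARNING:WU00:FS00:FahCore returned: UNSTABLE_MACHINE (122 = 0x7a) \n"
    )
    for entry in find_core_returns(sample):
        print(entry)
```

Counting how often each status appears per slot makes it easier to tell whether the failures cluster on the GPU slot or show up everywhere.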

Re: EUE, Unstable Machine, Faulty Work Unit

Posted: Mon Feb 11, 2013 11:57 am
by art_l_j_PlanetAMD64
hiigaran wrote:GTX 670, no overclock, no issues in games, and temperatures are never over 75. Yet I always get work units dumped after a driver crash, both before, and after upgrading drivers (currently 310.90).

Ideas?
The 310.xx drivers are a known problem on the NVIDIA GeForce Forums; please see this information about it. Many of those users have been able to restore normal operation of their systems by going back to version 306.97.

I use version 306.97 on all of my Windows 7 folding machines, which you can see here.

I would run version 306.97 on everything, but that version is not available for Windows XP (it has 306.81), and the Debian Linux/Wine/GPU3 systems require an older driver (275.43), to match the v6.41 GPU3 client and the v2 Linux kernel.

Folding is much more stressful on a GPU than anything else, so it is not unheard of to have a situation like yours, where Folding is the only application experiencing a problem.

Art

Re: EUE, Unstable Machine, Faulty Work Unit

Posted: Mon Feb 11, 2013 12:39 pm
by hiigaran
See, that's the problem. I don't think the drivers are the issue. Yes, I'm having the issue on the 310.90 drivers, but I have also had the issue on two older drivers as well. I don't remember what specific versions they were, but I do know they had different major versions, and the last time I updated was probably half a year ago.

Re: EUE, Unstable Machine, Faulty Work Unit

Posted: Mon Feb 11, 2013 12:59 pm
by bollix47
Have you tried increasing the fan speed using a utility like MSI Afterburner?

The current temperature on my 670 is 59C, but I have manually adjusted the fan to 70%. I too was getting driver crashes before decreasing the temperature. Although the max temperature spec for a 670 is 97C, I think you may be having a heat problem at 75C. I know I've had GPUs that produced folding problems when the temperature got anywhere near 70C, while others worked fine at that temperature or even much higher.

Also, have a look at this thread for a similar error code.

Run a memory test.

Clean install.

Re: EUE, Unstable Machine, Faulty Work Unit

Posted: Mon Feb 11, 2013 2:09 pm
by hiigaran
I already use Afterburner for a custom fan profile. It's at around 70-75% between 70 and 76C, but once it goes past that, it kicks itself up to 80% and beyond on a steep ramp. I've got it set to force a fan update every 5 seconds. Without the forcing, the fan speeds would revert to defaults if I left the computer and the drivers crashed.
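
Put another way, the profile is roughly a piecewise-linear curve. Below is a small Python sketch purely to illustrate the shape. Only the 70-76C band and the jump to 80% just past it come from the description above; the 40C and 85C endpoints are assumptions, and this is not how Afterburner itself is configured or implemented.

```python
# (temperature in C, fan speed in %) points, sorted by temperature.
# The 40C and 85C points are assumed; the middle three follow the
# profile described above.
CURVE = [(40, 40), (70, 70), (76, 75), (77, 80), (85, 100)]


def fan_percent(temp_c, curve=CURVE):
    """Linearly interpolate the fan speed for temp_c along the curve.

    Hypothetical helper, for illustration only.
    """
    # Clamp to the first and last points outside the defined range.
    if temp_c <= curve[0][0]:
        return curve[0][1]
    if temp_c >= curve[-1][0]:
        return curve[-1][1]
    # Find the segment that contains temp_c and interpolate within it.
    for (t0, f0), (t1, f1) in zip(curve, curve[1:]):
        if t0 <= temp_c <= t1:
            return f0 + (f1 - f0) * (temp_c - t0) / (t1 - t0)


if __name__ == "__main__":
    for temp in (65, 73, 76, 77, 81):
        print(temp, fan_percent(temp))
```

The one-degree step from 76C to 77C is what produces the "kick" to 80%; the segment after it is the steep ramp.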

Re: EUE, Unstable Machine, Faulty Work Unit

Posted: Mon Feb 11, 2013 2:20 pm
by P5-133XL
It really sounds like you have a flaky card. If it is stock, you may have to underclock it to make it more reliable. I agree that you should run that memory test. It is not a perfect test, but it may help you verify a possible HW issue.

Re: EUE, Unstable Machine, Faulty Work Unit

Posted: Mon Feb 11, 2013 4:44 pm
by art_l_j_PlanetAMD64
bollix47 wrote:Have you tried increasing the fan speed using a utility like MSI Afterburner?

The current temperature on my 670 is 59C but I have manually adjusted the fan to 70%. I too was getting driver crashes before decreasing the temperature. Although the max temperature spec for a 670 is 97C I think you may be having a heat problem at 75C. I know I've had GPUs that produced folding problems when the temperature got anywhere near 70C while others worked fine at that temperature or even much higher.

Also, have a look at this thread for similar error code.

Run a memory test.

Clean install.
This is very good advice; I run all of my fan controls manually. Call me a worrywart, but I get nervous when any of my GPUs goes over 65C! And that's even on the 762x WUs.

Re: EUE, Unstable Machine, Faulty Work Unit

Posted: Tue Feb 12, 2013 1:21 pm
by hiigaran
I still haven't gotten around to this, but I've been having other issues with F@H. I used to run the v7 client as a service, without any issues for either my SMP or GPU slots. However, two days ago, I turned on the computer and noticed a distinct lack of fan noise. FAHControl was not in my systray, and the service had somehow disappeared from my services list. So I tried to open it manually from the installation directory (i.e., clicking FAHControl). It opened, but it remained stuck on 'connecting'.

So I uninstalled and reinstalled the same way I had it before, with it starting as a service and FAHControl launching on startup. The SMP slot works fine, but the GPU slot is stuck on 'ready'. According to the logs, there is a GPU memtest error. What I don't understand is why it would report that when I chose 'install as a service' during installation, yet when I reinstalled again and chose the recommended option (the first one; I forgot its name), it works just fine.

Just a whole bunch of strange occurrences. Might explain the driver crashes I was having, because now that I'm not running it as a service, I haven't been getting them. But how did the service get removed in the first place?

Technology...

Re: EUE, Unstable Machine, Faulty Work Unit

Posted: Tue Feb 12, 2013 1:30 pm
by PantherX
If you are using Windows XP, then the GPU can fold when F@H has been installed as a service. However, if you are running Windows Vista or later, you cannot install F@H as a service and expect the GPU to fold. This is a security feature from Microsoft. The CPU, on the other hand, can continue to fold when F@H is installed as a service on any Windows version.

The default installation is to start F@H at boot time (not as a service).

You may want to post the log showing the error, along with the configuration of your system and client (the initial section of the log). Furthermore, you can always do a fresh installation with default settings (finish any WU you have been assigned, then uninstall the software and select the option to delete the data files) and see if that helps.

Re: EUE, Unstable Machine, Faulty Work Unit

Posted: Sun Feb 17, 2013 1:33 pm
by hiigaran
Just an update on this: I disabled the beta slot option on my GPU, and I haven't had any issues since.