FAULTY project:

Yavanius
Posts: 121
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Post by Yavanius »

The WorkUnit failed when I restarted it...


1. If there were no OpenCL, BOINC GPU work wouldn't run either.
2. Folding shouldn't work at ALL... yet, like before, I got the one random unit that works for some reason. When I paused it to let the BOINC work clear out and returned to it this morning, the workunit died...
3. As I repeatedly noted, I tried reinstalling both the current driver and the previous driver.


Here's the start and end of the log around the one workunit that was working:

18:17:40:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11730 run:2 clone:79 gen:113 core:0x21 unit:0x0000009d8ca304e75bcbe510e5689119
18:17:40:WU01:FS00:Starting
18:17:40:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\David\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe -dir 01 -suffix 01 -version 705 -lifeline 9256 -checkpoint 14 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
18:17:40:WU01:FS00:Started FahCore on PID 10116
18:17:40:WU01:FS00:Core PID:10160
18:17:40:WU01:FS00:FahCore 0x21 started
18:17:42:WU01:FS00:0x21:*********************** Log Started 2019-04-20T18:17:41Z ***********************
18:17:42:WU01:FS00:0x21:Project: 11730 (Run 2, Clone 79, Gen 113)
18:17:42:WU01:FS00:0x21:Unit: 0x0000009d8ca304e75bcbe510e5689119
18:17:42:WU01:FS00:0x21:CPU: 0x00000000000000000000000000000000
18:17:42:WU01:FS00:0x21:Machine: 0
18:17:42:WU01:FS00:0x21:Reading tar file core.xml
18:17:42:WU01:FS00:0x21:Reading tar file integrator.xml
18:17:42:WU01:FS00:0x21:Reading tar file state.xml
18:17:42:WU01:FS00:0x21:Reading tar file system.xml
18:17:42:WU01:FS00:0x21:Digital signatures verified
18:17:42:WU01:FS00:0x21:Folding@home GPU Core21 Folding@home Core
18:17:42:WU01:FS00:0x21:Version 0.0.18
18:17:55:WU01:FS00:0x21:Completed 0 out of 2500000 steps (0%)
18:17:55:WU01:FS00:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
18:24:05:ERROR:Receive error: 10054: An existing connection was forcibly closed by the remote host.
18:24:21:ERROR:Receive error: 10053: An established connection was aborted by the software in your host machine.
.
.
.

18:28:53:ERROR:Receive error: 10053: An established connection was aborted by the software in your host machine.
18:28:57:WU01:FS00:0x21:Completed 25000 out of 2500000 steps (1%)
.
.
.
04:43:46:WU01:FS00:0x21:Completed 1200000 out of 2500000 steps (48%)
05:02:09:WU01:FS00:0x21:Completed 1225000 out of 2500000 steps (49%)
05:15:33:FS00:Paused
05:15:33:FS00:Shutting core down
05:15:33:WU01:FS00:0x21:WARNING:Console control signal 1 on PID 10160
05:15:33:WU01:FS00:0x21:Exiting, please wait. . .
05:15:35:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
05:15:39:Removing old file 'configs/config-20190413-014952.xml'
05:15:39:Saving configuration to config.xml
05:15:39:<config>
05:15:39:  <!-- Folding Core -->
05:15:39:  <checkpoint v='14'/>
05:15:39:  <core-priority v='low'/>
05:15:39:
05:15:39:  <!-- Folding Slot Configuration -->
05:15:39:  <cause v='CANCER'/>
05:15:39:
05:15:39:  <!-- Network -->
05:15:39:  <proxy v=':8080'/>
05:15:39:
05:15:39:  <!-- Slot Control -->
05:15:39:  <power v='medium'/>
05:15:39:
05:15:39:  <!-- User Information -->
05:15:39:  <passkey v='********************************'/>
05:15:39:  <team v='11'/>
05:15:39:  <user v='Yavanius'/>
05:15:39:
05:15:39:  <!-- Folding Slots -->
05:15:39:  <slot id='0' type='GPU'>
05:15:39:    <paused v='true'/>
05:15:39:  </slot>
05:15:39:</config>
...
05:31:51:ERROR:Receive error: 10053: An established connection was aborted by the software in your host machine.
******************************* Date: 2019-04-21 *******************************
15:49:21:FS00:Unpaused
15:49:21:WU01:FS00:Starting
15:49:21:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\David\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe -dir 01 -suffix 01 -version 705 -lifeline 9256 -checkpoint 14 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
15:49:21:WU01:FS00:Started FahCore on PID 7260
15:49:21:WU01:FS00:Core PID:3660
15:49:21:WU01:FS00:FahCore 0x21 started
15:49:22:WU01:FS00:0x21:*********************** Log Started 2019-04-21T15:49:22Z ***********************
15:49:22:WU01:FS00:0x21:Project: 11730 (Run 2, Clone 79, Gen 113)
15:49:22:WU01:FS00:0x21:Unit: 0x0000009d8ca304e75bcbe510e5689119
15:49:22:WU01:FS00:0x21:CPU: 0x00000000000000000000000000000000
15:49:22:WU01:FS00:0x21:Machine: 0
15:49:22:WU01:FS00:0x21:Digital signatures verified
15:49:22:WU01:FS00:0x21:Folding@home GPU Core21 Folding@home Core
15:49:22:WU01:FS00:0x21:Version 0.0.18
15:49:23:WU01:FS00:0x21:  Found a checkpoint file
15:49:30:WU01:FS00:0x21:ERROR:exception: Error initializing context: clGetDeviceInfo (-5)
15:49:30:WU01:FS00:0x21:Saving result file logfile_01.txt
15:49:30:WU01:FS00:0x21:Saving result file log.txt
15:49:30:WU01:FS00:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
15:49:30:WARNING:WU01:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
15:49:30:WU01:FS00:Sending unit results: id:01 state:SEND error:FAULTY project:11730 run:2 clone:79 gen:113 core:0x21 unit:0x0000009d8ca304e75bcbe510e5689119
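
For anyone triaging a log like the one above, here is a minimal Python sketch (not part of FAHClient) that pulls out the `FahCore returned:` statuses and any OpenCL error number. The function name `summarize` and both regular expressions are hypothetical and based only on the lines quoted in this post. It also assumes the number in parentheses after `clGetDeviceInfo` is the raw OpenCL status code, in which case -5 corresponds to `CL_OUT_OF_RESOURCES` in the Khronos `cl.h` header.

```python
import re

# A few OpenCL status codes as defined in the Khronos cl.h header.
OPENCL_ERRORS = {
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}

# Hypothetical patterns, guessed from the log lines quoted in this thread.
CORE_RETURN = re.compile(r"FahCore returned: (\w+) \((\d+) = 0x[0-9a-fA-F]+\)")
CL_ERROR = re.compile(r"ERROR:exception: .*?(cl\w+) \((-?\d+)\)")


def summarize(log_text):
    """Return (core_returns, opencl_errors) found in a FAHClient log excerpt."""
    core_returns = []
    opencl_errors = []
    for line in log_text.splitlines():
        m = CORE_RETURN.search(line)
        if m:
            core_returns.append((m.group(1), int(m.group(2))))
        m = CL_ERROR.search(line)
        if m:
            code = int(m.group(2))
            name = OPENCL_ERRORS.get(code, "unknown")
            opencl_errors.append((m.group(1), code, name))
    return core_returns, opencl_errors


# Three lines copied from the log above.
SAMPLE = """\
05:15:35:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
15:49:30:WU01:FS00:0x21:ERROR:exception: Error initializing context: clGetDeviceInfo (-5)
15:49:30:WARNING:WU01:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
"""

if __name__ == "__main__":
    returns, cl_errors = summarize(SAMPLE)
    print(returns)
    print(cl_errors)
```

If that reading of the code is right, the resumed unit failed while the core was querying the GPU, not because of anything in the work unit data itself, which would fit the pause/resume pattern described in this post.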
Yavanius
Posts: 121
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Post by Yavanius »

There might be something to Folding thinking the GPU isn't active...


I did a quick experiment. I loaded some more BOINC work, made sure it was crunching on the nVidia GPU, restarted the Folding unit and guess what... Folding is running.

So far as I know, BOINC isn't responsible, because no new version has been installed, and I also tried shutting BOINC down completely (not just exiting the manager) even though it wasn't active.
foldy
Posts: 2061
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: FAULTY project:13816

Post by foldy »

I guess FAH might have a bug and cannot switch the laptop from the Intel iGPU to the Nvidia GTX 980M. But BOINC can do it somehow. Maybe leave that laptop to BOINC.

Or is it some power-saving feature? Try to prevent the GTX 980M from entering sleep: set the Windows 10 power options to performance mode.
Yavanius
Posts: 121
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Post by Yavanius »

foldy wrote: I guess FAH might have a bug and cannot switch the laptop from the Intel iGPU to the Nvidia GTX 980M. But BOINC can do it somehow. Maybe leave that laptop to BOINC.

Or is it some power-saving feature? Try to prevent the GTX 980M from entering sleep: set the Windows 10 power options to performance mode.

Actually, the settings are customized by me. The only odd thing I saw was that PCI Express was set to Maximum Energy Savings when plugged in. I don't know if that accidentally got set that way or something changed it. I had to adjust the settings before because with the defaults the computer would go to sleep while I was doing BOINC. Dunno if that's still an issue, but obviously I'm not terribly concerned about power savings when I'm crunching.

The whole thing, though, is that the laptop was just fine. I'd been Folding for months until recently, when I rebooted to clear a phantom notification (which I tried troubleshooting, with no resolution).

Yav
Yavanius
Posts: 121
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Post by Yavanius »

The PCI Express setting doesn't seem to have made any difference. I'm thinking it's that Microsoft update, because it addresses chipset vulnerabilities, which I think might be screwing things up with Folding. Of course, the last time I tried to uninstall it, Windows snuck it back in on me. That was weird, because I had to reboot to uninstall it, but I didn't see anything about rebooting when it was apparently reinstalled (assuming it actually was uninstalled). I'll have to try playing around with things again when I get a chance...
Yavanius
Posts: 121
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Post by Yavanius »

Well, I didn't see the notice until yesterday, but apparently a new driver update, v430.39, was released on 4/23. Without looking the drivers up, I don't know the exact date of the previous one, but it seemed to be about a week or so earlier. So maybe there was some issue with the previous driver, because Folding is working again without the BOINC workaround (I wonder if something else using the GPU would have worked too??).

I'm just wondering if possibly Avast could have been a culprit because when I rebooted Avast apparently updated at the same time (got notified on Windows reload)...
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: FAULTY project:13816

Post by bruce »

Some problems have been reported with Avast (quite some time back), but if you had it configured correctly before the update, I doubt the update would have introduced a new problem. I'd suspect Windows Update is more likely to have caused a problem with the drivers, but as long as it has been fixed, it doesn't really matter.
Yavanius
Posts: 121
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Post by Yavanius »

The one WU completed but I'm back to the same problem again. I didn't think it was Avast, but I tried temporarily turning it off without any change.

I can't recall if there were any BOINC nVidia WUs going on when I installed, so it's possible what I thought was a fix was just because BOINC was crunching and Folding started running when I restarted.

Unfortunately, I just don't have the time to currently delve into this more...

(On a different note, for folks wanting an Intel GPU Folding client: there aren't that many BOINC projects that support Intel GPUs either. I know Einstein and of course SETI support them. Quite possibly the mathematics projects do too, but those don't interest me.)
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: FAULTY project:13816

Post by bruce »

FAH does not support Intel GPUs and I doubt it ever will. The Intel iGPU hardware is (1) extremely limited when used for 3D calculations -- the CPU can still fold and that's a reasonable alternative -- and (2) historically, Intel has had more than its share of OpenCL driver issues.

FAH does support CPUs, though, as well as GPUs from nVidia and AMD.

If I remember correctly, you have to configure AVAST so it does not scan FAH's work files in %APPDATA%/fahclient. It sometimes erases the work files because its heuristic code incorrectly diagnoses random values in data files as something that looks like a virus. Once the file is quarantined, the damage is done and suspending AVAST won't help FAH finish the WU.
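
To find the exact folder to enter in the antivirus exclusion list, a small Python sketch like the following can expand the path. The helper name `fah_data_dir` is hypothetical, and the `FAHClient` folder name is taken from the paths shown in the log earlier in this thread; how an exclusion is actually added depends on the antivirus product and is not shown here.

```python
import os
from pathlib import PureWindowsPath


def fah_data_dir(env=None):
    """Build the FAHClient data directory path from an APPDATA value."""
    env = os.environ if env is None else env
    appdata = env.get("APPDATA")
    if not appdata:
        # APPDATA is a Windows environment variable; it is usually unset elsewhere.
        raise KeyError("APPDATA is not set (this only makes sense on Windows)")
    return PureWindowsPath(appdata) / "FAHClient"


if __name__ == "__main__":
    # Example value matching the user profile path seen in the log above.
    print(fah_data_dir({"APPDATA": r"C:\Users\David\AppData\Roaming"}))
```

Called with no argument on the Windows machine itself, it reads the real `APPDATA` value from the environment.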
Yavanius
Posts: 121
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Post by Yavanius »

I give up for now. I'd been running Folding overnight and rebooted my computer, forgetting about Folding, only to discover later that all the work went poof. Rather disappointing that the client hasn't been updated in some time. I was rather enjoying a nice climb stats-wise for a while. I just don't want to keep having to have BOINC start GPU computing (it doesn't have to be BOINC, but it's overall the easiest way to get the nVidia GPU active), go to Folding to start work, go back to BOINC to pause it (else BOINC robs too many cycles, although I don't have an idea just how many offhand), let Folding finish, go back to BOINC to do the work... rinse, repeat, ad nauseam.

Although the stats were the fun side, crunching for cancer (although there's no easy way to readily tell unless you memorize all the associated project #s) does mean a lot to the family, as breast cancer has struck on the wife's side.


Oh, when you mentioned excluding the data directory, BOINC has had similar issues with antiviruses too. I don't remember specifically what antivirus, but it has been brought to attention repeatedly by users. Some antivirus companies are more responsive to these issues and others, well... more than one unhappy customer. I don't think it's an antivirus issue because if it was blocking Folding, it shouldn't work regardless of the workaround.

There's probably some kind of more permanent work-around but at the moment I don't have the time to go digging for it. Hopefully Pande Lab will get a new client out that will fix this and a number of other outstanding issues. If they ever want to hit a million...

If anybody has any other ideas, drop a reply and I'll check it out when I get a notification in e-mail.


Oh BTW Bruce, check out DreamLab for Android. They are based out of Australia and the app is sponsored by Vodafone. They are doing a couple of different cancer-related research projects on Android. Slick little app that Pande et al. should check out. Maybe the programmers could assist with a new Folding Android app. There are no leaderboards, but it's something in the works.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: FAULTY project:13816

Post by bruce »

I don't know how the computer decides whether to enable the iGPU or the dGPU. FAH is certainly going to fail if the dGPU isn't available. There are probably some power-saving settings that you MIGHT be able to set manually, asking for the iGPU to be permanently disabled (as you don't want to save power when FAH is running). I think a few BOINC projects can run on the iGPU. In general, hardware designed for battery-powered devices (laptops, portable phones, etc.) is optimized to save power instead of being optimized to do serious calculations.
Yavanius
Posts: 121
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Post by Yavanius »

bruce wrote: I don't know how the computer decides whether to enable the iGPU or the dGPU. FAH is certainly going to fail if the dGPU isn't available. There are probably some power-saving settings that you MIGHT be able to set manually, asking for the iGPU to be permanently disabled (as you don't want to save power when FAH is running). I think a few BOINC projects can run on the iGPU. In general, hardware designed for battery-powered devices (laptops, portable phones, etc.) is optimized to save power instead of being optimized to do serious calculations.
Well it's an ASUS ROG, so it's intended for performance.

That said, in terms of Windows, I did double-check the power settings. There was only one setting still on power savings that I had somehow missed (possibly I accidentally hit it, or never changed it from power savings to performance). Changing it didn't do anything.

Folding just isn't able to communicate with the nVidia chip anymore. I tried falling back to an earlier driver version, and there have been a few new drivers since I reported in, none of which has made a difference. That makes me think the Microsoft security update did something that now interferes with how Folding communicates with the nVidia chip, possibly some kind of system call that is no longer allowed. It wouldn't be the first time; MS is notorious for breaking things like that.



Incidentally, following up on an earlier note, I looked and found there are only 3 BOINC projects with active support for Intel GPUs: SETI, Einstein, & Collatz Conjecture (there's XANSONS too, but they only issue work about once a month now, for new structures). Of course, SETI & Einstein are veteran projects with the support to code for it. Offhand, I don't recall how they did the coding for the Intel GPUs. Intel GPU support is relatively new, and I personally didn't have a system that supported it hardware-wise until this ASUS, so I haven't followed it that closely. There are requests for it, but a lot of projects have only 1, maybe 2, folks doing active coding, and often they are the admins too and have actual duties outside the projects. Outside BOINC, it's possible bitcoin-type mining projects might utilize it.

BTW, with Folding it's easier to run GPU only. The BOINC client doesn't have a setting to turn off the CPU and leave the GPU on (at least not without editing the ini files). You can do it on a PER PROJECT basis, although that means you have to do it every time you add a new project with GPU clients...

~Yav
Yavanius
Posts: 121
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Post by Yavanius »

Just out of curiosity, I downloaded the 7.4.4 client just to see what happens. Well... nothing happened. The core client wouldn't initialize even with a reboot. I can't recall whether I was actually running that version on this system or whether I'm thinking of my old Dell Latitude.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: FAULTY project:13816

Post by bruce »

@Yavanius,

I do not see the first two pages of FAH's Log in any of your posts. Follow the instructions here:
viewtopic.php?f=24&t=26036
Yavanius
Posts: 121
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Post by Yavanius »

Oh, I wasn't even thinking about that, Bruce. :) I just wanted to see if I could crunch on the older client without the workaround, but I'll get a log posted up here sometime later this week.

Thanks and have a good week,

~Yav