proj 6805 R 3626 c3 g14 - NANs detected on GPU

Moderators: Site Moderators, FAHC Science Team

Xavier Zepherious
Posts: 140
Joined: Fri Jan 21, 2011 8:02 am

proj 6805 R 3626 c3 g14 - NANs detected on GPU

Post by Xavier Zepherious »

[21:00:34] Project: 6805 (Run 3626, Clone 3, Gen 14)
[21:00:34]
[21:00:34] Assembly optimizations on if available.
[21:00:34] Entering M.D.
[21:00:36] Tpr hash work/wudata_03.tpr: 2009117313 622594832 3856663995 3903849362 195187923
[21:00:36] Working on ALZHEIMER'S DISEASE AMYLOID
[21:00:36] Client config found, loading data.
[21:00:36] Starting GUI Server
[21:00:36] Setting checkpoint frequency: 500000
[21:00:36] Setting checkpoint frequency: 500000
[21:01:50] Completed 500000 out of 50000000 steps (1%).
[21:03:04] Completed 1000000 out of 50000000 steps (2%).
[21:04:18] Completed 1500000 out of 50000000 steps (3%).
[21:05:33] Completed 2000000 out of 50000000 steps (4%).
[21:06:47] Completed 2500000 out of 50000000 steps (5%).
[21:08:01] Completed 3000000 out of 50000000 steps (6%).
[21:09:15] Completed 3500000 out of 50000000 steps (7%).
[21:10:31] Completed 4000000 out of 50000000 steps (8%).
[21:11:46] Completed 4500000 out of 50000000 steps (9%).
[21:13:01] Completed 5000000 out of 50000000 steps (10%).
[21:14:16] Completed 5500000 out of 50000000 steps (11%).
[21:15:30] Completed 6000000 out of 50000000 steps (12%).
[21:16:45] Completed 6500000 out of 50000000 steps (13%).
[21:18:00] Completed 7000000 out of 50000000 steps (14%).
[21:18:00] mdrun_gpu returned 52
[21:18:00] NANs detected on GPU
[21:18:00]
[21:18:00] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:18:03] CoreStatus = 7A (122)
[21:18:03] Sending work to server
[21:18:03] Project: 6805 (Run 3626, Clone 3, Gen 14)
[21:18:03] - Error: Could not get length of results file work/wuresults_03.dat
[21:18:03] - Error: Could not read unit 03 file. Removing from queue.
[21:18:03] - Preparing to get new work unit...
[21:18:03] Cleaning up work directory


Another one on the same project.

I had a NaN error with another 6805 WU before.
I just did 9 in a row with no issues until this one,
and the WU after this one works fine.
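
For anyone who wants to pull the failing WU out of a long client log automatically, here is a minimal Python sketch. It is only a rough helper built around the exact line formats quoted in the log above; the function name `summarize_failure` and the returned dictionary keys are made up for illustration and are not part of the F@H client or any official tool. Other client or core versions may word these lines differently.

```python
import re
from typing import Optional

# Patterns copied from the line formats in the log quoted above.
# Other client/core versions may use different wording (assumption).
WU_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
PROGRESS_RE = re.compile(r"Completed (\d+) out of (\d+) steps")
STATUS_RE = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")
NAN_MARKER = "NANs detected on GPU"


def summarize_failure(log_text: str) -> Optional[dict]:
    """Return details of the first NaN failure in the log, or None.

    Hypothetical helper: reports the WU (project, run, clone, gen),
    the last completed step count, and the CoreStatus code if present.
    """
    wu = None
    last_step = None
    failure = None
    for line in log_text.splitlines():
        if failure is None:
            m = WU_RE.search(line)
            if m:
                # A new WU starts; forget progress from the previous one.
                wu = tuple(int(g) for g in m.groups())
                last_step = None
                continue
            m = PROGRESS_RE.search(line)
            if m:
                last_step = int(m.group(1))
                continue
            if NAN_MARKER in line:
                failure = {"wu": wu, "last_step": last_step, "core_status": None}
        else:
            # After the failure, only look for the CoreStatus line.
            m = STATUS_RE.search(line)
            if m:
                # The hex code (e.g. 7A) is parsed as base 16.
                failure["core_status"] = int(m.group(1), 16)
                break
    return failure
```

Fed the log above, this should report project 6805 (Run 3626, Clone 3, Gen 14), a last completed step of 7000000, and CoreStatus 122 (hex 7A). It says nothing about why the NaN occurred; it only saves scrolling through the log.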
HendricksSA
Posts: 339
Joined: Fri Jun 26, 2009 4:34 am

Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU

Post by HendricksSA »

Xavier, was this the same GPU as viewtopic.php?f=19&t=17852&p=177951&hilit=returned+52#p177951

This might signal a problem with the GPU.
Xavier Zepherious
Posts: 140
Joined: Fri Jan 21, 2011 8:02 am

Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU

Post by Xavier Zepherious »

Yes, it is.
It has been folding with no issues for 3 months.
Mind you, it has been up 24/7 for the last week...

It did 9 WUs (6 of them project 6805) after the first failure,
and the next unit after this one worked fine.

I think it just needs a quick reboot to shake some cobwebs out.
I only have issues when I run the card for a long stretch, about a week.

I have seen reports on the NVidia forums of Fermi cards having issues when running 24/7 for long extended periods.
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU

Post by PantherX »

No data in the WU Database yet, so I have marked it for a follow-up.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
HendricksSA
Posts: 339
Joined: Fri Jun 26, 2009 4:34 am

Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU

Post by HendricksSA »

I know my XP machines get flaky if they run for too long. Windows 7 and Vista are no problem, but XP is another story. All of them run SMP. Let us know how it goes.
Xavier Zepherious
Posts: 140
Joined: Fri Jan 21, 2011 8:02 am

Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU

Post by Xavier Zepherious »

I'll know more in a day or two, but it could have been unstable... Since the reboot, the card has been folding at 100% with no errors, and my production has improved.
Nathan_P
Posts: 1165
Joined: Wed Apr 01, 2009 9:22 pm
Hardware configuration: Asus Z8NA D6C, 2 x5670@3.2 Ghz, , 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)

Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS

Not currently folding
Asus Z9PE- D8 WS, 2 E5-2665@2.3 Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only)
Location: Jersey, Channel islands

Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU

Post by Nathan_P »

HendricksSA wrote: I know my XP machines get flaky if they run for too long. Windows 7 and Vista are no problem, but XP is another story. All of them run SMP. Let us know how it goes.
There must be a problem somewhere. My XP GPU rig stayed up for weeks at a time, only being shut down for a heatsink clean or an upgrade. My current main-use rig folds 24/7, runs games, etc., and it has no issues either, and this is on a botched reinstall over several other reinstalls.
Xavier Zepherious
Posts: 140
Joined: Fri Jan 21, 2011 8:02 am

Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU

Post by Xavier Zepherious »

It has now been stable for 3 days with no errors... all I did was reboot.
PantherX
Site Moderator
Posts: 6986
Joined: Wed Dec 23, 2009 9:33 am

Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU

Post by PantherX »

The WU has been completed by another donor:
Your WU (P6805 R3626 C3 G14) was added to the stats database on 2011-03-15 11:07:36 for 1280 points of credit.