proj 6805 R 3626 c3 g14 - NANs detected on GPU
Moderators: Site Moderators, FAHC Science Team
-
- Posts: 140
- Joined: Fri Jan 21, 2011 8:02 am
proj 6805 R 3626 c3 g14 - NANs detected on GPU
[21:00:34] Project: 6805 (Run 3626, Clone 3, Gen 14)
[21:00:34]
[21:00:34] Assembly optimizations on if available.
[21:00:34] Entering M.D.
[21:00:36] Tpr hash work/wudata_03.tpr: 2009117313 622594832 3856663995 3903849362 195187923
[21:00:36] Working on ALZHEIMER'S DISEASE AMYLOID
[21:00:36] Client config found, loading data.
[21:00:36] Starting GUI Server
[21:00:36] Setting checkpoint frequency: 500000
[21:00:36] Setting checkpoint frequency: 500000
[21:01:50] Completed 500000 out of 50000000 steps (1%).
[21:03:04] Completed 1000000 out of 50000000 steps (2%).
[21:04:18] Completed 1500000 out of 50000000 steps (3%).
[21:05:33] Completed 2000000 out of 50000000 steps (4%).
[21:06:47] Completed 2500000 out of 50000000 steps (5%).
[21:08:01] Completed 3000000 out of 50000000 steps (6%).
[21:09:15] Completed 3500000 out of 50000000 steps (7%).
[21:10:31] Completed 4000000 out of 50000000 steps (8%).
[21:11:46] Completed 4500000 out of 50000000 steps (9%).
[21:13:01] Completed 5000000 out of 50000000 steps (10%).
[21:14:16] Completed 5500000 out of 50000000 steps (11%).
[21:15:30] Completed 6000000 out of 50000000 steps (12%).
[21:16:45] Completed 6500000 out of 50000000 steps (13%).
[21:18:00] Completed 7000000 out of 50000000 steps (14%).
[21:18:00] mdrun_gpu returned 52
[21:18:00] NANs detected on GPU
[21:18:00]
[21:18:00] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:18:03] CoreStatus = 7A (122)
[21:18:03] Sending work to server
[21:18:03] Project: 6805 (Run 3626, Clone 3, Gen 14)
[21:18:03] - Error: Could not get length of results file work/wuresults_03.dat
[21:18:03] - Error: Could not read unit 03 file. Removing from queue.
[21:18:03] - Preparing to get new work unit...
[21:18:03] Cleaning up work directory
Another one on the same project.
I had a NaN error with another 6805 before,
then did 9 in a row with no issues until this one,
and the WU after this one works fine.
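For anyone comparing several failed units, a minimal sketch of pulling the failure point out of a log like the one above might look like this. The patterns are guessed from the quoted lines only, and `summarize_log` is a hypothetical helper, not part of the F@H client; other client versions may format these lines differently.

```python
import re

# Patterns are based only on the log lines quoted in this post;
# they are assumptions, not an official log format specification.
STEP_RE = re.compile(r"Completed (\d+) out of (\d+) steps")
SHUTDOWN_RE = re.compile(r"Folding@home Core Shutdown: (\w+)")
STATUS_RE = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")


def summarize_log(lines):
    """Collect the last completed step, shutdown reason and core status."""
    summary = {"last_step": None, "total_steps": None,
               "shutdown": None, "core_status": None}
    for line in lines:
        m = STEP_RE.search(line)
        if m:
            summary["last_step"] = int(m.group(1))
            summary["total_steps"] = int(m.group(2))
            continue
        m = SHUTDOWN_RE.search(line)
        if m:
            summary["shutdown"] = m.group(1)
            continue
        m = STATUS_RE.search(line)
        if m:
            # The hex code and the decimal in parentheses are the same value.
            summary["core_status"] = int(m.group(1), 16)
    return summary


sample = [
    "[21:16:45] Completed 6500000 out of 50000000 steps (13%).",
    "[21:18:00] Completed 7000000 out of 50000000 steps (14%).",
    "[21:18:00] mdrun_gpu returned 52",
    "[21:18:00] NANs detected on GPU",
    "[21:18:00] Folding@home Core Shutdown: UNSTABLE_MACHINE",
    "[21:18:03] CoreStatus = 7A (122)",
]
result = summarize_log(sample)
print(result)
```

Run over a whole log file, this would show at a glance whether failures cluster at the same step count or after a similar amount of uptime.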
-
- Posts: 339
- Joined: Fri Jun 26, 2009 4:34 am
Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU
Xavier, was this the same GPU as viewtopic.php?f=19&t=17852&p=177951&hilit=returned+52#p177951
This might signal a problem with the GPU.
-
- Posts: 140
- Joined: Fri Jan 21, 2011 8:02 am
Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU
Yes, it is.
It has been folding with no issues for 3 months.
Mind you, it's been up 24/7 for the last week...
It did 9 WUs (6 of them 6805 projects) after the first failure,
and it ran the next unit after this one fine.
I think it just needs a quick reboot to shake some cobwebs out.
I only have issues when I run the card for a long stretch, about a week.
I have seen reports in the NVidia forums of Fermi cards having issues when run 24/7 for extended periods.
-
- Site Moderator
- Posts: 6986
- Joined: Wed Dec 23, 2009 9:33 am
- Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB
Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
- Location: Land Of The Long White Cloud
Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU
No data in the WU Database yet, so I have marked it for a follow-up.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time
Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
-
- Posts: 339
- Joined: Fri Jun 26, 2009 4:34 am
Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU
I know my XP machines get flaky if they run for too long. Windows 7 and Vista are no problem, but XP is another story. All of them run SMP. Let us know how it goes.
-
- Posts: 140
- Joined: Fri Jan 21, 2011 8:02 am
Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU
I'll know more in a day or two, but it could have been unstable... Since the reboot the card has been folding 100% with no errors, and my production has improved.
-
- Posts: 1165
- Joined: Wed Apr 01, 2009 9:22 pm
- Hardware configuration: Asus Z8NA D6C, 2 x5670@3.2 Ghz, , 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)
Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS
Not currently folding
Asus Z9PE- D8 WS, 2 E5-2665@2.3 Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only)
- Location: Jersey, Channel islands
Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU
HendricksSA wrote: I know my XP machines get flakey if they run for too long. 7 and Vista no problem but XP another story. All of them run SMP. Let us know how it goes.
Must be a problem somewhere. My XP GPU rig stayed up for weeks at a time, only being shut down for a heatsink clean or an upgrade. My current main-use rig folds 24/7, runs games etc., and it has no issues either, and this is a botched reinstall over several other reinstalls.
-
- Posts: 140
- Joined: Fri Jan 21, 2011 8:02 am
Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU
It's been stable for 3 days now with no errors... all I did was a reboot.
-
- Site Moderator
- Posts: 6986
- Joined: Wed Dec 23, 2009 9:33 am
Re: proj 6805 R 3626 c3 g14 - NANs detected on GPU
The WU has been completed by another donor:
Your WU (P6805 R3626 C3 G14) was added to the stats database on 2011-03-15 11:07:36 for 1280 points of credit.