New GPU projects 5765-5772 EUE's?

Razor_FX_II

EUE was on Project: 5766 (Run 2, Clone 63, Gen 0) running Vista32, 178.24 drivers, 8800 GTS (G92) 512mb (very stable).
Code:
[13:26:06] Folding@Home GPU Core - Beta
[13:26:06] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[13:26:06]
[13:26:06] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[13:26:06] Build host: amoeba
[13:26:06] Board Type: Nvidia
[13:26:06] Core      :
[13:26:06] Preparing to commence simulation
[13:26:06] - Looking at optimizations...
[13:26:06] - Created dyn
[13:26:06] - Files status OK
[13:26:06] - Expanded 43942 -> 252912 (decompressed 575.5 percent)
[13:26:06] Called DecompressByteArray: compressed_data_size=43942 data_size=252912, decompressed_data_size=252912 diff=0
[13:26:06] - Digital signature verified
[13:26:06]
[13:26:06] Project: 5766 (Run 2, Clone 63, Gen 0)
[13:26:06]
[13:26:06] Assembly optimizations on if available.
[13:26:06] Entering M.D.
[13:26:13] Working on Protein
[13:26:13] Client config found, loading data.
[13:26:13] mdrun_gpu returned
[13:26:13] NANs detected on GPU
[13:26:13]
[13:26:13] Folding@home Core Shutdown: UNSTABLE_MACHINE
[13:26:17] CoreStatus = 7A (122)
[13:26:17] Sending work to server
[13:26:17] Project: 5766 (Run 2, Clone 63, Gen 0)
[13:26:17] - Read packet limit of 540015616... Set to 524286976.
[13:26:17] - Error: Could not get length of results file work/wuresults_03.dat
[13:26:17] - Error: Could not read unit 03 file. Removing from queue.
[13:26:17] EUE limit exceeded. Pausing 24 hours.
[13:56:05] - Autosending finished units... [December 24 13:56:05 UTC]
[13:56:05] Trying to send all finished work units
[13:56:05] + No unsent completed units remaining.
[13:56:05] - Autosend completed

EUE was on Project: 5768 (Run 8, Clone 47, Gen 8) running Vista64, 181.00 drivers, GTX 260 (very stable).
Code:
[21:27:39]
[21:27:39] *------------------------------*
[21:27:39] Folding@Home GPU Core - Beta
[21:27:39] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[21:27:39]
[21:27:39] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[21:27:39] Build host: amoeba
[21:27:39] Board Type: Nvidia
[21:27:39] Core      :
[21:27:39] Preparing to commence simulation
[21:27:39] - Looking at optimizations...
[21:27:39] - Created dyn
[21:27:39] - Files status OK
[21:27:39] - Expanded 46678 -> 252912 (decompressed 541.8 percent)
[21:27:39] Called DecompressByteArray: compressed_data_size=46678 data_size=252912, decompressed_data_size=252912 diff=0
[21:27:39] - Digital signature verified
[21:27:39]
[21:27:39] Project: 5768 (Run 8, Clone 47, Gen 8)
[21:27:39]
[21:27:39] Assembly optimizations on if available.
[21:27:39] Entering M.D.
[21:27:46] Working on Protein
[21:27:46] Client config found, loading data.
[21:27:46] mdrun_gpu returned
[21:27:46] NANs detected on GPU
[21:27:46]
[21:27:46] Folding@home Core Shutdown: UNSTABLE_MACHINE
[21:27:49] CoreStatus = 7A (122)
[21:27:49] Sending work to server
[21:27:49] Project: 5768 (Run 8, Clone 47, Gen 8)
[21:27:49] - Read packet limit of 540015616... Set to 524286976.
[21:27:49] - Error: Could not get length of results file work/wuresults_04.dat
[21:27:49] - Error: Could not read unit 04 file. Removing from queue.
[21:27:49] EUE limit exceeded. Pausing 24 hours.

I have had 2 other EUEs with the new work units on the GTX 260s; sorry, I don't have the logs for those.
Anyone else having problems?
 
Got one crunching now (R3,C35,G2) that's almost finished (90%). This is on a 9800GX2.
 
I have completed many of them but a few have just failed from the beginning with a NANs detected on GPU error.
 
Yeah, I've had a few of those start, get to about 3%, and EUE, but I've had more work than fail, so I dunno. It's pretty much hit and miss.


*edit* In reply to the post below mine: I'm using 180.84.
 
I've completed quite a few with no EUEs on my 280s.


P.S. I'm using the 178.24 drivers.
 
About 4k PPD on the 9600GT (G94) 512mb. Not sure on the others; they're doing other things or EUEing.

EUE on Project: 5768 (Run 8, Clone 9, Gen 9) running Vista32, 181.00 drivers, 8800 GTS (G92) 512mb (very stable).
Code:
[00:38:40] 
[00:38:40] *------------------------------*
[00:38:40] Folding@Home GPU Core - Beta
[00:38:40] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[00:38:40] 
[00:38:40] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[00:38:40] Build host: amoeba
[00:38:40] Board Type: Nvidia
[00:38:40] Core      : 
[00:38:40] Preparing to commence simulation
[00:38:40] - Looking at optimizations...
[00:38:40] - Created dyn
[00:38:40] - Files status OK
[00:38:40] - Expanded 46616 -> 252912 (decompressed 542.5 percent)
[00:38:40] Called DecompressByteArray: compressed_data_size=46616 data_size=252912, decompressed_data_size=252912 diff=0
[00:38:40] - Digital signature verified
[00:38:40] 
[00:38:40] Project: 5768 (Run 8, Clone 9, Gen 9)
[00:38:40] 
[00:38:40] Assembly optimizations on if available.
[00:38:40] Entering M.D.
[00:38:46] Working on Protein
[00:38:47] Client config found, loading data.
[00:38:47] mdrun_gpu returned 
[00:38:47] NANs detected on GPU
[00:38:47] 
[00:38:47] Folding@home Core Shutdown: UNSTABLE_MACHINE
[00:38:51] CoreStatus = 7A (122)
[00:38:51] Sending work to server
[00:38:51] Project: 5768 (Run 8, Clone 9, Gen 9)
[00:38:51] - Read packet limit of 540015616... Set to 524286976.
[00:38:51] - Error: Could not get length of results file work/wuresults_01.dat
[00:38:51] - Error: Could not read unit 01 file. Removing from queue.
[00:38:51] EUE limit exceeded. Pausing 24 hours.
 
The new 353 pointers give me around 6k-7k on my 9800GTX's/9800GTX+'s/9800GX2's.
The old 384 pointers give me around 5k-6k on my 9800GTX's/9800GTX+'s/9800GX2's.
The slow 511 pointers give me around 3.5k-4k on my 9800GTX's/9800GTX+'s/9800GX2's.

I like these new proteins.

Luck ............ :D
 

As long as they offer a mix, I'm golden.

 
I have a 5766 on my 8800GT right now over 90% done. 6300PPD, 50 sec. frames.
685/990/1795, 53c, Duorb.
Also have a 5769 on the 260GTX at 62% done. 8000PPD, 38 sec. frames.
648/1026/1540, 63c, OEM Cooling, 100% fan.
All on 180.48 drivers. I tried the 180.84 betas on my main rig, resulting in a BSOD at the desktop within seconds of boot.

I hope this works out for you Razor. I know how damned frustrating it can get.
 
I'm not really frustrated, more concerned as to why I seem to be one of the few that is having some errors with these units.
The nice thing is that it happens right from the start of the unit instead of halfway through or at the end.
Just thought my babysitting days were over, but I guess I better think again.
They / I will eventually get it figured out. Thanks for the replies.
 
Oh I thought that was some kind of counter. XD

5757
Code:
[14:07:37] - Error: Could not get length of results file work/wuresults_09.dat
[14:07:37] - Error: Could not read unit 09 file. Removing from queue.

Every other:
Code:
[13:45:43] Run: exception thrown during GuardedRun
[13:45:43] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.

And EUE after 8 hours every time.
 
I am also having similar issues with 5766s. The unit starts, detects NANs, then EUEs and pauses for 24 hours. This is happening on one computer only, with 2 9600 GSOs on Vista 64, and it continues to do so even after removing all overclocking except the fan, lol. I have deleted the work directories and the normal files in the main directory to no avail. It may run one work unit and then EUE, or three, but sometime during a 24-hour period one or both will EUE. Yes, the babysitting is a PITA. Running 178.42 drivers. Hopefully this behavior will calm down soon; it's been doing this since the faster 57xx series came down the pike. Just want to let Razor know he is not alone. (comfort)

 
Thanks for that Hgradio.
I have noticed that on my systems it's always the first card (top card, next to the north bridge) that EUEs.
I have even swapped the top card for one that wasn't having problems, and then that one starts EUE'ing, while the one I removed and put in a different system stops EUE'ing... weird.
 
I'm also getting EUEs on one card, but this didn't start happening until I rebooted the system. I think it might be related to something entirely different since the WU was more than half-way through without issue before I rebooted. Don't understand what may have happened afterwards. If I suspect it has something to do with the new WUs I will post about it. Same thing if I notice my other cards/systems begin to give me issues. Whatever it is, I hope it passes for all concerned.

/crosses fingers
 
I've run a number of the 5766s on my overclocked 8800GT and haven't had any trouble with them that I've noticed.

From the look of things, it seems like many of the people getting EUEs with them are running GTX260s or GTX280s. One other factor could be the run/clone/gen. I don't know which ones I have had so I can't post what they are, but I've seen more than once where only certain run/clone/gens have been unstable. This could be the case here.
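
If anyone wants to check the run/clone/gen idea against their own logs, here is a rough Python sketch that tallies UNSTABLE_MACHINE shutdowns per work unit. It only assumes the log line formats pasted earlier in this thread; the function name tally_unstable_units and the sample text are made up for illustration, and none of this is part of the Folding@home client.

```python
import re
from collections import Counter

# Matches lines such as "[13:26:06] Project: 5766 (Run 2, Clone 63, Gen 0)".
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")


def tally_unstable_units(log_text):
    """Count UNSTABLE_MACHINE shutdowns per (project, run, clone, gen)."""
    failures = Counter()
    current = None  # most recent work unit seen in the log
    for line in log_text.splitlines():
        match = PROJECT_RE.search(line)
        if match:
            current = tuple(int(g) for g in match.groups())
        elif "Core Shutdown: UNSTABLE_MACHINE" in line and current is not None:
            failures[current] += 1
    return failures


# Hypothetical sample, trimmed from the kind of log lines posted above.
sample = """\
[13:26:06] Project: 5766 (Run 2, Clone 63, Gen 0)
[13:26:13] NANs detected on GPU
[13:26:13] Folding@home Core Shutdown: UNSTABLE_MACHINE
[13:26:17] Project: 5766 (Run 2, Clone 63, Gen 0)
[21:27:39] Project: 5768 (Run 8, Clone 47, Gen 8)
[21:27:46] NANs detected on GPU
[21:27:46] Folding@home Core Shutdown: UNSTABLE_MACHINE
"""

for unit, count in sorted(tally_unstable_units(sample).items()):
    print("Project %d (Run %d, Clone %d, Gen %d): %d EUE(s)" % (unit + (count,)))
```

Feed it the text of your own log instead of the sample. If the same run/clone/gen keeps showing up across different machines, that would point at the work unit rather than the hardware.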

 
Hgradio, Razor, you're not alone. My 260gt/Vista32 has been like this for quite some time. So much, in fact, that I grow tired of babysitting the thing and stop folding. No amount of tweaking GPU/fan, or lack of tweaking, helps.

I'm skeptical it's my hardware. When I get one, I can close the GPU, wait 5 minutes, restart, and it runs to completion without incident. Crap shoot as to what happens next. Sometimes it chews up WUs without issue, other times it EUEs right out of the box.

Uninstall, reinstall, delete Work folder and Queue: been there, done that, and still suffering. PPD is like crap this year because of this.
 

Same problem here.
 

I too am still having the same problem. I have changed my CPU oc, my shaders, my core, my memory. I have set everything back to stock just to see what would happen and still it doesn't fix things. What could be causing it? You will see on the weekends my PPD goes up considerably because I can restart folding every few hours if I get one of these. However, often I get home from work just to find out that I EUE'ed while I was on my way out the door and nothing computed all day.

Not to sound too much like a complainer, but I never really bitched much about SMP. I had questions but didn't bitch. This seems like a situation where there are quite a few people having problems but not enough for them to really really try on. And I know it is frustrating, and they might be trying, but I certainly still can't leave my GPU alone for weeks at a time like I used to.

Xil, is there any news on this? Is it my power supply? :rolleyes:

 
I'm running 17 GPU clients on my various 9800 GTX/GX2 cards.
They are all shader overclocked: GX2s to 1728 MHz, GTXs to 1890 MHz, GTX+s to 1944 MHz.
I'm getting less than 1 per 1,000 WUs EUE'ing, so I call them stable on my boxen.

One possibility is that, as the 353 pointers stress the cards a bit more, something like the VRMs is overheating.
Can you run the FurMark stress test on your card? That will stress your cards more than folding will.

Luck ............. :D
 

Hey thanks for the reply Tiger. It has been a while since I used that program. When I first got this card I used it to set up my initial overclock. I actually set my card up to its highest settings in Furmark and then took the shaders down a notch.

Could something have happened that would have caused my card to need a re-testing? I guess as another effort I can re-do that just to see if something has changed somehow.

I have not changed drivers and I am not even sure what drivers I am running. I am at work right now, but I really do want to address this. Before the problems started I was really close to 10k ppd. Now there are some days where I am lucky to get 3k, much of which comes from SMP.

Tonight I will try to work on it some.
 
I have had zero problems with the GPU2 clients for several weeks. If you are still having problems, I suggest you wipe the boxen and do a fresh install with the latest driver and client. Like that, we can be sure to remove any possible cause.
 

Thanks. I will try that too.

As far as drivers go, I have 2 questions:

1) What drivers are you using right now for your GTX 260s?
2) There is quite a bit of debate about driver cleaners. Do you use them?
 
If you are still having problems, I suggest you wipe the boxen and do a fresh install with the latest driver and client. Like that, we can be sure to remove any possible cause.

A bit Draconian. I don't dispute the purpose, but I have done this once before with the same result. While I want to Fold to help others, I use this PC for other work (my job stuff) as well as family-oriented purposes. I'm not that motivated to reinstall all the apps/proggies for the sake of Folding.

This is where I draw the line. If Folding does not find my PC sufficient for its use, even though other apps run without issue and no fears of BSODs, then perhaps I should stop folding (on this PC) until things "get better".

+++

Tried the FurMark test and ran for hours without issues.
BTW: Once the (GPU)WU starts crunching, it runs to completion. I would think the power subsystem would freak (maybe) mid way rather than at the end of the WU. 1-5 minutes after closing the GPU, it folds off to the end when the cycle begins again.
 
Wheresatom, my secondary box is running the 178.24's Vista 64 bit on 2 GTX280s.
My main is running the 182.06's Vista 64 on 2 GTX280s and a 285. The 178.24s give higher PPD than any of the newer versions released after them...
 
How is it a stability problem when the EUEs happen at the start of a new WU? If it does not EUE, then it folds fine.
 
So now that I've pissed and moaned about EUE's being the norm, my uptime is pretty solid... Haven't changed a thing. Go figure..
 