Strange GTX 295 slowdowns

APOLLO

I have a GTX 295 card that has been experiencing slowdowns and crashes on one GPU no matter what frequency I set the card at. Even at stock speeds one client will be processing at slower speeds, sometimes more than 50% slower than the second GPU, or it fluctuates up and down wildly. What could be causing this? MSI AF reports full GPU usage with exactly the same clock frequencies for both GPUs, and even very close to the same temps (1-2°C difference). It is counterintuitive to have identical operating parameters and conditions but to see half the performance. WTH is going on? Is the card breathing its last?? :confused:
 
Sounds like it's dying.
OK, it's exactly what crossed my mind but if I use the GPU3 client on GPU#1 of this card, there are no slowdowns and no wild fluctuations in performance. The card processes GPU3 WUs with stability and consistency. It makes no freaking sense. I have used the card since the beginning of fall with no issues.
 
Is this a dedicated folder or a normal use box?

One of my mixed bigadv/GPU rigs recently got so fouled up in the CPU affinity department that both of my GPUs were producing nearly exactly 50% of their normal PPD. If I killed the bigadv client the GPUs quickly resumed 100% PPD.

I moved those GPUs to a different box (something I had wanted to do anyway) and applied a fresh coat of Windows paint and the GPUs are back up to 100%.
 
Is this a dedicated folder or a normal use box?
Dedicated.

One of my mixed bigadv/GPU rigs recently got so fouled up in the CPU affinity department that both of my GPUs were producing nearly exactly 50% of their normal PPD. If I killed the bigadv client the GPUs quickly resumed 100% PPD.
Since this is a -bigadv box that also crossed my mind. Why this should be happening now after all this time is beyond me. And why this isn't affecting the card if I switch the problematic GPU core over to the GPU3 client is also weird. Worse comes to worst, I could always leave the card with a mix of GPU2/GPU3 clients and just take the 2000 PPD hit.

I moved those GPUs to a different box (something I had wanted to do anyway) and applied a fresh coat of Windows paint and the GPUs are back up to 100%.
I don't have that option. BTW, 7im recommends never using an affinity utility and only adjusting priority levels in the client itself, because the clients apparently have hidden settings for all Windows priority levels. Maybe I should give this a try and drop affinity tools altogether. Right now, all GPU clients and the -bigadv client are set to priority=96 and modified through an affinity tool.
 
My GTX 285s will occasionally drop down to 2D mode from 3D and cut production in half. It requires a restart and then it is fine. I've never been able to figure out WHY it does it, but I know that restarting the rig fixes it. Don't know how to restart a GPU ;)

 
My GTX 285s will occasionally drop down to 2D mode from 3D and cut production in half. It requires a restart and then it is fine. I've never been able to figure out WHY it does it, but I know that restarting the rig fixes it. Don't know how to restart a GPU ;)
I seriously doubt it's 2D mode. MSI AF reports the same frequency for both GPUs and temps are nearly identical. I would expect if 2D mode kicked in both GPUs on a dual-GPU card would be affected, but I'm not entirely sure of that. I have restarted the system several times throughout this ordeal and the problem resurfaces in less than an hour. Also, as I posted earlier GPU3 is not affected. It's entirely a GPU2 client issue and only one of the two GPUs on this card has the problem, so it's definitely not 2D mode if these observations are correct.
 
I'm sure you know this, but just in case: make sure the priority is higher than the bigadv client's.
 
I'm sure you know this, but just in case: make sure the priority is higher than the bigadv client's.
Yep, it's been higher since I installed the clients.

BTW, another curious but more serious issue related to this situation surfaced yesterday that has me even more baffled. Now, if I run two GPU2 clients on this card the system will lose video and my clients will freeze. That's right, my monitor will go blank! If I somehow manage to revive the display (not easy), both clients on this card EUE at that point and the work is lost. Note, this does not occur with GPU3... :confused:

So, I am now forced to run the card with GPU3 on one client and GPU2 on the other client as a compromise for minimum PPD loss, which is what I've been doing all week anyway to escape from the slowdowns and crashes. The only thing I haven't tried is a driver reinstall, but I did try a restore point from the first week of December before the issues began and that didn't help at all. :(
 
Doesn't sound good. It's not a BFG is it?

I still suggest nuking and re-installing Windows as the next step.
 
Doesn't sound good. It's not a BFG is it?
Yes, AAMOF it is. BFG GTX 295 single PCB.

I still suggest nuking and re-installing Windows as the next step.
Don't think I'll go that far unless GPU2 on the second client begins to give me the same issues. I am slated for a driver reinstall as soon as the current -bigadv completes. That will be my last bit of tampering with this system. It is not worth further hassle.

As it stands right now, I have effectively lost ~2000 PPD from moving over to the GPU3 client instead of GPU2. There are some minor, albeit insignificant, advantages as long as the problem, whatever it is, does not become more serious. The advantages of GPU3 are that it reduces temps by 2-3°C on the chip that runs it, and the -bigadv TPFs are reduced by about half a minute depending on variables. IOW, I can live with the compromise even if I am perturbed and perplexed by the whole situation.
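
For anyone who wants to put rough numbers on a PPD hit like this from frame times, here is a minimal sketch. It assumes a WU is reported in 100 frames and that points are flat per WU (no bonus scheme, so it does not model -bigadv bonuses). The function names and the example values are hypothetical and are not taken from any client or from this thread.

```python
SECONDS_PER_DAY = 86_400
FRAMES_PER_WU = 100  # assumption: progress is reported in 100 frames per WU


def ppd_from_tpf(points_per_wu, tpf_seconds, frames_per_wu=FRAMES_PER_WU):
    """Estimate points per day from time per frame (TPF), ignoring bonuses."""
    if tpf_seconds <= 0:
        raise ValueError("tpf_seconds must be positive")
    # WUs finished per day = seconds in a day / seconds needed for one WU
    return points_per_wu * SECONDS_PER_DAY / (tpf_seconds * frames_per_wu)


def slowdown_percent(normal_tpf, observed_tpf):
    """Percent drop in throughput when TPF rises from normal to observed."""
    if normal_tpf <= 0 or observed_tpf <= 0:
        raise ValueError("TPF values must be positive")
    return (1 - normal_tpf / observed_tpf) * 100


if __name__ == "__main__":
    # Hypothetical WU worth 500 points, normally 60 s per frame.
    healthy = ppd_from_tpf(500, 60)
    degraded = ppd_from_tpf(500, 120)
    print(f"healthy:  {healthy:.0f} PPD")
    print(f"degraded: {degraded:.0f} PPD")
    print(f"slowdown: {slowdown_percent(60, 120):.0f}%")
```

Doubling the frame time halves the estimate, which matches the "half the performance" symptom described earlier in the thread.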

Another curious thing I had noticed and evidence something adverse was going on: whenever I had two GPU2 clients operating, the GPU usage monitoring graph in Afterburner for the problematic client displayed frequent sharp dips as opposed to an almost horizontal line for the second client. This does not occur if I switch to the GPU3 client on the problematic chip. Freaking weird. :confused:
 
I've seen odd dips in GPU usage in Afterburner across a wide variety of GPU types and client types (GPU2/GPU3).
 
Just out of curiosity, which system is it in - not your Skulltrail by any chance??
 
I've seen odd dips in GPU usage in Afterburner across a wide variety of GPU types and client types (GPU2/GPU3).
That's funny, because on all my other GPU clients I don't see these kinds of dips. By dips I mean all the way down to ~0% and it shoots back up, or the lines will be highly jagged. I think it depends on the WU. Suffice to say it is anything but a steady line I see with all my other clients, and when I switch over to GPU3. It is a big difference when viewed comparatively.
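
To compare a "steady line" with a "jagged" one more objectively than by eyeballing the graph, a rough sketch along these lines could be run over exported usage samples. It assumes the readings are already available as a plain list of percentages; it does not parse any Afterburner log format, and the function names, the 10% floor, and the sample data are all hypothetical.

```python
def find_dips(samples, floor=10.0):
    """Return (start, end) index pairs for runs of samples at or below floor."""
    dips = []
    start = None
    for i, usage in enumerate(samples):
        if usage <= floor:
            if start is None:
                start = i  # a new dip begins here
        elif start is not None:
            dips.append((start, i - 1))  # usage recovered, close the dip
            start = None
    if start is not None:
        dips.append((start, len(samples) - 1))  # still dipped at end of data
    return dips


def jaggedness(samples):
    """Mean absolute change between consecutive samples (0 = flat line)."""
    if len(samples) < 2:
        return 0.0
    steps = [abs(b - a) for a, b in zip(samples, samples[1:])]
    return sum(steps) / len(steps)


if __name__ == "__main__":
    steady = [98, 99, 98, 99, 98]        # made-up "horizontal line" client
    dippy = [98, 2, 0, 97, 99, 1, 98]    # made-up client dropping to ~0%
    for name, data in (("steady", steady), ("dippy", dippy)):
        print(name, find_dips(data), round(jaggedness(data), 1))
```

Running the same two functions over both GPUs' samples from the same time window would show whether the dips really are confined to one client.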

Just out of curiosity, which system is it in - not your Skulltrail by any chance??
Yes, you guys have my hardware down pat. :cool:
 
Just out of curiosity, which system is it in - not your Skulltrail by any chance??

Yes, you guys have my hardware down pat. :cool:

Not sure how possible this is but have you tried the card in a different system? At this time I am tempted to say that the card is OK and the problem lies somewhere else in the system.

From your recent postings I am guessing that most of your current problems stem from that one system :(
 
Not sure how possible this is but have you tried the card in a different system? At this time I am tempted to say that the card is OK and the problem lies somewhere else in the system.
It could very well be the case. Unfortunately, it is very difficult for me to find an appropriate system to install it in because of its size. And I'm loath to invest further troubleshooting effort in a system that has been nothing but one headache after another since I built it.

That said, I will reinstall the drivers seeing how that should only require ~15 minutes or thereabouts. If it works, great. If not, I'll stick with GPU3 on one client for the time being until this spring when it looks like I'll be leaving for reasons stated elsewhere ad nauseam.

From your recent postings I am guessing that most of your current problems stem from that one system :(
Yeah, alas, been in partial denial because of the two years of travail with it. At any rate, it didn't have weird GPU problems until this situation happened rather recently. My other problems with this system in the past stemmed from CPU performance.

Odd as it may seem because of the product's intended purpose, up until this summer I had relegated it to a quad-GPU platform due to the major CPU issues. I finally came upon a processor combo that worked as advertised, and I was able to run -bigadv for the first time only this past August. It was my fastest 'potential' system with all the OC features but never realized until then - VERY late in the game. :mad: :(

BAAAAAAAKE IT!!!!!!!!!!
Will a heat gun do...? :eek:
 
Make sure it's a Big F'ing heat gun to ensure optimum results.
 
Will a heat gun do...? :eek:

Oven biatch!!!!!!

350 degrees of pain!
 
Last night I reinstalled the NVIDIA drivers for this machine, and at first the system had problems similar to those I described earlier in the thread. I then went into the NVIDIA Control Panel and enabled multi GPU mode. After doing this, the problems seemed to end and I was finally able to run two GPU2 clients as I had done for months prior to this issue, without slowdowns, crashes and monitor blanking.

Now the question is why? I distinctly remember having multi GPU mode turned off before, when I had initially installed the card, and I never touched the setting thereafter. From what I've been reading, this mode should be disabled. I even distinctly remember having multi GPU disabled when I was folding on my 9800 GX2 for half a year. There seems to be mixed advice on this setting for the GTX 295 and dual-GPU cards in general. :confused:
 
As the drivers have evolved, the compatibility of this setting with F@H has changed. Back in the old days (1-2 years ago) it was necessary to enable SLI to be able to fold on GTX 295s. Before that it was impossible in Vista/Win7 without hacking the registry, jerry-rigging dummy plugs, or other dubious methods. Then NVIDIA finally developed solid CUDA support in their Vista/Win7 drivers, and it was no longer necessary to enable SLI. However, there were some reports that SLI now caused problems, so it was recommended to disable SLI while folding. I don't know the status of the latest drivers, but you seem to indicate the problem is still there somewhat.
 
As the drivers have evolved, the compatibility of this setting with F@H has changed. Back in the old days (1-2 years ago) it was necessary to enable SLI to be able to fold on GTX 295s. Before that it was impossible in Vista/Win7 without hacking the registry, jerry-rigging dummy plugs, or other dubious methods. Then NVIDIA finally developed solid CUDA support in their Vista/Win7 drivers, and it was no longer necessary to enable SLI. However, there were some reports that SLI now caused problems, so it was recommended to disable SLI while folding. I don't know the status of the latest drivers, but you seem to indicate the problem is still there somewhat.
Yes, it apparently is, but only when it feels like becoming a problem, if you catch my drift. IOW, it wasn't a problem for a couple of months and then apparently it became one. The driver set I've been using all along is v258.96. Now you know part of the reason I've been at the end of my rope with F@H: I get comparatively crap compensation for a boatload of effort, and I still cannot resolve all of these unnecessary and undue entanglements.
 