GPU folding problem

kirk121

Hi, I am running Server 2003 with the latest drivers, and Windows is updated to the latest patches. The machine has an ATI M3A78 motherboard, an AM2 dual-core processor, and an Nvidia 9800 GT video card. Every now and then I get a GPU error where the GPU folding client seems to hang, and when I restart the client I get an error that it can't find cudart.dll. All I have to do is log out and log back in, and everything works fine. I am running the 181.20 video drivers for my card. I am wondering which drivers would be best for Server 2003 32-bit, since Nvidia does not seem to make any drivers for Server 2003 other than for the 64-bit version.

With some of the newer projects for the GPU client I have also been getting:
mdrun_gpu returned
NANs detected on GPU
Folding@home Core Shutdown: UNSTABLE_MACHINE
CoreStatus = 7A (122)
Sending work to server

What I find strange is that my server can run fine for months with no problems, and I am running the latest GPU client version. It seems to happen most often when I get work units from Project: 5904, Project: 5903, and sometimes Project: 5902.

Do you think it is a driver problem, or something with the new work units coming out?


Thanks
 
The cudart.dll thing is an easy fix. Go to Documents and Settings/user name/Application Data/foldingathome-gpu (or whatever your folder is called). In there you will see cudart.dll; just put it in the main F@H GPU folder where you installed the client.
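If you would rather script that copy, here is a minimal Python sketch. The function name `copy_cudart` and both directory arguments are placeholders I made up for illustration; substitute your own application data folder and install folder, since the exact folder names vary between setups.

```python
import shutil
from pathlib import Path


def copy_cudart(appdata_dir, install_dir, dll_name="cudart.dll"):
    """Copy the CUDA runtime DLL into the folder where the client is installed.

    appdata_dir and install_dir are hypothetical arguments: pass the
    per-user client data folder and the main F@H GPU install folder.
    """
    src = Path(appdata_dir) / dll_name
    dst = Path(install_dir) / dll_name
    if not src.is_file():
        # Fail loudly instead of silently doing nothing.
        raise FileNotFoundError(f"{dll_name} not found in {appdata_dir}")
    shutil.copy2(src, dst)  # copy2 also preserves the file timestamps
    return dst
```

Call it once with your two folders, then restart the client. It only copies the file, so the original in the application data folder stays where it is.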
 
OK, thanks, I will try that. I was wondering what the best drivers are for my OS, since it is not XP. I just want to make sure that a driver problem is not what is causing my issues.

thanks
 
I'm running into GPU folding failures as well.

I re-read the "How-To" guide and realized I didn't have my client configs set to "big".

Now I'm seeing:

[15:16:45] mdrun_gpu returned
[15:16:45] Nonzero force sum on GPU
[15:16:45]
[15:16:45] Folding@home Core Shutdown: UNSTABLE_MACHINE
[15:16:47] CoreStatus = 7A (122)
[15:16:47] Sending work to server
[15:16:47] Project: 5738 (Run 3, Clone 64, Gen 77)
[15:16:47] - Read packet limit of 540015616... Set to 524286976.
[15:16:47] - Error: Could not get length of results file work/wuresults_07.dat
[15:16:47] - Error: Could not read unit 07 file. Removing from queue.

It's happened once on one GPU of my X2, it's happened 3 times on hers, and I lost a work unit that was 89% done on one of my 2600s.
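To see whether the failures cluster on particular projects, a log can be tallied with something like the Python sketch below. It is only a rough sketch: the function name `count_unstable_by_project` is made up, and it assumes the first `Project:` line after an `UNSTABLE_MACHINE` shutdown names the unit that failed, which matches the log pasted above but may not hold for every client version.

```python
import re
from collections import Counter

# Leading "[HH:MM:SS] " timestamp, as in the log lines quoted in this thread.
STAMP = re.compile(r"^\[\d{2}:\d{2}:\d{2}\]\s?")
PROJECT = re.compile(r"Project: (\d+)")


def count_unstable_by_project(lines):
    """Count UNSTABLE_MACHINE shutdowns, keyed by project number.

    Assumption: the next "Project: N" line after a shutdown identifies
    the work unit that failed.
    """
    counts = Counter()
    pending = 0  # shutdowns seen but not yet matched to a project
    for raw in lines:
        text = STAMP.sub("", raw.strip())
        if "Core Shutdown: UNSTABLE_MACHINE" in text:
            pending += 1
            continue
        match = PROJECT.search(text)
        if match and pending:
            counts[match.group(1)] += pending
            pending = 0
    return counts
```

Feed it the lines of the client log file (for example `open(path).readlines()`, with `path` pointing at your own log) and compare the counts across machines. If one project dominates on every card, the work units are the more likely suspect; if one card fails across many projects, look at that card first.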
 
Try setting the Folding@Home .exe file to XP compatibility mode. That fixed a similar issue I had.
 
I'll keep that in mind for the Windows 7 boxes.

The 2600 that failed with that error is on XP, however :)
 
I've gone ahead and set XP compatibility mode on the .exe files of the X2s, as I lost the jobs on them again.

I'll update on the results!
 
Yeah, the compatibility mode is making it worse. Hers won't even make it to 1% and mine is dumping at 6%.

This is becoming quite the challenge :)
 
I dunno then. Try deleting the Fahcore_11.exe files and letting the clients redownload them. Also make sure that you're using the latest drivers, and that all the cards are set to stock speeds. If that fails, I'm out of troubleshooting ideas.
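If there are several client folders to clean out, the delete step could be scripted roughly like this in Python. The name `remove_core_files` and the `dry_run` switch are my own invention, and the folder list is whatever your setup uses. I would stop the clients before running it, since Windows normally refuses to delete an executable that is still running.

```python
from pathlib import Path


def remove_core_files(client_dirs, core_name="Fahcore_11.exe", dry_run=True):
    """Find (and optionally delete) the core executable in each client folder.

    client_dirs is a hypothetical list of client folders. With dry_run=True
    nothing is deleted; the matching paths are only reported.
    """
    matches = []
    for folder in client_dirs:
        for path in Path(folder).iterdir():
            # Compare case-insensitively, since the file name's casing may differ.
            if path.is_file() and path.name.lower() == core_name.lower():
                if not dry_run:
                    path.unlink()
                matches.append(path)
    return matches
```

Run it once with the default `dry_run=True` to check what it would touch, then again with `dry_run=False`. The clients should fetch a fresh core on the next start, as described above.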
 
Well, back on topic: is there any set of drivers that I should be using to make Server 2003 more stable with GPU folding?
 