Looks like I just lost a 9800GX2 :(

Elledan

[H]ard|DCer of the Month - April 2010
See log:
Code:
[07:26:37] Loaded queue successfully.
[07:26:37] Initialization complete
[07:26:37] - Preparing to get new work unit...
[07:26:37] + Attempting to get work packet
[07:26:37] - Connecting to assignment server
[07:26:38] - Successful: assigned to (171.64.65.71).
[07:26:38] + News From Folding@Home: Welcome to Folding@Home
[07:26:38] Loaded queue successfully.
[07:26:43] + Closed connections
[07:26:43] 
[07:26:43] + Processing work unit
[07:26:43] Core required: FahCore_11.exe
[07:26:43] Core found.
[07:26:43] Working on queue slot 08 [November 4 07:26:43 UTC]
[07:26:43] + Working ...
[07:26:43] 
[07:26:43] *------------------------------*
[07:26:43] Folding@Home GPU Core
[07:26:43] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[07:26:43] 
[07:26:43] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[07:26:43] Build host: amoeba
[07:26:43] Board Type: Nvidia
[07:26:43] Core      : 
[07:26:43] Preparing to commence simulation
[07:26:43] - Looking at optimizations...
[07:26:43] DeleteFrameFiles: successfully deleted file=work/wudata_08.ckp
[07:26:43] - Created dyn
[07:26:43] - Files status OK
[07:26:43] - Expanded 81805 -> 421543 (decompressed 515.3 percent)
[07:26:43] Called DecompressByteArray: compressed_data_size=81805 data_size=421543, decompressed_data_size=421543 diff=0
[07:26:43] - Digital signature verified
[07:26:43] 
[07:26:43] Project: 10111 (Run 573, Clone 6, Gen 4)
[07:26:43] 
[07:26:43] Assembly optimizations on if available.
[07:26:43] Entering M.D.
[07:26:49] Tpr hash work/wudata_08.tpr:  1923135042 1456662805 2701169512 640929405 384098567
[07:26:49] 
[07:26:49] Calling fah_main args: 14 usage=100
[07:26:49] 
[07:26:50] Working on 1174 p10111_ubiquitin_300K
[07:26:53] Client config found, loading data.
[07:26:54] Starting GUI Server
[07:26:54] Run: exception thrown during GuardedRun
[07:26:54] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[07:26:54] Going to send back what have done -- stepsTotalG=10000000
[07:26:54] Work fraction=0.0000 steps=10000000.
[07:26:58] logfile size=10480 infoLength=10480 edr=0 trr=23
[07:26:58] + Opened results file
[07:26:58] - Writing 11016 bytes of core data to disk...
[07:26:58] Done: 10504 -> 3906 (compressed to 37.1 percent)
[07:26:58]   ... Done.
[07:26:58] DeleteFrameFiles: successfully deleted file=work/wudata_08.ckp
[07:26:58] 
[07:26:58] Folding@home Core Shutdown: UNSTABLE_MACHINE
[07:27:01] CoreStatus = 7A (122)
[07:27:01] Sending work to server
[07:27:01] Project: 10111 (Run 573, Clone 6, Gen 4)
[07:27:01] - Read packet limit of 540015616... Set to 524286976.


[07:27:01] + Attempting to send results [November 4 07:27:01 UTC]
[07:27:02] + Results successfully sent
[07:27:02] Thank you for your contribution to Folding@Home.
[07:27:06] - Preparing to get new work unit...
[07:27:06] + Attempting to get work packet
[07:27:06] - Connecting to assignment server
[07:27:07] - Successful: assigned to (171.64.65.71).
[07:27:07] + News From Folding@Home: Welcome to Folding@Home
[07:27:07] Loaded queue successfully.
[07:27:09] + Closed connections
[07:27:14] 
[07:27:14] + Processing work unit
[07:27:37] Core required: FahCore_11.exe
[07:27:37] Core found.
[07:27:37] Working on queue slot 09 [November 4 07:27:37 UTC]
[07:27:37] + Working ...
[07:27:47] 
[07:27:47] *------------------------------*
[07:27:47] Folding@Home GPU Core
[07:27:47] Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
[07:27:47] 
[07:27:47] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86 
[07:27:47] Build host: amoeba
[07:27:47] Board Type: Nvidia
[07:27:47] Core      : 
[07:27:47] Preparing to commence simulation
[07:27:47] - Looking at optimizations...
[07:27:47] DeleteFrameFiles: successfully deleted file=work/wudata_09.ckp
[07:27:47] - Created dyn
[07:27:47] - Files status OK
[07:27:47] - Expanded 81873 -> 421543 (decompressed 514.8 percent)
[07:27:47] Called DecompressByteArray: compressed_data_size=81873 data_size=421543, decompressed_data_size=421543 diff=0
[07:27:47] - Digital signature verified
[07:27:47] 
[07:27:47] Project: 10111 (Run 896, Clone 4, Gen 20)
[07:27:47] 
[07:27:47] Assembly optimizations on if available.
[07:27:47] Entering M.D.
[07:27:53] Tpr hash work/wudata_09.tpr:  2998368304 2595630246 3535247247 900054740 1182158445
[07:27:53] 
[07:27:53] Calling fah_main args: 14 usage=100
[07:27:53] 
[07:27:54] Working on 1174 p10111_ubiquitin_300K
[07:27:54] mdrun_gpu returned 
[07:27:54] Self-test failure
[07:27:54] 
[07:27:54] Folding@home Core Shutdown: UNSTABLE_MACHINE
[07:27:57] CoreStatus = 7A (122)
[07:27:57] Sending work to server
[07:27:57] Project: 10111 (Run 896, Clone 4, Gen 20)
[07:27:57] - Read packet limit of 540015616... Set to 524286976.
[07:27:57] - Error: Could not get length of results file work/wuresults_09.dat
[07:27:57] - Error: Could not read unit 09 file. Removing from queue.
[07:27:57] - Preparing to get new work unit...
[07:27:57] + Attempting to get work packet
[07:27:57] - Connecting to assignment server
[07:27:58] - Successful: assigned to (171.64.65.71).
[07:27:58] + News From Folding@Home: Welcome to Folding@Home
[07:27:58] Loaded queue successfully.
[07:28:00] + Closed connections
[07:28:05]

This goes on until the EUE limit is reached. A few days ago both clients were hanging: not EUEing, just pretending to be busy. That was fixed after I restarted both clients.
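
For anyone who wants to tally how often the core bails out in a log like the one above, here is a minimal Python sketch. It only relies on the two line formats visible in the pasted log ("Project: ..." and "Folding@home Core Shutdown: ..."). The function name `summarize_shutdowns` is made up for illustration, and the log file name in the usage note is an assumption, so adjust it to your install.

```python
import re
from collections import Counter

# Patterns taken from the lines shown in the pasted log.
SHUTDOWN_RE = re.compile(r"Folding@home Core Shutdown: (\w+)")
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")


def summarize_shutdowns(lines):
    """Count core shutdown statuses and pair each with the last work unit seen.

    Returns (Counter of statuses, list of ((project, run, clone, gen), status)).
    """
    counts = Counter()
    events = []
    current_wu = None  # most recent Project/Run/Clone/Gen seen so far
    for line in lines:
        project = PROJECT_RE.search(line)
        if project:
            current_wu = tuple(int(g) for g in project.groups())
            continue
        shutdown = SHUTDOWN_RE.search(line)
        if shutdown:
            counts[shutdown.group(1)] += 1
            events.append((current_wu, shutdown.group(1)))
    return counts, events


# A few lines copied from the log above, used as sample input.
sample = [
    "[07:26:43] Project: 10111 (Run 573, Clone 6, Gen 4)",
    "[07:26:58] Folding@home Core Shutdown: UNSTABLE_MACHINE",
    "[07:27:01] Project: 10111 (Run 573, Clone 6, Gen 4)",
    "[07:27:47] Project: 10111 (Run 896, Clone 4, Gen 20)",
    "[07:27:54] Folding@home Core Shutdown: UNSTABLE_MACHINE",
]

counts, events = summarize_shutdowns(sample)
for wu, status in events:
    print(wu, status)
print(dict(counts))
```

To run it against a real log, something like `summarize_shutdowns(open("FAHlog.txt"))` should work, assuming the client writes its log to a file of that name in its working directory. If every failure lands on the same project, the work unit is the more likely suspect; failures spread across projects point more toward the card or driver.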

I didn't change anything about this system or its configuration. Temperatures have never been as low as they are now, since I added the two front intake fans a few weeks ago, and the SMP client on the Q6600 runs without issues. The 9800GX2 is not OCed or tweaked in any way aside from having its shell removed (while wearing an anti-static bracelet and everything).

What could cause this card to start EUEing like this all of a sudden? Is it really dying?
 
1 reason.. g92 gpu.. looks like it's time to....



wait for it....



BAKE IT!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 
What are the temps? I have a large high-CFM 120mm fan over my two 9800GX2s and they put out a stupid amount of heat.
 
When I ran my GX2s I had to force the fans to 100% output to prevent the EUE. I later had to bake one of the GX2s, it turned out ... delicious.
 
What are the temps? I have a large high-CFM 120mm fan over my two 9800GX2s and they put out a stupid amount of heat.

Both cores were in the high 70s after I added those front intake fans.
 
I notice it is failing on one of the newer WUs. I'd try lowering your overclock by one step and see if that resolves the problem. Could be something about this new WU and your OC.
 
A quick Google search suggests that some people who had this problem deleted the client.cfg file as well as the work and queue folders. After restarting the client, the cards were working again.

Although with a G92 it could be on its way out. A quick cook in the oven or even on a hotplate may help.
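
If you want to try the client reset without losing anything, a minimal Python sketch like the one below moves the files aside instead of deleting them. The names in `STATE_NAMES` are assumptions: `client.cfg` and `work` come from the post and the log, while `queue.dat` is a guess at how the queue is stored on disk. The function name is made up for illustration. Stop the client before running it.

```python
import shutil
import time
from pathlib import Path

# Assumed names: client.cfg and work are mentioned in the thread;
# "queue.dat" is a guess at how the queue is stored on disk.
STATE_NAMES = ("client.cfg", "work", "queue.dat")


def set_aside_client_state(client_dir, names=STATE_NAMES):
    """Move the listed files/folders into a timestamped backup folder.

    Returns the backup folder path, or None if none of the names exist.
    The client should not be running while this is called.
    """
    client_dir = Path(client_dir)
    present = [client_dir / n for n in names if (client_dir / n).exists()]
    if not present:
        return None
    backup = client_dir / time.strftime("state_backup_%Y%m%d_%H%M%S")
    backup.mkdir()
    for path in present:
        # Moving rather than deleting keeps the old config recoverable.
        shutil.move(str(path), str(backup / path.name))
    return backup
```

Calling `set_aside_client_state(r"C:\FAH\GPU1")` (a hypothetical install path) and then restarting the client would have the same effect as deleting the files, with the option of copying `client.cfg` back if the reset does not help.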
 
I notice it is failing on one of the newer WUs. I'd try lowering your overclock by one step and see if that resolves the problem. Could be something about this new WU and your OC.

Like I said I don't OC this card :)

I'll try deleting the files mrbigshot pointed out.
 
1 reason.. g92 gpu.. looks like it's time to....



wait for it....



BAKE IT!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

OMFG, stop stealing my quotes! :)

Kidding, kidding..... but seriously







BAKE IT!
 
I have one to sell you for cheap prices if this one turns out to be bad :)
 
Just don't bake the card at the same time you're baking cookies ;)
 
One forced reboot (kicks cheap UPS) later, and the 9800GX2 is happily folding once more :)

So much for the peace and quiet in my room :(
 