MSI P67A-GD65: Burn In Stress Test Hard Freeze on 7th hour

OldM3ta

I experienced a hard screen lock while stress testing my machine overnight. The image on the screen was frozen. The mouse didn't move. The system was unresponsive. The Windows System Log for the time of the freeze was empty, with no Errors or Critical notices. I would like some comments please on what could have been the cause. Also, I would appreciate advice on how to resolve it so it doesn't happen on the next stress test.

*Hardware*
Intel Sandy Bridge i5-2500K
Corsair H50 cooler with Noctua NT-H1 Thermal Compound
MSI P67A-GD65 (B3) Latest BIOS 1.8
G.SKILL Ripjaws (4x2GB) DDR3 1600 (F3-12800CL9D-4GBRL x 2)
MSI N580GTX Twin Frozr II OC (1.025 core voltage)
Turtle Beach Montego DDL PCI Sound Card
GW-USNano USB WiFi-N device
Crucial C300 128GB SSD (o/s) - Attached to Sata III - AHCI
Seagate ST310000528AS 1TB HDD (page file, data) - Attached to Sata II - AHCI
Corsair CMPSU-850TX
Cooler Master CM 690 II Advanced Case with seven case fans

*Software*
Windows 7 Home Premium x64 SP1 Updated to 3/15/2011
- High Performance Power Plan, Turn off HD Never, USB Selective Suspend Setting Enabled, PCI Express LSPM Off
- Page file 2GB set on HDD, Hibernation and System Recovery Off
- Microsoft Security Essentials with Scheduled and Realtime Protection disabled
- DirectX June 2010 Redist
- Screen saver disabled, Turn off Display Never, Sleep Never
Intel Management Engine Components 7.0.0.1144
Intel Rapid Storage Technology 10.1.2.1004 with LPM disable registry fix
NVIDIA Geforce 266.58 WHQL
Real Temp 3.67 ( shows multiplier go from 17.0 to 41.5 [why 41.5?] )
Prime95 x64 26.5b5
MSI Afterburner 2.1.0
MSI Kombustor 2.0.0 (Stress tests GPU in windowed-mode, I selected OpenGL 3.3, 1600x900 window, 16xAA)

*Stress test*
Prime95 Torture Test All Four Cores in Blend Mode ("tests some of everything, lots of RAM tested")
MSI Kombustor Burn In Test Running 3D API GL3, 1600x900 window, 16xAA
RealTemp Running
MSI Afterburner Running

*Stress test results*
Hard freeze (system lock) at 07:00:02 into the stress test.
RealTemp showed lowest core temperature at 32C, highest core temperature at 64C, average core temperature 56C on load
Afterburner showed GPU full load temp steady at 60C
Prime95 - All cores appear to have reached the seventh hour of testing before the freeze. The last thing I see on the screen is:
"Self-Test 2800K Passed", and about two minutes after that the freeze seems to have occurred while testing the next set.

*BIOS Settings*
CPU Base Frequency (10KHz) 10000
CPU Ratio 33
Adjust CPU Ratio in O/S Enabled
Intel PLL Overvoltage Auto
EIST Enabled
Intel Turbo Boost Enabled
DRAM Frequency Auto
XMP Enabled
Adjust DRAM Frequency 1600MHz
Spread Spectrum Enabled
VDroop Control HighV vDroop
CPU Core Voltage 1.2v
I/O Voltage Auto
DRAM Voltage Auto
System Agent Voltage Auto
CPU PLL Voltage Auto
Active Processor Cores ALL
Limit CPUID Max Disabled
Execute Disable Bit Enabled
Intel Virtualization Technology Enabled
Power Technology Custom
EUP2013 Enabled
CPU Phase Control SVID
C1E Disabled
Intel C-State Enabled
Overspeed Protection Enabled
C-State Limit 70
Long Duration Power Limit 95W
Long Duration Maintained 1000ms
Short Duration Power Limit 118W
Primary Plane Turbo Power Limit 0
Ratio Limit 1-Core 37
Ratio Limit 2-Core 36
Ratio Limit 3-Core 35
Ratio Limit 4-Core 34
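
For reference, the clock speeds implied by these settings can be worked out as base clock times ratio. This is only a minimal sketch: the helper name `core_clock_mhz` is made up for illustration, and it assumes the BIOS value 10000 really is in 10 kHz units as the label says.

```python
# Base clock as entered in the BIOS, in units of 10 kHz (from the list above).
BCLK_10KHZ = 10000


def core_clock_mhz(ratio, bclk_10khz=BCLK_10KHZ):
    """Return the core clock in MHz for a given CPU ratio (hypothetical helper)."""
    bclk_mhz = bclk_10khz * 10 / 1000  # 10 kHz units -> kHz -> MHz
    return ratio * bclk_mhz


if __name__ == "__main__":
    readings = [
        ("CPU Ratio", 33),
        ("Ratio Limit 1-Core", 37),
        ("Ratio Limit 4-Core", 34),
        ("Real Temp reading", 41.5),
    ]
    for label, ratio in readings:
        print(f"{label}: {ratio}x -> {core_clock_mhz(ratio):.0f} MHz")
```

By this arithmetic, the 41.5x that Real Temp reports is well above the highest turbo ratio configured here (37x).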
 
The system was running at *stock* when it crashed? 3.3 ghz and 1.2 vcore?

Why were you stress testing a stock system, anyway? That seems a bit bizarre, unless you suspect something is defective.... most of us stress overclocked systems.

"Usually" a complete hard lock like that without a BSOD is vcore or power related, unless the #THERMTRIP flag gets set at 115C. Did you check your +12v line?

Try disabling spread spectrum, C-state, and EIST AND overspeed protection. Then we can see if possibly one of those settings is causing a problem.
And why was MSI afterburner running? If you're trying to stress test the CPU and mainboard, you don't need AB running.
Just noticed Kombustor was running. OK, it looks like you simply exceeded the power draw somewhere, or it might be related to the video card. Try another test with ONLY Prime95 running. I bet it won't lock up....
 
The system was running at *stock* when it crashed? 3.3 ghz and 1.2 vcore?

Looks like something set my multiplier up to 41.5x. It might have been that I pressed the OC Genie button on this board once when the BIOS wasn't at default. Although I didn't start the machine with the button enabled, it seems to have possibly messed with the system and allowed overclocking. I don't think that was the issue though. Some people have suggested elsewhere that it was tied to my RAM being set to XMP. They say that having the RAM at 1600MHz actually overclocks the processor's memory controller, which is rated at 1333MHz stock. I had left the voltage on the processor fixed at 1.2v.


Why were you stress testing a stock system, anyway? That seems a bit bizarre, unless you suspect something is defective.... most of us stress overclocked systems.

This will be a mission-critical system. So I need to run through a few 24-hour stress tests where the GPU and CPU are maxed out to see if it can be called fully stable.


"Usually" a complete hard lock like that without a BSOD is vcore or power related, unless the #THERMTRIP flag gets set at 115C. Did you check your +12v line?
The BIOS shows it at around 12.14v. I didn't have an in-O/S software app to show me the voltages on my rails. Know a good one to use with the i5 processors? I don't have a multimeter. I doubt the heat got up to 115C. I saw the temperature reported and it was at a smooth 62C maxed.

Try disabling spread spectrum, C-state, and EIST AND overspeed protection. Then we can see if possibly one of those settings is causing a problem.
OK, I'll give that a shot! Others have asked that I turn off RAM XMP, too.

And why was MSI afterburner running? If you're trying to stress test the CPU and mainboard, you don't need AB running. Just noticed Kombustor was running. OK, it looks like you simply exceeded the power draw somewhere, or it might be related to the video card. Try another test with ONLY Prime95 running. I bet it won't lock up....

It may not lock up that way, but I need the two main processors in the machine working at or near 100% for 24 hours before I can call it stable.


LAST THING: THANKS A LOT! I appreciate your involvement and suggestions.
 
The first thing to do is to determine whether it is CPU, RAM or GPU related. In my experience most problems are GPU related, with defective RAM in second place. Just run one benchmark at a time and see if it still locks up. A GTX 580 and a SNB processor, both at stock clocks, should not exceed an 850W PSU's specs.
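
A rough back-of-the-envelope version of that power check, as a sketch only. The CPU figure is the 118W short duration power limit from the BIOS list in the first post. The 244W GPU figure is NVIDIA's reference GTX 580 TDP, so a factory-overclocked card under Kombustor may well draw more. The 100W for everything else (board, drives, fans, pump) is purely an assumed placeholder.

```python
PSU_WATTS = 850  # Corsair CMPSU-850TX rating from the hardware list

# Estimated worst-case draw per component, in watts.
loads = {
    "cpu_short_duration_limit": 118,  # from the BIOS settings in the first post
    "gpu_reference_tdp": 244,         # reference GTX 580 TDP; an OC card may exceed it
    "rest_of_system": 100,            # assumption: board, drives, fans, pump
}


def power_headroom(psu_watts, loads):
    """Return (total draw, remaining watts, fraction of PSU rating used)."""
    total = sum(loads.values())
    return total, psu_watts - total, total / psu_watts


if __name__ == "__main__":
    total, spare, used = power_headroom(PSU_WATTS, loads)
    print(f"Estimated draw: {total} W, headroom: {spare} W ({used:.0%} of rating)")
```

Even with generous padding the estimate stays far below the PSU rating, but it says nothing about how the load is spread across rails or whether the unit itself is healthy.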
 
Yeah, we understand you want it stable with all components under load... But if you want to rule stuff out you also need to test individually. I'm setting up my own MSI GD55 this weekend btw, I'll share my results.
 
Yeah, we understand you want it stable with all components under load... But if you want to rule stuff out you also need to test individually. I'm setting up my own MSI GD55 this weekend btw, I'll share my results.

^^This.
Always test each component individually first.
Once you are sure each part has passed your expectations, you can combine them. That way, if something fails, you know where to start troubleshooting.
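
Written out as a plan, that order looks something like the sketch below. The tool names come from this thread, but the function name and the hour counts are only placeholders, not recommendations.

```python
def build_test_plan(tests, hours_each=8, hours_combined=24):
    """Return a list of (tools to run together, hours) stages.

    Each test runs alone first; the combined run comes last.
    The hour values are hypothetical defaults.
    """
    plan = [([tool], hours_each) for tool in tests]
    plan.append((list(tests), hours_combined))
    return plan


if __name__ == "__main__":
    tests = ["MemTest86+ (RAM)", "Prime95 Blend (CPU/RAM)", "Kombustor (GPU)"]
    for stage, (tools, hours) in enumerate(build_test_plan(tests), start=1):
        print(f"Stage {stage}: {' + '.join(tools)} for {hours} h")
```

If a stage fails, the parts under test in that stage are where to start looking, and there is no point moving on to the combined run until it passes.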
 
I'd first try testing with MemTest86/MemTest86+ and Gold Memory overnight.

Slowing the memory way down and bumping its voltage to 1.65V should help if there's a memory problem, but if it does, it proves that the memory is junk. Something is wrong with memory with a recommended voltage higher than 1.5V.
 
Update: At some point during my build, I pressed the MSI OC Genie button (a motherboard push button). I didn't leave it on; I just turned it off again, and I did that with the power off. Somehow though, it must have caused the BIOS to think it was supposed to overclock the system.

So I reset the CMOS, then disabled spread spectrum, C-state, EIST AND overspeed protection (as suggested above), and set XMP to disabled. I also noticed that AUTO for the DRAM voltage was setting these G.SKILL sticks to 1.488v, which is not spec. The BIOS didn't offer 1.5v flat, so I set the voltage to 1.507v. I then ran a Prime95 blend test (lots of RAM) for four hours. When I saw that worked, I rebooted, went into the BIOS and turned XMP back on, which set the RAM frequency back up to 1600MHz. I ran Memtest for 8 hours with no problems.

The processor was no longer operating at a 41.5x multiplier under load, so I loaded up Windows and ran my combined Kombustor burn-in (OpenGL, 1600x900 window, 16xAA) along with Prime95 for 9 hours. It went one hour further than when I had the lock up.

Someone mentioned there was a BIOS out there newer than mine, but it's in beta, so I thought I'd avoid it. Anyhow, grabbing a little from all the responses seems to have solved my issue. I'll have to see how I can overclock the CPU now and get back to testing at or around 4GHz, which is where I want to be. But I think this time it was a case of the RAM being underfed at 1.488v, and maybe spread spectrum being enabled or something.

Thanks for all the responses.
 
Glad you got it working. And you should be fine at 40x100 with a vcore of 1.25v, with or without loadline calibration enabled (droop control on msi boards).
 