Nehalem Clocks and Turbo Mode

virtualrain

I recently blogged this on Nehalem News and I'm interested in your thoughts...

A big question on everyone's mind is what CPU frequency initial Nehalem parts will be clocked at, and what overclocking headroom might exist.

Although a demo machine was recently reported to be running Nehalem at 3.2GHz, we can't be positive based on the evidence provided that this early sample was, in fact, running at that speed. However, it's not unreasonable for Bloomfield to launch at 3GHz speeds given that Penryn's highest binned parts are shipping at 3.2GHz. This is supported, in part, by the fact that Nehalem's 731 million transistors compare favorably to Penryn's 820 million (both at 45nm).

It seems plausible that Nehalem should clock just as well as Penryn at the same Thermal Design Power (TDP) with the same cooling solution. The big unknown is what effect the on-die integrated memory controller (IMC) will have on clock speed limitations.

Another interesting aspect of Nehalem is the reports of a "Turbo Mode". While published details are hard to come by, this dynamic core clocking capability is illustrated in the slide below (courtesy of HKEPC). It appears to be an extension of the Performance States (P-states) of the Advanced Configuration and Power Interface (ACPI) specification, which Intel implements as SpeedStep technology. It suggests that when load allows one or more cores to be throttled down into a low frequency mode (LFM), the remaining active, loaded cores can actually be overclocked to higher than default clock frequencies as long as the default TDP is not exceeded.

[Image: slide illustrating Nehalem's Turbo Mode]

(source: HKEPC)
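
To make the idea on the slide concrete, here is a toy model of a TDP-limited turbo. Everything in it is an assumption for illustration only: the 130 W limit, the 20 W uncore share, the 2 W parked-core draw, the x16 default multiplier, the +2 cap, and the simplification that core power scales linearly with frequency at fixed voltage. None of these figures come from Intel or from the slide.

```python
# Toy model of a TDP-limited "Turbo Mode". All numbers are made up for
# illustration; none come from Intel or the HKEPC slide.

TDP_WATTS = 130.0        # assumed package power limit
BASE_MULTI = 16          # assumed default multiplier
REF_CLOCK_MHZ = 200      # assumed reference clock
CORES = 4
IDLE_CORE_WATTS = 2.0    # assumed draw of a core parked in LFM
UNCORE_WATTS = 20.0      # assumed draw of cache, memory controller, etc.


def active_core_watts(multi):
    """Power of one loaded core, assuming it scales linearly with
    frequency at fixed voltage (a simplification)."""
    per_core_at_base = (TDP_WATTS - UNCORE_WATTS) / CORES
    return per_core_at_base * multi / BASE_MULTI


def package_watts(active, multi):
    """Total package power with `active` loaded cores at `multi`."""
    idle = CORES - active
    return UNCORE_WATTS + active * active_core_watts(multi) + idle * IDLE_CORE_WATTS


def max_turbo_multi(active, cap=BASE_MULTI + 2):
    """Highest multiplier, up to an assumed cap, that stays within TDP."""
    multi = BASE_MULTI
    while multi < cap and package_watts(active, multi + 1) <= TDP_WATTS:
        multi += 1
    return multi


for active in range(1, CORES + 1):
    m = max_turbo_multi(active)
    print(f"{active} active core(s): x{m} -> {REF_CLOCK_MHZ * m} MHz, "
          f"{package_watts(active, m):.1f} W")
```

In this model, four loaded cores already use the whole budget at the default multiplier, so only parking a core frees up room for the rest to step up. How the real chip measures or estimates that headroom is exactly the open question.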

While this is an exciting development, unfortunately it raises more questions than it answers. For example, it's not clear if this feature will work with overclocked chips or only those running at default clocks and multipliers. It's also not clear what events or conditions trigger the LFM for the unutilized cores and, similarly, what events or conditions trigger the overclocked P-states of the active cores. Finally, there's no insight into how much control the BIOS, OS, or end-user will have over this capability.

We can only hope that Nehalem's Turbo Mode follows the current SpeedStep implementation, which can be managed at the BIOS and/or operating system level to allow modifying the multiplier in response to CPU load. That is, if one overclocks their reference clock from default, this feature will ideally manage core clocks by adjusting the multis up or down as load dictates, thus allowing overclockers to benefit from this feature as much as factory clocked systems.

If this feature does support overclocking, it will add another dimension to stability testing, as a stable operating point for all 4 cores will also have to consider the "Turbo Mode" state.

Needless to say, this kind of dynamic overclocking, with balanced performance across multiple cores for multi-threaded apps and maximum clock speed for single-threaded programs, all with the same cooling solution, effectively lets one "have their cake and eat it too"!

Thoughts?

Cheers,
-Chris.
 
Quick question: what do you do when Nehalem's out, rename yourself sandtigernews.com or geshernews.com? Had to ask.

Current Penryn clocks are based on the fact that AMD can't respond. We've all seen the overclocks; you know that means Intel is hitting massive yield with the current 45nm generation. They could easily release a 3.6 or 3.4GHz SKU if need be, but there is no need, so why burden yourself with it? At 3.2GHz the Intel Core 2 is the fastest processor on the planet, even to the vast majority of consumers who don't know about the difference between an AMD clock and an Intel clock. I once had a dumbass practically demand a 3.6GHz P4 because, according to him, it was the fastest processor ever made :p.

And the only thing I can say is wait and see. With GPUs, because they're so massively parallel, the content is easier to understand and hence better suited to forums. With CPUs, with a current maximum of 6 parallel components, it's harder for us to wrap our heads around it. How dependent on that point-to-point bus (I REFUSE to call it QuickPath. That's such a very, very dumb name. Do you know how much more in-the-know you seem if you use the term "front side bus" over "QuickPath"? If I had my way I'd still refer to the L2 as on the back side bus.)... erm, sorry, yeah, how dependent on the bus is the final CPU core speed? If we're looking at something similar to an AMD-based system, overclocking Nehalem is going to require quite a wicked good chipset, something I don't think the infancy of the new architecture is likely to yield.

Furthermore, I think Intel's gotten themselves in pretty deep with an integrated memory controller and tri-channel RAM. Dunno how well the memory bus is going to clock with something that new.

Just going to have to wait and see.
 
I always wondered if the new thing is just having lots of cores on your processor. I mean it started out with the clock speed being the measure of performance, and then it was the FSB (largely) and other methods to make the clock cycles count more, and I think the next thing is going to be having more cores.

A bit more "on subject", I just think Intel is finally realizing that there aren't that many uses for so many cores yet. I'm just waiting for mainstream programs to all be massively multithreaded and then Intel can go ahead and make as many cores as they want. That being said, I think while the "Turbo Mode" is a nice addition now, I hope it's utterly useless in a couple years.
 
Furthermore, I think Intel's gotten themselves in pretty deep with an integrated memory controller and tri-channel RAM. Dunno how well the memory bus is going to clock with something that new.

Just going to have to wait and see.

Intel has been making memory controllers as long as CPUs... the fact that it's now on the same piece of silicon as the CPU shouldn't adversely affect its performance... in fact, quite the opposite. Now you don't need to contend with GTL biasing between entities... it should be a lot easier to overclock a Nehalem with the right cooling.
 
IMO, given the description of this function, it sounds like it won't have any bearing on the overclocking community. You'd have to be overly optimistic to think that Intel is going to go through the effort of making this work for overclockers. Even if they did, I just don't see how it would possibly work given that the TDP is the lowest common denominator here. Any decent OC is going to throw the default TDP of a chip right out the door. And according to the description, it only works so long as the TDP is not surpassed.
 
Overclocking is a byproduct of R&D whether you want it to be or not. Intel wants the yield as high as possible, which leaves a gap between a processor's rated speed and its maximum speed.
 
IMO, given the description of this function, it sounds like it won't have any bearing on the overclocking community. You'd have to be overly optimistic to think that Intel is going to go through the effort of making this work for overclockers. Even if they did, I just don't see how it would possibly work given that the TDP is the lowest common denominator here. Any decent OC is going to throw the default TDP of a chip right out the door. And according to the description, it only works so long as the TDP is not surpassed.

You may have a point... but we don't know how sophisticated the limit will be on this Turbo Mode. It may simply be that it will allow a default x9 multi to switch up to x12 on one core if the others are in LFM. For example, you may have your rig overclocked to, say, 400x9 (3.6GHz), and then when two or more cores are idle, they can be shut down and the remaining two loaded cores simply bumped to a x11 multi (400x11 = 4.4GHz). That would be pretty advantageous to overclockers who may want the best of both worlds... max clock speed on single-threaded apps as well as max clocks on all cores for multi-threaded.

This changing of multis is exactly how SpeedStep works today... except if the whole CPU is idle, the multi drops to x6. SpeedStep still works fine on overclocked CPUs... in fact it can be a real bonus for people who value silence for 24/7 operation. When the CPU is idle, it will work at 400x6 (following the example above) and then ramp up to 400x9 under load.
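
Spelled out as a quick sketch, the multiplier switching in this example looks something like the following. The 400MHz reference clock and the x6/x9/x11 multis are the figures from the example above; `pick_multi()` and its "two or more idle cores" rule are hypothetical, just a guess at how such a policy might behave, not Intel's actual logic.

```python
# Sketch of the multiplier switching from the example above.
# pick_multi() and its rule are hypothetical, not Intel's actual logic.

REF_CLOCK_MHZ = 400                       # overclocked reference clock from the example
IDLE_MULTI, DEFAULT_MULTI, TURBO_MULTI = 6, 9, 11


def pick_multi(loaded_cores, total_cores=4):
    """Return the multiplier the loaded cores would run at."""
    if loaded_cores == 0:
        return IDLE_MULTI                 # whole CPU idle: SpeedStep drops the multi
    if total_cores - loaded_cores >= 2:
        return TURBO_MULTI                # two or more cores idle: guessed turbo bump
    return DEFAULT_MULTI                  # most cores busy: stay at the default multi


for loaded in range(5):
    multi = pick_multi(loaded)
    ghz = REF_CLOCK_MHZ * multi / 1000
    print(f"{loaded} loaded core(s): {REF_CLOCK_MHZ}x{multi} = {ghz:.1f} GHz")
```

The point is only that the reference clock stays put and the multi does all the moving, which is why SpeedStep keeps working on an overclocked chip.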
 
This Turbo Mode is very similar, if not the same thing, as Dynamic Acceleration in the Santa Rosa mobile line.
 
Could be a whole new approach with Intel.

They sell you a binned part rated for X GHz. That is the minimum speed you get. Using a version of SpeedStep, the processor ramps up to the maximum speed it can reach under TDP: an automatic OC.

Basically you won't have to OC the part; it does it for you, safely and within specs. You are paying for the minimum level of performance, not the maximum as we have it now.
 
Maybe it is this feature that will block overclocking in lower-end Nehalems? And the high end will be possible to OC only by an open multiplier?
 
Thanks... that's exactly the same concept as the Nehalem Turbo Mode I wrote about above. Now the question is, how well does it work in practise? ;)
Users have reported a performance improvement in single-threaded CPU benchmarks (SuperPi, etc.). In a real-world application you're not likely to notice the difference; the multiplier is only incremented by 1, so a 200MHz increase in core clock is all you're looking at.
 
Could be a whole new approach with Intel.

They sell you a binned part rated for X GHz. That is the minimum speed you get. Using a version of SpeedStep, the processor ramps up to the maximum speed it can reach under TDP: an automatic OC.

Basically you won't have to OC the part; it does it for you, safely and within specs. You are paying for the minimum level of performance, not the maximum as we have it now.

I still think you'll end up OC'ing it. It won't "turbo" itself up THAT much, and there will always be those that want to push it even harder. Besides, as far as I understand it, isn't Intel going to only step it up by so much, regardless of whether or not it *can* go faster?
 
I myself have not ventured into the overclocking world (yet), but I would not like the fact that I CAN'T do it because my CPU was neutered and only the Nehalem Xtreme! can do it. :mad:
 
I still think you'll end up OC'ing it. It won't "turbo" itself up THAT much, and there will always be those that want to push it even harder. Besides, as far as I understand it, isn't Intel going to only step it up by so much, regardless of whether or not it *can* go faster?

The way it reads is that the CPU will overclock one or two cores for single-threaded apps when it can shut down one or two cores in order to keep the TDP within spec. So, yes, it will "turbo" itself up under certain circumstances.

What the other guy was pointing out is that Intel will have to validate the CPU so that all cores can run in the "turbo" mode, thus ensuring that every processor comes with some level of built-in overclocking headroom by design.

How much headroom will there be? It may only be a x1 increase in multi... not much but that's better than nothing.

Of course, everyone with good aftermarket cooling will immediately be able to boost every core to this "turbo" speed on a 24/7 basis. The question I have is: once overclocked, will this "turbo" feature still work? For example, if my factory clock is 200x16 (3.2GHz) and it has this turbo feature, then it should be a no-brainer to overclock it to 3.4GHz since each core was validated to run at that speed anyway. So let's say you do this and run it at 212x16... will you then get 212x17 in turbo mode on one or two cores when running single-threaded apps without any increase in heat dissipation? What if you run it overclocked at 3.6GHz (225x16)? Will you then get a couple of cores running at 225x17 (3.825GHz) in turbo mode? If so... sweet! :cool:
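
For anyone who wants to check the numbers, here is the clock arithmetic for the three cases above as a short sketch. The x16 default multi and the reference clocks are from the example; the +1 turbo step is only the guess made in this thread, not a confirmed spec, and `clocks_ghz()` is just an illustrative helper.

```python
# Clock arithmetic for the three cases above. The +1 turbo step is the
# guess from this thread, not a confirmed spec.

BASE_MULTI = 16
TURBO_STEP = 1  # assumed size of the turbo multiplier bump


def clocks_ghz(ref_clock_mhz):
    """Return (default clock, turbo clock) in GHz for a reference clock."""
    base = ref_clock_mhz * BASE_MULTI / 1000
    turbo = ref_clock_mhz * (BASE_MULTI + TURBO_STEP) / 1000
    return base, turbo


for ref in (200, 212, 225):
    base, turbo = clocks_ghz(ref)
    gain = (turbo / base - 1) * 100
    print(f"{ref}x{BASE_MULTI} = {base:.3f} GHz, "
          f"turbo {ref}x{BASE_MULTI + TURBO_STEP} = {turbo:.3f} GHz (+{gain:.2f}%)")
```

Because the turbo bump is a fixed multiplier step, the relative gain is the same 1/16 (6.25%) at every reference clock; only the absolute MHz gained grows with the overclock.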
 