AMD Gives More Zen Details: Ryzen, 3.4 GHz+, NVMe, Neural Net Prediction, & 25 MHz Boost Steps
by Ian Cutress on December 13, 2016 4:00 PM EST
In the eternal battle to drive more details out of AMD ahead of the full launch of its new Zen microarchitecture-based CPUs, AMD is today lifting the lid on some new features that will be part of the product launch, in order to whet the appetite (and perhaps appease the hype train). We now have new details on the brand naming, some platform details, and a high-level overview of the key points that will be promoted when it comes to market.
We’ve covered a lot of Zen, from the initial announcement to some of the microarchitecture details at Hot Chips through to discussing the utility of singular benchmark data and then what might be happening on the server side through a detailed analysis of motherboards on display. A lot of us want it out already, and when it does, it will come out under the brand ‘Ryzen’.
Ryzen and AM4
It is pronounced ‘Rye-zen’, not ‘Riz-zen’, to clarify.
As expected, there will be several SKUs in the brand, although AMD is not releasing many details aside from the cache arrangement of the 8-core, 16-thread chip (which we already knew was 4 MB of L2 + [8+8] MB of L3 victim cache), and that the base clock for the high-end SKU will be at least 3.4 GHz. The fact that AMD says ‘at least’ suggests that they are still deciding exactly what to do here, although a similar thing was said leading up to the launch of Polaris-based RX cards (though that’s a different department).
We know that Ryzen will use the AM4 platform, shared with the previous-generation Bristol Ridge, which remains an OEM-only product for now. We’ve gone into detail about how AM4 will operate, using a split IO design between the CPU and the chipset such that, for minimal function, a chipset is not needed. However, AMD has pointed out that with Ryzen, AM4 with the right chipset will support USB 3.1 Gen 2 (10 Gbps), NVMe SSDs, SATA-Express, and offer ‘ultimate upgradability’. The latter point may indicate that the Ryzen-based chipsets will offer numerous PCIe lanes, similar to what Intel does on the 100-series. That said, Intel has been developing that feature over years, and the Bristol Ridge chipsets for AM4 that have been announced already are not quite up to par with that, so it will be interesting to see.
We’re still waiting for detailed information on PCIe lane counts on Ryzen, how big that micro-op cache is in the core, if the L3 victim cache has limitations, how good the DDR4 controller is, power consumption, and what exactly the single core performance / IPC level is. AMD did, however, go into more detail in a few of these areas.
Power, Performance and Pre-Fetch: AMD SenseMI
Part of the demo in the pre-brief was a Handbrake video transcode, a multithreaded test, showing a near-identical completion time between a high-frequency Ryzen without turbo and an i7-6900K at similar frequencies. This mirrors the Blender test we saw back in August, this time using a different benchmark that is still multi-threaded. AMD also fired up some power meters, showing that Ryzen power consumption in this test was a few watts lower than the Intel part, implying that AMD is meeting its targets for power, performance and, as a result, efficiency. The 40%+ improvement in IPC/efficiency is still being thrown around, and AMD seems confident that this target has been surpassed.
To that end, at the pre-briefing, Ryan was shown two systems running Titan X graphics cards in SLI and Battlefield 1 at 4K settings: one system was running Ryzen, and the other an i7-6900K (the 8-core Broadwell-E chip). Ryan was unable to determine an obvious visual difference between the two frame-rate wise, which was the point of the demo.
Mark Papermaster, CTO of AMD, explained during our briefing that during the Zen design stages, up to 300 engineers were working on the core engine with an aggressive mantra of higher IPC for no power gain. This is not an uncommon strategy for core designs. Part of this will be down to two new power modes, that adjust and extend the power/frequency curve, which are part of AMD’s new 5-stage ‘SenseMI’ technology.
SenseMI Stage 1: Pure Power
A number of recent microprocessor launches have revolved around silicon-optimized power profiles. We are now removed from the ‘one DVFS curve fits all’ application for high-end silicon, and AMD’s solution in Ryzen will be called Pure Power. The short explanation is that distributed embedded sensors in the design (first introduced in bulk with Carrizo) monitor temperature, speed and voltage, so that the control center can manage the power consumption in real time. The glue behind this technology comes in the form of AMD’s new ‘Infinity Fabric’.
‘What is this new Infinity Fabric?’ I hear you say. It was only explained in the sense that it provides control, and that through the Infinity System Management Unit it can adjust power consumption while keeping in mind everything else that’s happening. The fact that it’s described as a fabric suggests that it goes through the entire processor, connecting various parts together as part of that control. Whether this is something wildly different to what we saw in Carrizo, aside from being the next-gen power adjustment and under a new name, is hard to determine at this point, but we are probing for more details.
The upshot of Pure Power is that the DVFS curve is lower and more optimized for a given piece of silicon than a generic DVFS curve, which results in lower power at various/all levels of performance. This in turn benefits the next part of SenseMI, Precision Boost.
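To put a rough number on why a lower voltage/frequency curve matters, here is a minimal sketch using the standard first-order approximation for dynamic CMOS power, P ≈ C · V² · f. The capacitance and voltage values are made-up illustrative assumptions, not figures from AMD, and the function name is hypothetical.

```python
def dynamic_power_watts(capacitance_farads, voltage_volts, frequency_hz):
    """First-order dynamic power estimate: P = C * V^2 * f."""
    return capacitance_farads * voltage_volts ** 2 * frequency_hz


# Hypothetical values chosen only for illustration (not AMD data).
SWITCHED_CAPACITANCE = 1e-8   # farads, assumed
FREQUENCY = 3.4e9             # 3.4 GHz, the base clock floor quoted in the article

# A "generic" curve that needs 1.20 V at this frequency versus a
# per-chip tuned curve that only needs 1.15 V (both voltages assumed).
generic = dynamic_power_watts(SWITCHED_CAPACITANCE, 1.20, FREQUENCY)
tuned = dynamic_power_watts(SWITCHED_CAPACITANCE, 1.15, FREQUENCY)
saving_percent = 100 * (1 - tuned / generic)

print(f"generic curve: {generic:.2f} W")
print(f"tuned curve:   {tuned:.2f} W")
print(f"saving:        {saving_percent:.1f} %")
```

Because voltage enters as a square, even a small per-chip reduction in required voltage gives a disproportionate power saving at the same frequency, which is headroom a boost mechanism can then spend.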
SenseMI Stage 2: Precision Boost
For almost a decade now, most commercial PC processors have invoked some form of boost technology to enable processors to use less power when idle and fully take advantage of the power budget when only a few elements of the core design are needed. We see processors that sit at 2.2 GHz that boost to 2.7 GHz when only one thread is needed, for example, because the whole chip still remains under the power limit. AMD is implementing Precision Boost for Ryzen, raising the DVFS curve for better performance thanks to Pure Power, but also offering frequency jumps in 25 MHz steps, which is new.
Precision Boost relies on the same Infinity Control Fabric that Pure Power does, but allows for adjustments of core frequency based on performance requirements and suitability/power given the rest of the core. The fact that it offers 25 MHz steps is surprising, however.
Current turbo control systems, on both AMD and Intel, work by adjusting the CPU frequency multiplier. With the 100 MHz base clock on all modern CPUs, one step in the frequency multiplier gives a 100 MHz jump for the turbo modes, and only whole-number multipliers can be used.
With AMD moving to 25 MHz jumps in their turbo, this means either:
- The base frequency has reduced down to 25 MHz and AMD is able to implement a 136x multiplier to reach 3.4 GHz, or
- AMD can implement fractional multipliers, similar to how processors in the early 2000s were able to negotiate 0.5x multiplier jumps, or
- Precision Boost only applies to internal clocks that the user doesn’t see or control, but can assist with performance.
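The arithmetic behind the first two options can be checked with a small sketch. The function names are hypothetical, and the 0.25x multiplier step is an assumption chosen to yield 25 MHz granularity on a 100 MHz base clock; AMD has not said which mechanism it uses.

```python
def multiplier_needed(target_mhz, base_clock_mhz):
    """Multiplier required to reach target_mhz from a given base clock."""
    return target_mhz / base_clock_mhz


def frequency_step_mhz(base_clock_mhz, multiplier_step):
    """Smallest frequency jump available for a base clock and multiplier step."""
    return base_clock_mhz * multiplier_step


# Conventional scheme: 100 MHz base clock, whole-number multipliers.
print(multiplier_needed(3400, 100), frequency_step_mhz(100, 1))

# Option 1: 25 MHz base clock, whole-number multipliers.
print(multiplier_needed(3400, 25), frequency_step_mhz(25, 1))

# Option 2: 100 MHz base clock with (assumed) 0.25x fractional multipliers.
print(multiplier_needed(3425, 100), frequency_step_mhz(100, 0.25))
```

Either route gives the same 25 MHz granularity; the difference is whether the change sits in the reference clock or in the multiplier logic.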
Without additional information, the second point in that list seems more in line with what would be possible. If we consider that Zen’s original chief designer was Jim Keller (and his team), known for a number of older generations of AMD processors, a similar technology might be in play here. If/when we get more information on it, we will let you know.
SenseMI Stage 3: Extended Frequency Range (XFR)
The main marketing points of on-the-fly frequency adjustment are typically down to low idle power and higher performance when needed. The current processors on the market have rated speeds on the box which are fixed frequency settings that can be chosen by the processor/OS depending on what level of performance is possible/required. AMD’s new XFR mode seems to do away with this, offering what sounds like an unlimited bound on performance.
The concept here is that, beyond the rated turbo mode, if there is sufficient cooling then the CPU will continue to increase the clock speed and voltage until a cooling limit is reached. This is somewhat murky territory, though AMD claims that a multitude of different environments can be catered for by the feature. AMD was not clear if this limit is determined by power consumption, temperature, or if they can protect from issues such as a bad frequency/voltage setting.
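As a purely illustrative toy model of that idea, the loop below keeps stepping the clock up past the rated turbo until a temperature limit would be exceeded. Everything here is an assumption: the function name, the 3600 MHz rated turbo, the 76 °C limit, the linear thermal models and the use of temperature (rather than power) as the limit are all invented for the sketch, since AMD has not described the actual mechanism.

```python
def xfr_clock_mhz(rated_turbo_mhz, cooling_limit_c, temp_at_clock,
                  step_mhz=25, max_clock_mhz=5000):
    """Raise the clock in fixed steps while the predicted temperature stays
    at or below the limit. max_clock_mhz is a safety cap so the loop always ends."""
    clock = rated_turbo_mhz
    while (clock + step_mhz <= max_clock_mhz
           and temp_at_clock(clock + step_mhz) <= cooling_limit_c):
        clock += step_mhz
    return clock


# Hypothetical linear thermal models: temperature in Celsius as a function of MHz.
def stock_cooler(mhz):
    return 40 + (mhz - 3000) * 0.05


def big_cooler(mhz):
    return 30 + (mhz - 3000) * 0.05


stock_result = xfr_clock_mhz(3600, 76, stock_cooler)
big_result = xfr_clock_mhz(3600, 76, big_cooler)
print(stock_result, big_result)
```

With these made-up numbers the better cooler ends up at a higher clock than the stock one, which is the behaviour the description implies: the same chip settles at different frequencies depending on the cooling it is given.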
By the sounds of it, this is a dynamic adjustment rather than just another embedded look-up table such as P-states. AMD states that XFR is a fully automated system with no user intervention, although I suspect it will still have an on/off switch in the BIOS. It also somewhat negates overclocking if your cooling can support it, which then brings up the issue for overclocking in general: casual users may not ever need to step into the overclocking world if the CPU does it all automatically.
I imagine that a manual overclock will still be king, especially for extreme overclockers competing with liquid nitrogen, as being able to personally fine tune a system might be better than letting the system do it itself. It can especially be true in those circumstances, as sensors on hardware can fail, report the wrong temperature, or may only be calibrated within a certain range.
It does raise the question as to how overclockable Ryzen will be, how many SKUs will be unlocked, or if XFR may only be on certain processors. As the Zen microarchitecture is destined for server and mobile as well, XFR will have different connotations for both of those markets (some of which might not be welcome).
SenseMI Stage 4+5: Neural Net Prediction and Smart Prefetch
Every generation of CPUs from the big companies comes with promises of better prediction and better pre-fetch models. These are both important to hide latency within a core which might be created by instruction decode, queuing, or more usually, moving data between caches and main memory to be ready for the instructions. With Ryzen, AMD is introducing its new Neural Net Prediction hardware model along with Smart Pre-Fetch.
AMD is announcing this as a ‘true artificial network inside every Zen processor that builds a model of decisions based on software execution’. This can mean one of several things, ranging from actual physical modelling of instruction workflow to identify critical paths to be accelerated (unlikely) to statistical analysis of what is coming through the engine, attempting to do work during downtime that might accelerate future instructions (such as inserting an instruction to decode into an idle decoder in preparation for when it actually comes through, so that it ends up using the micro-op cache and is quicker).
Modern processors already do a decent job when the work is repetitive, such as identifying when every 4th element in a memory array is being accessed, and can pull that data in earlier to be ready in case it is used. The danger of smart predictors, however, is being overly aggressive: pulling in so much data that old data might be ditched for data that is never used (over prediction), pulling in so much data that it’s already evicted by the time the data is needed (aggressive prediction), or simply wasting excess power with bad predictions (stupid prediction…).
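The ‘every 4th element’ case is the classic job of a stride prefetcher. The sketch below is a generic textbook-style model, not AMD’s Smart Prefetch; the class name and the two-confirmation threshold are assumptions made for illustration. Requiring a stride to repeat before acting is one simple guard against the over-aggressive behaviour described above.

```python
class StridePrefetcher:
    """Toy stride detector: once the same non-zero stride has been confirmed
    enough times in a row, suggest the next address to fetch."""

    def __init__(self, confirmations_needed=2):
        self.last_address = None
        self.last_stride = None
        self.confidence = 0
        self.confirmations_needed = confirmations_needed

    def access(self, address):
        """Record a demand access; return an address to prefetch, or None."""
        prediction = None
        if self.last_address is not None:
            stride = address - self.last_address
            if stride != 0 and stride == self.last_stride:
                self.confidence += 1   # same stride seen again
            else:
                self.confidence = 0    # pattern broken, start over
            self.last_stride = stride
            if self.confidence >= self.confirmations_needed:
                prediction = address + stride
        self.last_address = address
        return prediction


# Regular pattern: every 4th element of an array.
regular = StridePrefetcher()
regular_predictions = [regular.access(a) for a in (0, 4, 8, 12, 16)]

# Irregular pattern: no stable stride, so nothing should be prefetched.
irregular = StridePrefetcher()
irregular_predictions = [irregular.access(a) for a in (0, 7, 3, 20)]

print(regular_predictions)
print(irregular_predictions)
```

Lowering `confirmations_needed` makes the prefetcher kick in sooner but also makes it more likely to fetch data that is never used, which is the aggression trade-off in miniature.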
AMD is stating that Zen implements algorithm learning models for both instruction prediction and prefetch, and it will no doubt be interesting to see whether they have found the right balance of prefetch aggression and extra work in prediction.
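AMD has not said what its learning model looks like. One published technique that fits the ‘neural net’ label is the perceptron branch predictor, and the sketch below shows that general idea only, as an assumption about what such hardware could resemble. The class name, history length and training threshold are arbitrary illustrative choices, and a real design would keep a table of perceptrons indexed by branch address rather than the single one used here.

```python
class PerceptronPredictor:
    """Single-perceptron branch predictor over a global history register."""

    def __init__(self, history_length=8, threshold=16):
        self.weights = [0] * (history_length + 1)   # weights[0] is the bias
        self.history = [1] * history_length         # +1 = taken, -1 = not taken
        self.threshold = threshold

    def _output(self):
        return self.weights[0] + sum(
            w * h for w, h in zip(self.weights[1:], self.history))

    def predict(self):
        """Predict taken when the weighted sum is non-negative."""
        return self._output() >= 0

    def update(self, taken):
        """Train on the real outcome, then shift it into the history."""
        outcome = 1 if taken else -1
        output = self._output()
        mispredicted = (output >= 0) != taken
        # Train on a misprediction or when confidence is still low.
        if mispredicted or abs(output) <= self.threshold:
            self.weights[0] += outcome
            for i, h in enumerate(self.history):
                self.weights[i + 1] += outcome * h
        self.history = self.history[1:] + [outcome]


def count_mispredictions(predictor, outcomes):
    """Run the predictor over a sequence of outcomes and count the misses."""
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


# A branch that strictly alternates taken / not taken.
alternating = [i % 2 == 0 for i in range(200)]
predictor = PerceptronPredictor()
early_misses = count_mispredictions(predictor, alternating[:100])
late_misses = count_mispredictions(predictor, alternating[100:])
print(early_misses, late_misses)
```

The predictor makes its mistakes while the weights are still being learned and then settles on the pattern, which is the ‘builds a model of decisions based on software execution’ behaviour in its simplest form.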
It is worth noting here that AMD will likely draw upon the increased L3 bandwidth in the new core as a key element to assisting the prefetch, especially as the shared L3 cache is a victim cache and designed to contain data already used/evicted to be used again at a later date.
More Details
AMD did confirm that the launch for Ryzen is still Q1, and Naples (the server counterpart for the Zen microarchitecture) is still on for Q2.
Today, AMD is putting on a Livestream called ‘New Horizon’, where all this information is being formally released. I’m at the event live, hopefully running a live blog, and I will try to get some extra time with an engineer that walks by and wants to chat. I want to get more information on the Infinity Fabric, the Neural Net Predictor and chipset integration.
170 Comments
Samus - Tuesday, December 13, 2016 - link
That's a good point on TDP. Even your i7-920 being 130w TDP generally idles around 20w and won't use much more doing basic media playback. The platform (x58) however, is a power hog no matter what the situation, a key factor people generally ignore. AMD platforms based on FM sockets have been power whores since their introduction (AM moved most of the platform functions onto the CPU where things are more efficient) and regarding AM, up until AM3, there were a lot of auxiliary chips (non-native USB 3.0, for instance) that used power.
I'm excited for Zen. Mostly because it's been a decade since AMD had anything really competitive in performance per watt. I know it doesn't matter to some people but it matters to me, because as you said, less heat, less noise, higher threshold for oc.
Nagorak - Wednesday, December 14, 2016 - link
Yeah, there's this strange concept called shutting down when not in use. It'd probably save you big on your power bill, increase the longevity of your components and pollute the environment less too. A modern OS with an SSD can boot up in less than 30 seconds so there's not much downside.
bigboxes - Wednesday, December 14, 2016 - link
Shutting down your computer and then starting it back up does not increase the longevity of your computer. Now, if you only use your pc to game once in a while then sure turn the thing off. I work from home and my main rig is doing something all the time. It has an SSD, but also 4xHDDs that I don't want to constantly spin up and spin down. Haswell is fairly energy efficient and there are all lower power states that it goes into during the rare times I'm not crunching something.
The file server... well, that needs to be up all the time. I have multiple devices that hit that all the time. 3 TVs, 3 desktops, 3 tablets, 3 smartphones and remotely from when I'm on the go. Not to mention anyone who hits my FTP server. It's got an SSD for the OS drive and 8 spinners to store all those gee bees.
My HTPC gets put into hibernation when not in use. It pretty much sips on the juice and then wakes up in less than 30 seconds as it loads everything to a state I last left it in. But not turned off. SSD only.
My wife's desktop gets turned off whenever she's not using it or we go to bed (in bedroom). SSD only.
For all the holier-than-thou posters, my electrical bill was $83 last month. If you'd actually read my post instead of had a knee-jerk reaction you'd see where I said that energy savings was important. I just don't come into your home and tell you what to do, not knowing your usage.
Laststop311 - Friday, December 23, 2016 - link
It actually lowers the longevity to turn it on and off everyday. All the silicon chips do better at a fixed temp instead of being cold then hot over and over. HDD's have to park the reading heads and unpark. The PC i'm using now hasn't been off for more than a couple hours in at least 4 years and it's 7 years old i7-980x + fiji pro and not a single hardware failure at all. I did a single GPU upgrade from a radeon 5870 to a fiji pro and on the fiji pro i was able to unlock some of the stream processors. It's within a few fps of a full fiji x. If you use your PC everyday just leave it on.
Demiurge - Monday, December 19, 2016 - link
I agree with you in theory, but in practice, considering that you are essentially talking about the power of 2 versus 3 60W light bulbs -- you probably are already wasting the energy elsewhere. Besides, ever heard of sleep modes if you are that concerned about energy... sheesh!
TheJian - Tuesday, December 13, 2016 - link
If this chip can really match a 6900k at 45w less they should charge $1000 for it period. They are not in business to be your friend, and it's high time they make some CASH while they can. Intel won't take forever to respond. Intel shareholders won't want to see a price war today, they want profits high after losing 4.1B a year on mobile for ages. They won't want those gains that come from quitting that crap to go away again due to a price war. Intel can't afford to take a huge stock price hit at this time either. They are about to lose the fab wars and need all the cash they can get. That is why this is the perfect time to attack for AMD.
Pricing your chips stupid will get them killed. There are people already paying for Intel chips at current prices and they're actually selling a LOT of them. No point in low-balling if you have an equal performing chip at 2/3 watts. I'd probably sell it at price parity at all levels of Intel chips if this is the case. Just make sure you don't screw the low chip with less PCIE lanes and you're golden. People can still get in cheap then and upgrade later to top end stuff or rev2 if AMD makes AM4 last through 10nm starts. Win with stuff like that, not pricing. Price what the market will take and no less. Sure I want a cheap chip, but I also want AMD to make a few billion, get debt free and have a few billion left in the bank cash for the future.
If they price like you're saying they're idiots and management should be fired immediately. If you sell out the first batch, raise the price another $100. Clearly people would pay an extra $100 when Intel charges double that for the 40 lanes and that's on top of the board costs. So AMD would still be doing small favors then. If you're crunching all day on this (like I will be in handbrake etc) then that 50w bulb burning all day will add up over 5yrs of this things life also. The TCO is what I care about.
If Intel's next chip comes Q2, you have 3 months to make as much money as possible then cut prices if needed. Of course, if all Intel manages to do is drop watts to AMD's watts with their next rev, I still wouldn't drop prices until I had units stuck on shelves...LOL. AMD has lost billions (and everything they owned) in the last ~15-17yrs. Time to make some bank. It looks like Intel quads will be 95w, so no low pricing please AMD. Make money.
MobiusPizza - Tuesday, December 13, 2016 - link
Yes and no. AMD is not a charity, they need to make money. But, they also need to gain market share, and to do so they need to price low and sacrifice profitability.
Manch - Wednesday, December 14, 2016 - link
Exactly. AMD needs to turn a profit thru volume, not per chip. AMD has earned themselves a stigma that they can only partially erase by bringing out a performing chip.
They need to win mind share by offering the same for less. This will bring customers back into the fold. Then if they remain competitive, they can increase prices. When this chip does launch, if it's good, hits parity with the competing Intel procs at a lower price, off feature parity, then they will win back customers. If its compete with a big ole * next to it....down the drain they go.
NesuD - Wednesday, December 14, 2016 - link
There will be a price war. Intel has not lowered their prices through 2 new process nodes in any significant way. They have lots of margin built into their pricing because of higher yields so they can drop prices to compete with amd pricing at almost any level and stay profitable.
Manch - Thursday, December 15, 2016 - link
PCWorld has a good point regarding price. http://www.pcworld.com/article/3149101/components-...