Early AMD Zen Server CPU and Motherboard Details: Codename ‘Naples’, 32-cores, Dual Socket Platforms, Q2 2017by Ian Cutress on August 18, 2016 11:40 PM EST
- Posted in
- Enterprise CPUs
At the AMD Zen microarchitecture announcement event yesterday, the lid was lifted on some of the details of AMD’s server platform. The 32-core CPU, codename Naples, will feature simultaneous multithreading similar to the desktop platform we wrote about earlier, allowing for 64 threads per processor. Thus, in a dual socket system, up to 128 threads will be available. These development systems are currently in the hands of select AMD partners for qualification and development.
AMD was clear that we will expect to hear more over the coming months (SuperComputing 2016 is in November 2016, International SuperComputing is in June 2017) with a current schedule to start providing servers in Q2 2017.
Analysing AMD’s 2P Motherboard
AMD showed off a dual socket development motherboard, with two large AMD sockets using eight phase power for each socket as well as eight DDR4 memory slots.
It was not stated if the CPUs supported quad-channel memory at two DIMMs per channel or eight channel memory at this time, and there’s nothing written on the motherboard to indicate which is the case – typically the second DIMM slot in a 2DPC environment is a different color, which would suggest that this is an eight-channel design, however that is not always the case as some motherboard designs use the same color anyway.
However, it is worth noting that each bank of four memory slots on each side of each CPU has four chokes and four heatsinks (probably VRMs) in two sets. Typically we see one per channel (or one per solution), but the fact that each socket seems to have eight VRMs for the memory would also lean into the eight-channel idea. To top it off, each socket has a black EPS 12V (most likely for the CPU), which is isolated and clearly for CPU power, but also a transparent EPS 12V and a transparent 6-pin PCIe connector. These transparent connectors are not as isolated, so are not for low power implementation, but each socket does have one attached, perhaps suggesting that the memory interfaces are powered independently to the CPU. More memory channels would require more power, and four-channel interfaces have been done and dusted before via the single EPS 12V, so requiring even more power raises questions. I have had word in my ear that this may be as a result of support for future high energy memory, such as NVDIMM, although I have not been able to confirm this.
Edit: The transparent EPS 12V could be a PCIe 8-pin in retrospect, but still seems excessive for the power it can provide.
Unfortunately, we could not remove the heatsinks to see the CPUs or the socket, but chances are this demo system would not have CPUs equipped in the first place. Doing some basic math based on the length of a DDR4 module, our calculations show that the socket area (as delineated by the white line beyond the socket) is 7.46 cm x 11.877 cm, to give an area of 88.59 cm2. By comparison, the heatsink has an active fin floor plan area of 62.6 cm2 based on what we can measure. Unfortuantely this gives us no indication of package area or die area, both of which would be more exciting numbers to have.
Putting the CPU, memory and sockets aside, the motherboard has a number of features worth pointing out. There is no obvious chipset or southbridge in play here. Where we would normally expect a chipset, we have a Xilinx Spartan FPGA without a heatsink, although I would doubt this is the chipset based on the fact that there is an ‘FPGA Button’ right above it and this is most likely to aid in some of the debugging elements on the system.
Further to this, the storage options for the motherboard are all located on the left hand side (as seen) right next to one of the CPUs. Eight SATA style ports are here, all in blue which usually indicates that these are part of the same head controller, but also part of the text on the motherboard states ‘ALL SATA CONNS CONNECTED TO P1’ which indicates the first processor (from the main image, left to right, athough P1 is actually the 'second processor') has direct control.
Other typical IO on the rear panel such as a 10/100 network port (for the management) and the USB 3.0 ports are next to the second processor, which might indicate that this processor has IO control over these parts of the system. However the onboard management control, provided by an ASpeed AST2500 controller with access to Elpida memory, is nearer the PCIe slots and the Xilinx FPGA.
The lack of an obvious chipset, and the location of the SATA ports, would point to Naples having the southbridge integrated on die, and creating an SoC rather than a pure CPU. Bringing this on die, to 14nm FinFET, will allow the functions to be in a lower power process (historically chipsets are created at a larger lithography node to the CPU) as well as adjustments in bandwidth and utility, although at the expense of modularity and die area. If Naples has an integrated chipset, it makes some of the findings on the AM4 platform we saw at the show very interesting. Either that or the FPGA is actually used for the developers to change southbridge operation on the fly (or that chipsets are actually becoming more like FPGAs, which is more realistic as chipsets move to PCIe switch mechanisms).
There are a lot of headers and jumpers on board which won’t be of much interest to anyone except the platform testing, but the PCIe layout needs a look. On this board we have four PCIe slots below one of the CPUs, each using a 16 lane PCIe slot. By careful inspection of the pins we can certainly tell that the slots are each x16 electrical.
However the highlighted box gives some insight into the PCIe lane allocation. The text says:
“Slot 3 has X15 PCIe lanes if MGMT PCIe Connected
Slot 3 has X16 PCIe lanes if MGMT PCIe Disconnected”
This would indicate that slot three has a full x16 lane connection for data, or in effect we have 64 lanes of PCIe bandwidth in the PCIe slots. That’s about as far as we can determine here – we have seen motherboards in the past that take PCIe lanes from both CPUs, so at best we can say that in this configuration that the Naples CPU has between 32 lanes and 64 lanes for a dual processor system. The board traces, as far as we were able to look at the motherboard, did not make this clear, especially when this is a multi-layer motherboard (qualification samples are typically over-engineered anyway). There is an outside chance that the integrated southbridge/IO is able to supply an x16 combination PCIe lane, however there is no obvious way to determine if this is the case (and is not something we’ve seen historically).
AM4 Desktop Motherboards
Elsewhere on display for Zen, we also saw some of the internal AM4 motherboards in the base units at the event.
These were not typical motherboard manufacturer boards from the usual names like ASUS or GIGABYTE, and were very clearly internal use products. We weren’t able to open up the cases to see the boards better, but on closer inspection we saw a number of things.
First, there were two different models of motherboards on show, both ATX but varying a little in the functionality. One of the boards had twelve SATA ports, some of which were in very odd locations and colors, but we were unable to determine if any controllers were on board.
Second, each of the boards had video outputs. This would be because we already know that the AM4 platform has to cater for both Bristol Ridge and Summit Ridge, with the former being APU based with integrated graphics and the updated Excavator v2 core design. On one of the motherboards we saw two HDMI outputs and a DisplayPort output, suggesting a full 3-digital display pipeline for Bristol Ridge.
The motherboards were running 2x8GB of Micron memory, running at DDR4-2400. Also, the CPU coolers – AMD was using both its 125W AMD Wraith cooler as well as the new 95W near silent cooler between all four/five systems on display. This pegs these engineering samples at a top end of this TDP, but if recent APU and FX product announcements are anything to go by, AMD is happy to put a 125W cooler on a 95W CPU, or a 95W cooler on a 65W CPU if required.
I will say one thing that has me confused a little. AMD has been very quiet on the chipset support for AM4, and what IO the south bridge will have on the new platform (and if that changes if a Bristol or Summit Ridge CPU is in play at the time). In the server platform, we concluded above that the chipset is likely integrated into the CPU – if that is true on the consumer platform as well, then I would point to the chipset-looking device on these motherboards and start asking questions. Typically the chipset on a motherboard is cooled by a passive heatsink, but these chips here had low z-height on fans them and were running at quite the rate. I wonder if they were like this so when the engineers use the motherboards it means there is more space to plug testing tools, or if it for another purpose entirely. As expected, AMD said to expect more information closer to launch.
To anyone who says motherboards are boring, well I think AMD has given a number of potential aspects of the platform away in merely showing a pair of these products for server and desktop. Sure, they answer some questions and cause a lot more of my hair to fall out trying to answer the questions that arise, but at this point it means we can start to have a fuller understanding of what is going on beyond the CPU.
As for server based Zen, Naples, depending on PCIe counts and memory support, along with the cache hierarchy we discussed in the previous piece, the prospect of it playing an active spot in enterprise seems very real. Unfortunately, it is still a year away from launch. There are lots of questions about how the server parts will be different, and how the 32-cores on the SKUs that were talked about will be arranged in order to shuffle memory around at a reasonable rate – one of the problems with large core count parts is being able to feed the beast. AMD even used that term in their presentation, meaning that it’s clearly a topic they believe they have addressed.
Post Your CommentPlease log in or sign up to comment.
View All Comments
jhh - Friday, August 19, 2016 - linkOne question is if they have included many of the features Intel has added for improved virtualization support, such as direct DMA into cache (DPDK support), improved end-of-interrupt processing in VM without significant work required in the host OS, and other similar changes.
slickr - Friday, August 19, 2016 - linkWe don't know yet, I suspect we'll find out in late 2016, early 2017.
Einy0 - Friday, August 19, 2016 - linkI was wondering the same thing. This thing could be a beast for VMs. I certainly hope they have worked on their Virtualization extensions. Virtualization would seemingly be the largest usage application to take advantage of this kind of core/thread density.
nunya112 - Friday, August 19, 2016 - linkin regards to chipset support. its all on the CPU no SB will be needed unless it interfaces with SATA or external USB controllers. hence why they are having trouble with the track lengths on the motherboards
SunLord - Friday, August 19, 2016 - linkHas anyone seen any rumors for how many pins the new server socket has?
The_Assimilator - Friday, August 19, 2016 - linkInteresting layout of the IO panel on that prototype board - in particular, the fact that the audio outputs seem to be located closest to the CPU socket, whereas on current boards the trend is to put them at the other end for isolation purposes. I wonder if there's some special sauce there, or if it's just done this way for prototyping purposes.
ThortonBe - Friday, August 19, 2016 - linkThe article says, "with access to Elpida memory", but didn't Micron buy Elpida in 2013?
mdw9604 - Friday, August 19, 2016 - linkUntil we get some clarity around what their performance/watt is with Zen, which as been AMD/IBM's Achilles heel and their problem with getting any traction in the server/HPC market in the past 10 years, this whole Zen buildup is a waste of everybody's time.
Michael Bay - Saturday, August 20, 2016 - linkPerformance was much more important as a factor; then again, those two things are closely connected.
TDP-wise, look at how 480 is doing. It will be better due to the shrink, but not amazingly better.
mdw9604 - Saturday, August 20, 2016 - linkIn data centers where core counts can hit the millions, cycles/watt is a big deal. Power cost are massive and will make or break chip makers chances. If they can't get close to current Xeon's they will be marginalized in niche area. I am pulling for them. We need a break in Intel's monopoly on the highend x86 market.