Migrating from Broadwell to Skylake-SP:

More Cores, More Memory, Cache, AVX-512

The way that Intel is approaching the Xeon D product family is changing. Previously, the Broadwell-based Xeon parts were compact, with HEDT level core counts and a nice feature set. For this generation, Intel has decided to migrate to the server Skylake-SP core, rather than the standard Skylake-S core. Along with the generational enhancements over Broadwell, this means an adjusted cache hierarchy, use of Intel’s new core-to-core mesh technology, and the addition of AVX-512 units. This means that the new Xeon D-2100 series is, in silicon, a big 18-core Skylake-SP behemoth. We have reached out to Intel to confirm if this is the case for all the cores in the stack, as well as die sizes and transistor counts. Stay tuned for that information.

As we reported in depth in our analysis of the Skylake-SP core, the implementation of the cache, mesh, and AVX-512 changes, has a significant impact on how software has to consider running on these cores when it comes to memory accesses and core-to-core communication. Here’s a small brief primer on the changes:

Migrating from Broadwell to Skylake-SP
  D-1500 D-2100 Overall
Cache L2 256 KB L2 1 MB L2 Rule-of-thumb 2x hit-rate in L2

Overall L3 not as useful
L3 1.5 MB L3/core 1.375 MB L3/core
  L3 inclusive of L2 L3 is victim cache
Mesh / Uncore Ring bus Each node
(PCIe / core / DRAM)
has crossbar partition
for x/y routing
- Better scaling
- Better av. node-to-node latency

- More complex 
- More power at same frequency
AVX-512 AVX
AVX 2.0
AVX512F
AVX512CD
AVX512VW
AVX512DQ
AVX512VL
Substantial vector performance improvement

Increase in peak TDP
and silicon die area

A note on the AVX-512 support: Intel’s current consumer and enterprise-based Skylake-SP processors offer two different variants for this. Some processors, most notably the cheaper ones, only have one 512-bit FMA (fused multiply-add) execution unit for AVX-512 support, while the bigger and more expensive processors have two FMA units. The benefit of having two FMA units in action allows the AVX-512 silicon to be fed, and increases throughput, at the potential expense of power and longevity. For Xeon D-2100, Intel has stated that all of the processors only have a single 512-bit FMA unit.

Also on the mesh: compared to the old ring bus methodology, from our tests on the consumer line, it is clear that Intel is running the mesh (or the ‘uncore’) at a lower frequency overall, which may cause a drop in core-to-core bandwidth in the newer processors. We have asked Intel to confirm what mesh frequency is being used in the D-2100 series, and we are waiting to see if that information will be disclosed. It is worth noting that when we get access to these parts, we can probe for the frequency very easily, so it would help if Intel officially disclosed the value.

With the core migration comes other feature changes. The new Xeon D-2100 will be rated for quadruple the memory capacity of the Xeon D-1500, as the number of memory channels doubles from two to four, and RDIMM/LRDIMM modules up to 64 GB each are supported. This means a single processor can now support 512GB using eight 64GB RDIMMs. The previous generation only supported 32 GB RDIMMs.

 

Up to 32 PCIe 3.0 Lanes, 20 HSIO Lanes, Storage and VROC

For this generation, Intel has kept the number of publicly available PCIe lanes for add-in controllers at 32, which means we are likely to see implementations with x16/x8 PCIe lanes, but also opens up opportunities in cold/warm storage for more RAID cards, or in communications for additional transceivers or accelerators. When we say ‘publicly’ available here, it is clear that the chip has more than the previous version, the presence of QuickAssist means that there is likely at least 48 as part of the design (or 64 if the silicon is identical to the Xeon SP XCC chip design), but due to product segmentation/items like QAT, the amount of lanes for other controllers is kept constant. To a certain extent, this allows Intel to offer the D-2100 series almost as a drop in replacement for those that want to upgrade to Skylake cores.


The Lewisberg Chipset with 26 HSIO lanes, found in Xeon SP

As Xeon D is marketed as a system-on-chip, the traditional chipset is integrated into the platform. Intel has integrated one of its latest series of chipsets, and is offering 20 PCIe 3.0 High-Speed IO (HSIO) lanes for this. As with the chipsets, there will be limits as to where the lanes can go: these are typically limited to a PCIe 3.0 x4 connection at most, and some network controllers are limited to certain HSIO slots, but it does allow for intricate systems to be built.

One of the benefits of the number of PCIe lanes, as well as PCIe switch support, is for storage. Intel is targeting both long-term backup (cold storage) and content delivery networks/CDNs (warm storage) with this product line, and so Intel is keen to promote its PCIe storage and NVMe support. We confirmed that the Xeon D-2100 will be supporting Intel’s Virtual Raid on Chip (VROC), which means that hardware-based RAID 0 and RAID 1 configurations will have additional support benefits, but will be limited to specific NVMe drives and require a hardware based VROC key provided by the OEM. Intel also states that Xeon D-2100 will have fourteen SATA ports integrated, an increase from six SATA ports on the previous generation, although Intel has not disclosed how many AHCI controllers this is, or the SATA RAID support for these controllers on the platform. The platform also supports some legacy IO: eSPI, LPC and SMBus

Either way we slice it, Xeon D-2100 is coming across as a Skylake-SP HCC core and a Lewisburg chipset either melded into one, or two chips on the same package. Lewisburg has options available for different levels of QuickAssist and 10GbE, just as the Xeon D-2100 series. In order to get QAT and 10GbE, the Xeon SP platform has to provide 16 PCIe lanes from the CPU to the chipset for bandwidth - we know the Xeon SP HCC core has 64 PCIe lanes total, so if 16 each are used for QAT/10GbE, that leaves 32 in play. Which is what Xeon D-2100 has. Lewisberg also supports the same IO: 14 SATA ports. It wouldn't make sense for Intel to create a completely new silicon die just for Xeon D, right? If not, that makes Xeon D a multi-chip package with Xeon Gold and Lewisburg.

Increasing Ethernet and Adding QuickAssist Enterprise Features and Availability
POST A COMMENT

22 Comments

View All Comments

  • wow&wow - Friday, February 9, 2018 - link

    Does Intel Xeon D-2100 need the OS kernel relocation? Are OS patches, if needed, ready for it? Reply
  • acersupport - Thursday, February 15, 2018 - link

    Hello, this very knowledgeable post, I am very happy after reading this article. truly honest I love this post. Because I get more information one Xeon D-2100 motherboard. A part of every person call with Intel. I glad for sharing this post. We are also providing an Acer tech support. We are twenty-four into seven services provide Acer tech support.
    http://customer-support-service.weebly.com/acer-su...
    Reply

Log in

Don't have an account? Sign up now