The Intel Xeon E5 v4 Review: Testing Broadwell-EP With Demanding Server Workloads
by Johan De Gelas on March 31, 2016 12:30 PM EST
Posted in: CPUs, Intel, Xeon, Enterprise, Enterprise CPUs, Broadwell
A Modest Tick
As Broadwell is a tick (a die shrink of an existing architecture rather than a new architecture), you should expect only modest IPC improvements. Most Xeon E5 v4 SKUs have slightly lower clock speeds than their Haswell-based v3 brethren, so overall single-threaded performance has hardly improved. Clock for clock, Intel tells us that its simulation tools show Broadwell delivering about 5% better performance per clock in non-AVX2 traces.
First Y-axis + bars: simulated single threaded performance improvement. Blue line + second Y-axis is the cumulative improvement.
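A rough way to see why a per-clock gain of about 5% can leave single-threaded performance nearly flat is to model performance as IPC times clock speed. The sketch below is only that simple model; the two clock speeds are illustrative values assumed for the example, not figures taken from this article.

```python
def relative_single_thread_perf(ipc_gain: float,
                                old_clock_ghz: float,
                                new_clock_ghz: float) -> float:
    """Return new/old single-threaded performance under a perf = IPC * clock model."""
    return (1.0 + ipc_gain) * (new_clock_ghz / old_clock_ghz)


# Assumed, illustrative clocks: a 2.3 GHz Haswell part vs. a 2.2 GHz Broadwell part.
ratio = relative_single_thread_perf(ipc_gain=0.05,
                                    old_clock_ghz=2.3,
                                    new_clock_ghz=2.2)
print(f"Relative single-threaded performance: {ratio:.3f}")
```

With these assumed numbers, the 5% IPC gain is almost entirely cancelled by the 100 MHz lower clock.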
In that sense, Broadwell is basically a Haswell made on Intel's 14nm second generation tri-gate transistor process. Intel did make a few subtle improvements to the micro-architecture:
- Faster divider: lower latency & higher throughput
- AVX multiply latency has decreased from 5 to 3 cycles
- Bigger TLB (1.5k vs 1k entries)
- Slightly improved branch prediction (as always)
- Larger scheduler (64 vs 60)
None of these changes will yield large performance gains. The larger gains must come from other features.
New Features
Compared to Haswell-EP, Broadwell-EP also includes some new features. The first one is the improved power control unit.
On Haswell, a single AVX instruction on one core forced all cores on the same socket to lower their clock speed by around 2 to 4 speed bins (-200 to -400 MHz) for at least 1 ms, because AVX has a higher power requirement that reduces how far a CPU can turbo. On Broadwell, only the cores that run AVX code reduce their clock speed, allowing the other cores to run at higher speeds.
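The difference between the two behaviours can be illustrated with a toy model. This is a minimal sketch, not a description of how the power control unit actually works: the 3000 MHz turbo clock and the 3-bin penalty are hypothetical values, one speed bin is taken as 100 MHz, and the 1 ms hold time is ignored.

```python
def core_clocks_mhz(runs_avx, turbo_mhz=3000, avx_bins=3, per_core_avx=True):
    """Toy model of the AVX clock reduction.

    runs_avx:     list of booleans, one per core, True if that core runs AVX code.
    turbo_mhz:    hypothetical non-AVX turbo clock.
    avx_bins:     hypothetical AVX penalty in 100 MHz speed bins.
    per_core_avx: True models the Broadwell behaviour (only AVX cores slow down),
                  False models the Haswell behaviour (the whole socket slows down).
    """
    penalty = avx_bins * 100
    if per_core_avx:
        return [turbo_mhz - penalty if avx else turbo_mhz for avx in runs_avx]
    # Haswell-style: a single AVX core pulls every core on the socket down.
    socket_penalty = penalty if any(runs_avx) else 0
    return [turbo_mhz - socket_penalty for _ in runs_avx]


workload = [True, False, False, False]  # one AVX core, three non-AVX cores
haswell = core_clocks_mhz(workload, per_core_avx=False)
broadwell = core_clocks_mhz(workload, per_core_avx=True)
print("Haswell-style:  ", haswell)
print("Broadwell-style:", broadwell)
```

In this model the three non-AVX cores keep their full turbo clock in the Broadwell case, which is where mixed AVX and non-AVX workloads stand to gain.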
The other performance feature is the vastly improved PCLMULQDQ (carry-less multiplication) instruction: throughput has been doubled, and latency reduced from 7 cycles to 5.
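For readers unfamiliar with carry-less multiplication, here is a minimal software sketch of the operation. It works like schoolbook binary multiplication, except that the partial products are combined with XOR instead of addition, so no carries propagate. The function name `clmul` is ours; the hardware instruction does this on 64-bit operands and produces a 128-bit result, while this sketch simply uses Python's arbitrary-size non-negative integers.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less multiply of two non-negative integers.

    Each set bit of b contributes a shifted copy of a, and the copies
    are combined with XOR rather than with addition.
    """
    result = 0
    shift = 0
    while b:
        if b & 1:
            result ^= a << shift
        b >>= 1
        shift += 1
    return result


print(bin(clmul(0b101, 0b11)))  # 0b1111
print(bin(clmul(0b11, 0b11)))   # 0b101 (ordinary multiplication gives 0b1001)
```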
This increases AES (symmetric) encryption performance by 20-25%, and CRCs (cyclic redundancy checks) are up to 90% faster. Broadwell also has the new ADCX/ADOX instructions to speed up asymmetric encryption algorithms such as the popular RSA. These improvements are implemented in OpenSSL 1.0.2-beta3. But don't expect too much from them. The compute-intensive asymmetric encryption is mostly used to initiate a secure connection. Most modern web applications keep their sessions "alive", and as a result, events that require asymmetric encryption happen a lot less frequently. Symmetric encryption (like AES), which is used to send encrypted data, is a lot lighter, so even on a fully encrypted website with long encrypted data streams, encryption is only a small percentage (<5%) of the total computing load.
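CRCs benefit from a faster carry-less multiply because a CRC is the remainder of a polynomial division in which addition is XOR. The sketch below is a plain bit-at-a-time reference implementation of the common CRC-32, checked against Python's `zlib.crc32`. It only shows what is being computed; it is not the PCLMULQDQ-based algorithm that optimized libraries use, and the function name `crc32_bitwise` is ours.

```python
import zlib


def crc32_bitwise(data: bytes) -> int:
    """Bit-at-a-time CRC-32 (reflected polynomial 0xEDB88320)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; XOR in the polynomial if that bit was set.
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


message = b"123456789"
print(hex(crc32_bitwise(message)))
print(crc32_bitwise(message) == zlib.crc32(message))
```

The inner loop handles one bit per iteration, which is the kind of serial work that carry-less multiplication lets optimized implementations process in much larger chunks.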
112 Comments
jhh - Thursday, March 31, 2016 - link
The article says TSX-NI is supported on the E5, but if one looks at Intel ARK, it says it's not. Do the processors say they support TSX-NI? Or is this another one of the things which will be left for the E7?
JohanAnandtech - Friday, April 1, 2016 - link
Intel's official slides say: "supports TSX". All SKUs, no exceptions.
Oxford Guy - Thursday, March 31, 2016 - link
Bigger, badder, still obsolete cores.
patrickjp93 - Friday, April 1, 2016 - link
Obsolete? Troll.
Oxford Guy - Tuesday, April 5, 2016 - link
Unlike you, propagandist, I know what Skylake is.
benzosaurus - Thursday, March 31, 2016 - link
"You can replace a dual Xeon 5680 with one Xeon E5-2699 v4 and almost double your performance while halving the CPU power consumption."
I mean you can, but you can buy 4 X5680s for a quarter the price of a single E5-2699v4. It takes a lot of power savings to make that worthwhile. The pricing in the server market's always seemed weirdly non-linear to me.
warreo - Friday, April 1, 2016 - link
Presumably, it's not just about TCO. Space is at a premium in a datacenter, and so being able to fit more performance per sq ft also warrants a higher price, just like how notebook parts have historically been more expensive than their desktop equivalents.
ShieTar - Friday, April 1, 2016 - link
But you don't get 4 1366-Systems for the price of one 2011-3 System. Depending on your Memory, Storage and Interconnect Needs, even two full Systems based on the Xeon 5680 may cost you more than one system based on the E5-2699 v4. One less Infiniband-Adapter can easily save you 500$ in Hardware.
And you are not only halving the CPU power consumption, but also the power consumption of the rest of the system that you no longer use, so instead of 140W you are saving probably at least 200W per System, which can already add up to more than 1k$ in electricity and cooling bills for a 24/7 machine running for 3 years.
And last, but by no means least, less parts means less space, less chance for failure, less maintenance effort. If you happily waste a few hours here or there to maintain your own workstation, you don't do the math, but if you have to pay somebody to do it, salaries matter quickly. With an MTBF for an entire server rarely being much higher than 40.000 hours, and recovery/repair easily taking you a person-day of work, each system generates about 1.7 hours of work per year. Cost of work (it's more than salaries, of course) probably comes up to 100$ per hour for a skilled technical administrator, thus producing another 500$ over 3 years of added operational cost.
And of course, space matters as well. If your data center is filled, it can be more cost effective to replace the old CPUs with new expensive ones, rather than build a new facility to fill with more old Systems.
If you add it all up, I doubt you can get a System with an Xeon 5680 and operate it over 3 years for anything below 20.000$. So going from two 20.000$ Systems to a single 24.000$ System (because of an extra 4000$ for the big CPU) should save you a lot of money in the long run.
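The back-of-the-envelope figures in this comment can be checked with a short sketch. The 200 W saving, the 40.000 MTBF (read here as hours), the person-day of repair and the 100$ cost of work (read here as per hour) come from the comment; the electricity price and the cooling overhead factor are assumptions added for the example, so the energy result only shows the order of magnitude.

```python
HOURS_PER_YEAR = 24 * 365
YEARS = 3

# Figures taken from the comment above.
power_saved_w = 200
mtbf_hours = 40_000
repair_hours_per_failure = 8   # "a person-day of work", assumed to mean 8 hours
labor_cost_per_hour = 100      # dollars, read as an hourly cost

# Assumptions that are NOT in the comment.
electricity_usd_per_kwh = 0.10
cooling_overhead = 2.0         # total facility power divided by IT power

# Energy saved over three years of 24/7 operation, including cooling.
energy_kwh = power_saved_w / 1000 * HOURS_PER_YEAR * YEARS
energy_cost = energy_kwh * electricity_usd_per_kwh * cooling_overhead

# Expected repair work: failures per year times hours per repair.
repair_hours_per_year = HOURS_PER_YEAR / mtbf_hours * repair_hours_per_failure
labor_cost = repair_hours_per_year * YEARS * labor_cost_per_hour

print(f"Energy saved: {energy_kwh:.0f} kWh, about {energy_cost:.0f}$ with cooling")
print(f"Repair work: {repair_hours_per_year:.2f} h/year, about {labor_cost:.0f}$ over {YEARS} years")
```

Under these assumptions the results land close to the comment's own estimates of more than 1k$ for power and cooling, about 1.7 hours of work per year, and roughly 500$ of labor over 3 years.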
JohanAnandtech - Friday, April 1, 2016 - link
Where do you get your pricing info from? I can not imagine that server vendors still sell X5680s.
extide - Friday, April 1, 2016 - link
Yeah, if you go used. No enterprise sysadmin worth his salt is ever going to put used gear that is not in warranty, and in support into production.