The Vega Architecture: AMD’s Brightest Day

AMD’s engineers consider Vega to be their most sweeping architectural change in five years. And looking over everything that has been added, it’s easy to see why. In terms of core graphics/compute features, Vega introduces more than any other iteration of GCN before it.

Speaking of GCN, before getting too deep here, it’s interesting to note that at least publicly, AMD is shying away from the Graphics Core Next name. GCN doesn’t appear anywhere in AMD’s whitepaper, while in programmers’ documents such as the shader ISA, the name is still present. But at least for the purposes of public discussion, rather than using the term GCN 5, AMD is consistently calling it the Vega architecture. Though make no mistake, this is still very much GCN, so AMD’s basic GPU execution model remains.

So what does Vega bring to the table? Back in January we got what has turned out to be a fairly extensive high-level overview of Vega’s main architectural improvements. In a nutshell, Vega is:

  • Higher clocks
  • Double rate FP16 math (Rapid Packed Math)
  • HBM2
  • New memory page management for the high-bandwidth cache controller
  • Tiled rasterization (Draw Stream Binning Rasterizer)
  • Increased ROP efficiency via L2 cache
  • Improved geometry engine
  • Primitive shading for even faster triangle culling
  • Direct3D feature level 12_1 graphics features
  • Improved display controllers

The interesting thing is that even with this significant number of changes, the Vega ISA is not a complete departure from the GCN4 ISA. AMD has added a number of new instructions – mostly for FP16 operations – along with some additional instructions that they expect to improve performance for video processing and some 8-bit integer operations, but nothing that radically separates Vega from earlier ISAs. So in terms of how data moves through the GPU, Vega’s compute side is still very comparable to Polaris and Fiji.

Consequently, the burning question I think many will ask is whether the effective compute IPC is significantly higher than Fiji’s, and the answer is no. AMD has actually taken significant pains to keep the throughput latency of a CU at 4 cycles (4 stages deep). Strictly speaking, however, existing code isn’t going to run any faster per clock on Vega than on earlier architectures. In order to wring the most out of Vega’s new CUs, you need to take advantage of the new compute features. Note that this doesn’t mean that compilers can’t take advantage of them on their own, but especially where datatypes are concerned, it’s important that code be designed for lower precision datatypes to begin with.
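To make the point about lower precision datatypes concrete, here is a minimal sketch of the idea behind double rate FP16: two 16-bit floats share one 32-bit word, and a single packed operation works on both halves at once. This is a plain Python emulation using only the standard library, not AMD’s ISA or compiler output; the function names (`pack_half2`, `unpack_half2`, `packed_add`, `to_half`) and the low/high bit layout are illustrative assumptions.

```python
import struct

def pack_half2(lo, hi):
    """Pack two values as FP16 into one 32-bit word.

    Assumed layout for illustration: `lo` in bits 0-15, `hi` in bits 16-31.
    """
    # '<2e' writes two little-endian IEEE 754 half floats (4 bytes total),
    # which are then reinterpreted as a single unsigned 32-bit integer.
    return struct.unpack('<I', struct.pack('<2e', lo, hi))[0]

def unpack_half2(word):
    """Split a 32-bit word back into its two FP16 values (as Python floats)."""
    return struct.unpack('<2e', struct.pack('<I', word))

def to_half(x):
    """Round a value to the nearest representable FP16 value."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

def packed_add(a, b):
    """Emulate one packed add: two FP16 additions on one pair of 32-bit words."""
    a_lo, a_hi = unpack_half2(a)
    b_lo, b_hi = unpack_half2(b)
    # Each sum is rounded back to FP16 when it is re-packed.
    return pack_half2(a_lo + b_lo, a_hi + b_hi)

x = pack_half2(1.5, -2.0)
y = pack_half2(0.25, 4.0)
print(unpack_half2(packed_add(x, y)))  # two results from one packed operation
print(to_half(2049.0))                 # FP16 cannot represent 2049 exactly
```

The second print shows why code has to be designed for FP16 from the start: with only 11 bits of significand precision, 2049.0 rounds to 2048.0, so values that need more range or precision cannot simply be dropped into the packed path.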


213 Comments


  • FourEyedGeek - Tuesday, August 22, 2017 - link

    Your reflexes aren't fast enough
  • Aldaris - Monday, August 14, 2017 - link

    NV fanboy alert.

    Tell me, in what world did those results suggest to you it's slower?
  • Manch - Tuesday, August 15, 2017 - link

    ddriver calls them an Intel/Nvidia shill.
    Vladx calls them an AMD/Apple shill

    I think it was fair and balanced :D
  • sor - Monday, August 14, 2017 - link

    Performance wise it actually seems pretty good. People were worried it wouldn't even be able to compete with a 1080, but in many cases it slots between the 1080 and the Ti. The killer though is that power consumption. Burning 100+ more watts is insane. Otherwise, seems like it was a nice, competitive card.
  • blublub - Monday, August 14, 2017 - link

    This excessive power draw is, and many ppl forget that, node related.

    It's the same as with Ryzen:
    GloFo's 14nm is low power plus! Meaning it's very power efficient up to a certain frequency but once it surpasses it it drinks electricity like an elephant in steroids.

    It can be seen with Ryzen and Polaris, drop frequency and voltage and power goes down more than proportionally.

    AMD just didn't have enough money and was bound to GloFo so they couldn't take out different GPU sizes and on a different process
  • FreckledTrout - Monday, August 14, 2017 - link

    Yeah but they do have a shining light in that IBM bought 7nm process, its high frequency should really help both AMD's GPU and CPU's a lot.
  • Manch - Tuesday, August 15, 2017 - link

    You can't drink electricity. I get your point though.

    Make like a tree and get the out of here!
  • Yojimbo - Monday, August 14, 2017 - link

    It'll be interesting to see how much game developers take advantage of double rate FP16. Maybe there are some bottlenecks that can be alleviated without impacting quality much.
  • beck2050 - Monday, August 14, 2017 - link

    Over clocking seems very limited with that power draw. Custom 1080s are often 10 to 15% faster out of the box and still cooler and less power hungry.
    A bit disappointing.
  • mapesdhs - Monday, August 14, 2017 - link

    I mentioned that elsewhere, in the UK a 1080 with a 1759MHz base is 60 UKP cheaper than a Vega64/Air, and one can get a 1080 Ti for the price of a Vega64/Liquid.
