The Vega Architecture: AMD’s Brightest Day

AMD’s engineers consider Vega to be their most sweeping architectural change in five years. And looking over everything that has been added, it’s easy to see why. In terms of core graphics/compute features, Vega introduces more than any other iteration of GCN before it.

Speaking of GCN, before getting too deep here, it’s interesting to note that at least publicly, AMD is shying away from the Graphics Core Next name. GCN doesn’t appear anywhere in AMD’s whitepaper, while in programmers’ documents such as the shader ISA, the name is still present. But at least for the purposes of public discussion, rather than using the term GCN 5, AMD is consistently calling it the Vega architecture. Though make no mistake, this is still very much GCN, so AMD’s basic GPU execution model remains.

So what does Vega bring to the table? Back in January we got what has turned out to be a fairly extensive high-level overview of Vega’s main architectural improvements. In a nutshell, Vega is:

  • Higher clocks
  • Double rate FP16 math (Rapid Packed Math)
  • HBM2
  • New memory page management for the high-bandwidth cache controller
  • Tiled rasterization (Draw Stream Binning Rasterizer)
  • Increased ROP efficiency via L2 cache
  • Improved geometry engine
  • Primitive shading for even faster triangle culling
  • Direct3D feature level 12_1 graphics features
  • Improved display controllers

The interesting thing is that even with this significant number of changes, the Vega ISA is not a complete departure from the GCN4 ISA. AMD has added a number of new instructions – mostly for FP16 operations – along with some additional instructions that they expect to improve performance for video processing and some 8-bit integer operations, but nothing that radically separates Vega from earlier ISAs. So for compute, Vega is still very comparable to Polaris and Fiji in how data moves through the GPU.
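To make the "packed" part of those FP16 additions concrete, here is a minimal sketch in Python of the underlying idea: two 16-bit floats share one 32-bit register slot, and a single packed operation works on both halves at once. This is an emulation for illustration only. The helper names (`pack_half2`, `unpack_half2`, `packed_add`) are hypothetical and are not Vega instructions or part of any AMD API.

```python
import struct

def pack_half2(lo, hi):
    """Pack two FP16 values into one 32-bit word (lo in bits 0-15, hi in bits 16-31)."""
    return struct.unpack("<I", struct.pack("<2e", lo, hi))[0]

def unpack_half2(word):
    """Split a 32-bit word back into its two FP16 values as a (lo, hi) tuple."""
    return struct.unpack("<2e", struct.pack("<I", word))

def packed_add(a, b):
    """Emulate one packed FP16 add: both halves are summed by a single 'operation'."""
    a_lo, a_hi = unpack_half2(a)
    b_lo, b_hi = unpack_half2(b)
    return pack_half2(a_lo + b_lo, a_hi + b_hi)

x = pack_half2(1.5, -2.0)
y = pack_half2(0.25, 4.0)
print(hex(x))                          # 32-bit word holding both halves
print(unpack_half2(packed_add(x, y)))  # per-half sums
```

The point of the sketch is only the data layout: the register width stays at 32 bits, so hardware that can process both halves in one pass gets two FP16 results for the cost of one FP32 slot.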

Consequently, the burning question I think many will ask is whether the effective compute IPC is significantly higher than Fiji’s, and the answer is no. AMD has actually taken significant pains to keep the throughput latency of a CU at 4 cycles (4 stages deep), but strictly speaking, existing code isn’t going to run any faster on Vega, clock for clock, than on earlier architectures. In order to wring the most out of Vega’s new CUs, you need to take advantage of the new compute features. Note that this doesn’t mean that compilers can’t take advantage of them on their own, but especially where datatypes are concerned, it’s important that code be designed for lower precision datatypes to begin with.
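As a rough back-of-the-envelope sketch of that trade-off, the Python snippet below computes peak throughput as lanes × 2 FLOPs (one fused multiply-add) × clock, and doubles it only when two FP16 values are packed per lane. The lane count and clock are illustrative assumptions rather than figures from this page, and `peak_tflops` and `to_fp16` are hypothetical helper names. The second half shows why code has to be designed for FP16 from the start: the format cannot hold every value FP32 can.

```python
import struct

def peak_tflops(lanes, clock_ghz, values_per_lane=1):
    """Peak TFLOPS: each lane retires one FMA (2 FLOPs) per clock per packed value."""
    return lanes * 2 * values_per_lane * clock_ghz / 1000.0

def to_fp16(x):
    """Round a Python float to the nearest FP16 value and return it as a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

LANES = 4096      # assumed ALU lane count, for illustration only
CLOCK_GHZ = 1.5   # assumed clock, for illustration only

fp32 = peak_tflops(LANES, CLOCK_GHZ)                     # unchanged FP32 code path
fp16 = peak_tflops(LANES, CLOCK_GHZ, values_per_lane=2)  # packed FP16 code path
print(fp32, fp16)

# FP16 keeps only 11 significant bits, so large integers and most
# decimal fractions get rounded when stored.
print(to_fp16(4097.0), to_fp16(0.1))
```

Unmodified FP32 code sits on the first line of that calculation regardless of the new hardware, while the doubled figure is only reachable by code whose data tolerates the reduced range and precision.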


213 Comments


  • rtho782 - Monday, August 14, 2017 - link

    First? lol
  • FireSnake - Monday, August 14, 2017 - link

    Good! Now, let us read this in peace :)
  • coolhardware - Monday, August 14, 2017 - link

    Exactly. I am VERY excited to read about this, especially since AMD has been dragging this launch out for what seems like forever.

    While reading I will also have another window open furiously refreshing http://amzn.to/2hZ9iPb (shortened URL for direct amd vega search on Amazon!) to see when they come in stock, and if we can get one before they sell out! ;-)

    WOW, just checked and NewEgg is already out of EVERY Vega SKU :-( Like 15 different models from various brands :-( Bummer and I bet 80% are miners!
  • coolhardware - Monday, August 14, 2017 - link

    BestBuy sold out of all of their SKUs as well. :-(
  • Targon - Monday, August 14, 2017 - link

    I ran into the Out of Stock, auto-notify on Newegg for hours....and suddenly one showed up that I could actually buy. So, I hit it, and it has been in packaging for the past five hours. Amazon really messed up with the Ryzen launch, allowing far more orders than the expected number of Ryzen 7 chips, to the point where it took several additional weeks before some of them shipped out. That is why I won't order a highly anticipated item from Amazon.
  • Manch - Tuesday, August 15, 2017 - link

    I ordered the Oculus package, the $399 one from Amazon on July 12th. They shipped the controllers two days ago. headset is out of stock until further notice. It was in stock when I ordered. Then it was all orders before July 15th will be filled first. Then it was the touch controllers are out of stock. Then the touch controllers ship but the headset is out of stock. Aggravating to say the least. They are one of the few that ships electronics to APO without being shitty about it or charging triple of actual costs.
  • coolhardware - Tuesday, August 15, 2017 - link

    Way to stick with it! Did Best buy complete your order? Fingers crossed for you :-)
  • rtho782 - Monday, August 14, 2017 - link

    I think the GTA5 1440p benchmarks and the BF1 load power consumption graphs made me laugh the most.

    I guess it's a pretty effective space heater. Maybe they want to discourage crypto mining by using more power to make it unprofitable.

    It's a shame, we need more competition. *sigh*
  • Ratman6161 - Monday, August 14, 2017 - link

    295 watts..?!?!?! Currently my whole system only pulls about 225 watts even when torture testing. That testing only includes CPU and RAM, but other articles say my RX460 is about 104 watts during torture testing. So if I was stress testing CPU, RAM and video card all at once I'd be at around 329. Not a gamer myself but it's hard for me to imagine over 500 watts for my system. Just doesn't make any sense in this day and age.
  • Kratos86 - Monday, August 14, 2017 - link

    Hmm you either don't understand how crypto mining works or what a joke is. Cryptominers generally turn the GPU clock down because it isn't very useful in these situations, even bandwidth isn't as relevant as latency. These cards with a bit of tweaking are getting 35 mh/s at $35 for $500. The Vega 56 blows the 64 away but both GPU's beat the RX 580 in terms of bang for buck and that's considering they haven't been optimised for mining performance yet.

    If these things hit 40 at $500 a piece, two for $1000, that's 80 mh/s for less than a Titan XP which at a cost of $1370 does around 37 mh/s. Saving $50 a year on power consumption and paying double the price for that privilege is not a very intelligent way to do things.

    Suffice to say if you want one of these at the prices they are supposed to be selling at, you might get lucky and find one sometime this year because you are not finding these GPU's at these prices anytime soon and thats if they aren't sold out at any price. Unless AMD do something to get this in stock and keep it in stock the next few months are going to suck if you want one of these at prices that aren't inflated.

    I guess AMD could have worse problems than "cryptominers keep buying our GPUs faster than we can make them" but it's still a situation they need to remedy.
