NVIDIA’s CEO Jensen Huang has over the years become increasingly known for his giveaway antics at AI conferences. In recent years the CEO has unveiled both the NVIDIA Titan X (Pascal) and the NVIDIA Titan V in this fashion. And now you can add one more reveal to this list, as last evening Huang gave out 20 units of a new Titan V SKU, the Titan V CEO Edition, at the Computer Vision and Pattern Recognition conference in Salt Lake City.

According to NVIDIA, the aptly named SKU is apparently a “limited edition” product, and unlike past Huang reveals, NVIDIA has not sent out any announcements of a new product. So for the moment, this is not a retail product and is not immediately expected to become one. Nonetheless, this is an unusual development, as the new Titan V SKU is not simply a Titan V with additional memory, but rather has some notable configuration differences that set it apart from the regular Titan V.

NVIDIA Compute Accelerator Specification Comparison

|  | Titan V CEO Edition | Titan V | Tesla V100 (PCIe) | Titan Xp |
|---|---|---|---|---|
| CUDA Cores | 5120? | 5120 | 5120 | 3840 |
| Tensor Cores | 640? | 640 | 640 | N/A |
| ROPs | 128 | 96 | 128 | 96 |
| Core Clock | 1200MHz? | 1200MHz | ? | 1485MHz |
| Boost Clock | 1455MHz? | 1455MHz | 1370MHz | 1582MHz |
| Memory Clock | 1.7Gbps HBM2? | 1.7Gbps HBM2 | 1.75Gbps HBM2 | 11.4Gbps GDDR5X |
| Memory Bus Width | 4096-bit | 3072-bit | 4096-bit | 384-bit |
| Memory Bandwidth | 900GB/sec? | 653GB/sec | 900GB/sec | 547GB/sec |
| VRAM | 32GB | 12GB | 16GB | 12GB |
| L2 Cache | 6MB | 4.5MB | 6MB | 3MB |
| Single Precision | 13.8 TFLOPS | 13.8 TFLOPS | 14 TFLOPS | 12.1 TFLOPS |
| Double Precision | 6.9 TFLOPS (1/2 rate) | 6.9 TFLOPS (1/2 rate) | 7 TFLOPS (1/2 rate) | 0.38 TFLOPS (1/32 rate) |
| Tensor Performance (Deep Learning) | 125 TFLOPS | 110 TFLOPS | 112 TFLOPS | N/A |
| GPU | GV100 (815mm²) | GV100 (815mm²) | GV100 (815mm²) | GP102 (471mm²) |
| Transistor Count | 21.1B | 21.1B | 21.1B | 12B |
| TDP | 250W? | 250W | 250W | 250W |
| Form Factor | PCIe | PCIe | PCIe | PCIe |
| Cooling | Active | Active | Passive | Active |
| Manufacturing Process | TSMC 12nm FFN | TSMC 12nm FFN | TSMC 12nm FFN | TSMC 16nm FinFET |
| Architecture | Volta | Volta | Volta | Pascal |
| Launch Date | 6/20/2018 | 12/07/2017 | Q3'17 | 04/07/2017 |
| Price | N/A | $2999 | ~$10000 | $1299 |

Because this isn’t a retail SKU – at least not yet – NVIDIA hasn’t published official specifications for the card, so most of our table above is pending confirmation. However, based solely on the 32GB VRAM capacity, we can safely infer two very important points.

  1. NVIDIA is using new 8-Hi HBM2 memory stacks, as with their 32GB Tesla cards
  2. Titan V CEO Edition has all 4 of its ROP/Memory Controller partitions enabled, up from 3 on the retail Titan V
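
The step from capacity to partition count can be checked with a little arithmetic. The sketch below is only an illustration, not anything NVIDIA has published: it assumes each HBM2 stack sits behind one 1024-bit ROP/memory controller partition, that GV100 has four stack sites, and that stacks come in 4GB (4-Hi) or 8GB (8-Hi) capacities. The function name `possible_configs` is made up for this example.

```python
# Assumptions (not from NVIDIA's announcement): HBM2 stacks are 4GB (4-Hi)
# or 8GB (8-Hi), each stack has a 1024-bit interface tied to one
# ROP/memory controller partition, and GV100 has four stack sites.
STACK_CAPACITIES_GB = {"4-Hi": 4, "8-Hi": 8}
BITS_PER_STACK = 1024
MAX_STACKS = 4


def possible_configs(total_gb):
    """Return (stack_type, stack_count, bus_width_bits) combos that hit total_gb."""
    configs = []
    for name, capacity in STACK_CAPACITIES_GB.items():
        for count in range(1, MAX_STACKS + 1):
            if capacity * count == total_gb:
                configs.append((name, count, count * BITS_PER_STACK))
    return configs


for total in (12, 16, 32):
    print(total, "GB ->", possible_configs(total))
```

Under these assumptions, 32GB is only reachable with four 8-Hi stacks, which implies a full 4096-bit bus, while 12GB corresponds to three 4-Hi stacks on a 3072-bit bus.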

It’s the latter point in particular that has some potentially significant ramifications for NVIDIA’s limited edition Titan V SKU. The standard Titan V itself is a salvage part with only 3 ROP/MC partitions enabled; consequently it only has 3/4ths of the memory bandwidth, pixel throughput, and L2 cache of its fully-enabled sibling. This has helped to differentiate the relatively cheap Titan V from the more expensive Tesla V100, with NVIDIA being able to leverage the memory capacity and memory bandwidth differences to ensure their flagship card remains attractive.
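
To put numbers on the bandwidth gap, peak memory bandwidth is simply bus width times per-pin data rate, divided by 8 bits per byte. The following is a minimal sketch using the figures from the table; the function name is hypothetical, and the 1.7Gbps rate for the CEO Edition is an unconfirmed assumption carried over from the regular Titan V.

```python
def peak_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/sec: pins x per-pin Gbps / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


# Bus widths and data rates as listed in the specification table.
cards = {
    "Titan V (3 partitions)": (3072, 1.7),
    "Full bus at Titan V rate (assumed)": (4096, 1.7),
    "Tesla V100 PCIe": (4096, 1.75),
    "Titan Xp": (384, 11.4),
}

for name, (width, rate) in cards.items():
    print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/sec")

# Three of four partitions enabled means 3/4 of the bandwidth at the same rate.
ratio = peak_bandwidth_gbs(3072, 1.7) / peak_bandwidth_gbs(4096, 1.7)
print(f"Titan V vs. full bus: {ratio:.2f}")
```

One thing this arithmetic suggests: at 1.7Gbps a 4096-bit bus works out to roughly 870GB/sec, so a 900GB/sec rating would need something closer to the Tesla V100's 1.75Gbps memory clock.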

The end result is that the Titan V CEO Edition is not just a Titan V with more memory. In fact, memory capacity aside, thanks to these changes there will almost certainly be meaningful (though not necessarily large) performance differences between it and the regular Titan V in any kind of memory bandwidth-bound scenario. And from what I’ve heard from Titan V users over the past year, bandwidth-bound scenarios are more common than one might think, as the regular Titan V can fully saturate its memory bandwidth on compute alone and still come up short. Equally important, this means that at least on paper, there’s not much separating the new SKU from the 32GB Tesla V100 in terms of performance.

As an added wrinkle, of the handful of specifications that NVIDIA’s blog post does cover, they list the new card as offering 125 TFLOPS of tensor core performance, whereas the retail Titan V is 110 TFLOPS. It’s not clear how NVIDIA gets this number, but importantly, it means that there may be further clockspeed or SM configuration changes that have yet to be revealed by NVIDIA.
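
As a rough sanity check on what a 125 TFLOPS rating would require, peak tensor throughput can be worked backwards into an implied clockspeed. This is a sketch under stated assumptions rather than NVIDIA's own math: it assumes the tensor core count stays at 640, that each Volta tensor core performs 64 fused multiply-adds per clock (a figure from NVIDIA's Volta architecture material, not from the blog post), and that each FMA counts as 2 FLOPs. The function name `implied_clock_mhz` is made up for illustration.

```python
# Assumptions: 640 tensor cores, 64 FMAs per tensor core per clock (Volta),
# and 2 floating point operations per FMA.
TENSOR_CORES = 640
FMA_PER_CORE_PER_CLOCK = 64
FLOPS_PER_FMA = 2


def implied_clock_mhz(tensor_tflops, tensor_cores=TENSOR_CORES):
    """Clock in MHz needed to reach the given peak tensor TFLOPS rating."""
    flops_per_clock = tensor_cores * FMA_PER_CORE_PER_CLOCK * FLOPS_PER_FMA
    return tensor_tflops * 1e12 / flops_per_clock / 1e6


# Tensor ratings as listed in the specification table.
for name, tflops in [
    ("Titan V CEO Edition", 125),
    ("Titan V", 110),
    ("Tesla V100 (PCIe)", 112),
]:
    print(f"{name}: {tflops} TFLOPS -> ~{implied_clock_mhz(tflops):.0f} MHz")
```

With the core count held constant, 125 TFLOPS works out to a clock of a little over 1.5GHz, versus roughly 1.34GHz for the 110 TFLOPS rating. So either the rated clock is meaningfully higher, or the rating is derived in some way this simple model doesn't capture.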

In any case, for the time being the only way to get this unexpected Titan V SKU is to get one of the 20 winners from NVIDIA’s giveaway to part with one. So the immediate impact on NVIDIA’s business – or on potential Titan buyers – is negligible. However, given the fact that this is not just a Titan V with more memory, it does strike me as unusual that NVIDIA would produce a small batch of cards and then just stop, as someone just created a fair bit of extra work for NVIDIA’s driver and validation teams. So I wouldn’t be at all surprised if we see a similar SKU hit retail down the line, especially as the Titan V is the only remaining commercial GV100 product that doesn’t have a second, higher memory capacity configuration.

Source: NVIDIA (via SH SOTN)

38 Comments

  • PeachNCream - Thursday, June 21, 2018 - link

    Can't you see that people here are talking about something more important (leather packaging) and no one else cares about the number of TFLOPS some random computer part spits out? Shoo with your specifications! Shoo, I say!
  • Alexvrb - Thursday, June 21, 2018 - link

    They haven't actually published most of the specs so I think jabbadap's explanation makes the most sense... if the tensor number is accurate, the other compute numbers probably aren't.
  • jabbadap - Thursday, June 21, 2018 - link

    SXM2 version of V100 has 125TFlops of tensor power. So maybe it has same clocks and same tdp as sxm2 V100. So real specs would be 7.8Tflops fp64, 15.7Tflops fp32 and 31.4Tflops fp16. And thus gpu clocks(boost maybe) would be 1.533GHz.
  • Bulat Ziganshin - Thursday, June 21, 2018 - link

    125/4=31.25 and so on, OTOH 31.4*4=125.6 so it may just be a matter of rounding

    I just realized that 8 such GPUs have a nice 1 PFlop total speed, which is probably what they will market for their DGC-8.5 workstations
  • mode_13h - Friday, June 22, 2018 - link

    Because they're rated at 300 W, whereas the PCIe version is only 250 W.
  • Spunjji - Friday, June 22, 2018 - link

    That makes sense - set a 300W rating on the card and boom, more clock headroom. What's one more 8 pin connector among friends?
  • mode_13h - Friday, June 22, 2018 - link

    Yeah, the standard Titan V has a 6-pin + 8-pin and a 250 W rating.
  • peevee - Thursday, June 21, 2018 - link

    "The only realistic ways to get 125 TFLOPS is to increase clock speeds or increase amount of SMs"

    Not if they are memory-speed-limited in the previous design. Which might be, especially in the case of 8-bit encoded data where computation is essentially free (there is no need to actually add or multiply anything, just a direct read from a 64KB table with multiple read ports).
  • Bulat Ziganshin - Thursday, June 21, 2018 - link

    memory speed is a completely separate topic. these TFLOPS are ALWAYS computed from raw ALU power
  • Bulat Ziganshin - Thursday, June 21, 2018 - link

    and if you mean on-chip RAM (or rather register pool), it scales by frequency/amount with SMs, since it's part of SM itself
