Segmented Memory Allocation in Software

Having explained the hardware basis of segmented memory, we can now look at the role software plays and how it allocates memory between the two segments.

From a low-level perspective, video memory management under Windows is handled jointly by the operating system and the video drivers. Strictly speaking, Windows controls video memory management (this being one of the big changes of Windows Vista and the Windows Display Driver Model), while the video drivers have a significant amount of input in hinting at how things should be laid out.

Meanwhile, from an application’s perspective all video memory and its address space is virtual. This means that applications are writing to their own private space, blissfully unaware of what else is in video memory and where it may be, or for that matter where in memory (or even which memory) they are writing. As a result of this memory virtualization it falls to the OS and video drivers to decide where in physical VRAM to place memory requests, and for the GTX 970 in particular, whether to put a request in the 3.5GB segment, the 512MB segment, or, in the worst case, system memory over PCIe.


Virtual Address Space (Image Courtesy Dysprosia)

Without going quite so far as to rehash the entire theory of memory management and caching, the goal of memory management in the case of the GTX 970 is to allocate resources over the entire 4GB of VRAM such that high-priority items end up in the fast segment and low-priority items end up in the slow segment. To do this NVIDIA directs the first 3.5GB of memory allocations to the faster 3.5GB segment, and only turns to the 512MB segment for allocations beyond 3.5GB, as there’s no benefit to using the slower segment so long as there’s available space in the faster segment.
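The fast-first policy described above can be sketched as a toy model. This is only an illustration under stated assumptions, not NVIDIA's driver logic: the `Segment` class, the `place` function, and the 512MB request size are all hypothetical, and real allocation is handled by the OS and driver with far more nuance.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass
class Segment:
    """Hypothetical model of one physical memory segment."""
    name: str
    capacity: int  # bytes
    used: int = 0

    def free(self) -> int:
        return self.capacity - self.used


def place(size: int, segments: list) -> str:
    """Put a request in the first segment, in preference order, that has room."""
    for seg in segments:
        if seg.free() >= size:
            seg.used += size
            return seg.name
    # Neither VRAM segment has room: the worst case named in the text.
    return "system memory (PCIe)"


# Preference order: the fast segment first, the slow segment only afterwards.
segments = [Segment("fast 3.5GB", 3584 * MB), Segment("slow 512MB", 512 * MB)]

# Nine hypothetical 512MB requests: 4.5GB in total, more than the card holds.
placements = [place(512 * MB, segments) for _ in range(9)]
for i, where in enumerate(placements):
    print(f"request {i}: {where}")
```

In this model the first seven requests fill the fast segment exactly, the eighth lands in the slow segment, and the ninth falls through to system memory.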

The complex part of this process occurs once both memory segments are in use, at which point NVIDIA’s heuristics come into play to try to best determine which resources to allocate to which segments. How NVIDIA does this is very much a “secret sauce” scenario for the company, but from a high level, identifying the type of resource and when it was last used are good ways to figure out where to send a resource. Frame buffers, render targets, UAVs, and other intermediate buffers, for example, are the last things you want to send to the slow segment; meanwhile textures, resources not in active use (e.g. cached), and resources belonging to inactive applications would be great candidates to send off to the slower segment. From the way NVIDIA describes the process, we suspect there are even per-application optimizations in use, though NVIDIA can clearly handle generic cases as well.
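Since NVIDIA's actual heuristics are not public, the following is purely a hypothetical sketch of how "resource type plus time since last use" could drive placement. The rank values in `KIND_RANK`, the `Resource` fields, the greedy fill in `assign`, and the example sizes are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical sensitivity ranks: a higher rank means "keep in the fast segment".
KIND_RANK = {"frame_buffer": 3, "render_target": 3, "uav": 3, "texture": 1, "cached": 0}


@dataclass
class Resource:
    name: str
    kind: str
    size_mb: int
    last_used_frame: int
    app_active: bool = True


def priority(res: Resource, current_frame: int) -> tuple:
    """Rank by resource kind first, then by how recently the resource was used."""
    if not res.app_active:
        return (-1, 0)  # resources of inactive applications rank lowest
    age = current_frame - res.last_used_frame
    return (KIND_RANK.get(res.kind, 1), -age)


def assign(resources, current_frame, fast_mb=3584, slow_mb=512):
    """Greedy fill: highest-priority resources take the fast segment first."""
    placement = {}
    fast_free, slow_free = fast_mb, slow_mb
    ordered = sorted(resources, key=lambda r: priority(r, current_frame), reverse=True)
    for res in ordered:
        if res.size_mb <= fast_free:
            fast_free -= res.size_mb
            placement[res.name] = "fast"
        elif res.size_mb <= slow_free:
            slow_free -= res.size_mb
            placement[res.name] = "slow"
        else:
            placement[res.name] = "system"
    return placement


resources = [
    Resource("gbuffer", "render_target", 1024, last_used_frame=100),
    Resource("shadow_uav", "uav", 512, last_used_frame=100),
    Resource("textures_hot", "texture", 2048, last_used_frame=100),
    Resource("textures_stale", "texture", 256, last_used_frame=40),
    Resource("background_app", "texture", 256, last_used_frame=100, app_active=False),
]
placement = assign(resources, current_frame=100)
print(placement)
```

With these made-up sizes the render target, UAV, and recently used textures fill the fast segment, while the stale textures and the inactive application's resources end up in the slow segment.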

From an API perspective this applies to both graphics and compute, though it’s a safe bet that graphics is the more easily and accurately handled of the two thanks to the rigid nature of graphics rendering. Direct3D, OpenGL, CUDA, and OpenCL all see and have access to the full 4GB of memory available on the GTX 970, and from the perspective of the applications using these APIs the 4GB of memory is identical, the segments being abstracted. This is also why applications attempting to benchmark the memory in a piecemeal fashion will not find slow memory areas until the end of their run, as their earlier allocations will be in the fast segment and only finally spill over to the slow segment once the fast segment is full.

GeForce GTX 970 Addressable VRAM
API        Memory
Direct3D   4GB
OpenGL     4GB
CUDA       4GB
OpenCL     4GB
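To see why a piecemeal benchmark only encounters the slow segment at the end of its run, consider a small model of a tool that allocates VRAM in fixed-size chunks. The 128MB chunk size and the `segment_of_chunk` helper are assumptions chosen for illustration, and the sketch assumes the chunk size divides both segment sizes evenly and that nothing else occupies VRAM.

```python
CHUNK_MB = 128              # hypothetical benchmark chunk size
FAST_MB, SLOW_MB = 3584, 512


def segment_of_chunk(index: int, chunk_mb: int = CHUNK_MB) -> str:
    """Segment a chunk lands in when chunks are allocated in order, fast segment first."""
    end = (index + 1) * chunk_mb
    if end <= FAST_MB:
        return "fast"
    if end <= FAST_MB + SLOW_MB:
        return "slow"
    return "system"


# Allocate the full 4GB in order and record where each chunk ends up.
chunks = (FAST_MB + SLOW_MB) // CHUNK_MB
layout = [segment_of_chunk(i) for i in range(chunks)]
first_slow = layout.index("slow")
print(f"{chunks} chunks, first slow chunk at index {first_slow}")
```

Under these assumptions every chunk before the 3.5GB mark is served from the fast segment, and only the final 512MB worth of chunks reports slow memory.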

The one remaining unknown element here (and something NVIDIA is still investigating) is why some users have been seeing total VRAM allocation top out at 3.5GB on a GTX 970, but go to 4GB on a GTX 980. Again from a high-level perspective all of this segmentation is abstracted, so games should not be aware of what’s going on under the hood.

Overall then the role of software in memory allocation is relatively straightforward since it’s layered on top of the segments. Applications have access to the full 4GB, and due to the fact that application memory space is virtualized the existence and usage of the memory segments is abstracted from the application, with the physical memory allocation handled by the OS and driver. Only after 3.5GB is requested – enough to fill the entire 3.5GB segment – does the 512MB segment get used, at which point NVIDIA attempts to place the least sensitive/important data in the slower segment.


398 Comments


  • Kutark - Tuesday, January 27, 2015 - link

    b/c consumers are dumb and if the 970 had 1mb less ram than 4gb it would have decreased sales. There is a reason they moved away from stuff like having 1.2gb or 1.5gb, etc etc. People like big solid numbers.
  • HisDivineOrder - Tuesday, January 27, 2015 - link

    You give lawyers so much credit. Often, lawyers like being one to start such things and don't care much if they manage to finish it.
  • maximumGPU - Wednesday, January 28, 2015 - link

    I'd say both Jarred. Sure, i look at performance first, but performance metrics tell me how good the card is NOW. The next thing i do is look at the specs and try and estimate how future proof my purchase will be.
    A 3GB 970 would show great metrics at 1080p, but i wouldn't buy it because i know ram is ever more important thanks to the consoles catching up.
    Since i game at 1440p, 4GB was my minimum ram threshold. i thought i got that with the 970, but instead got 3.5 + 0.5GB of slow ram. That makes my card less future proof than i thought and could've well affected my buying decision, regardless of its current performance metrics.
  • Ranger101 - Tuesday, January 27, 2015 - link

    No surprises as to Nvidia's behaviour, as a company they are of course a rapacious juggernaut, but Tut Tut Anandtech, what would the great founder have to say?

    Having read many recent articles in the GPU section, I am mostly impressed by the high quality of writing, however those who read between the lines of Mr Smith's Gpu reviews, realise that appearances of impartiality in his writing are misleading and that he is in fact a staunch and unrelenting supporter of camp green. ( Everyone is of course biased, it's just less appropriate to let it shine through in technical website reviews.)

    It should therefore come as no surprise that in his initial review these issues "escaped" his attention, despite the fact that "a limited number of flags were raised" and that in the follow up article, he unashamedly wields the Bastard sword of Nvidia. LOL.

    You must remember these things happen for a reason Ryan and I once again encourage you to temper your bias in forthcoming utterances...AMD still make good cards and a little competition is good....right? :)
  • just4U - Tuesday, January 27, 2015 - link

    I think the fact that Ryan gets accused of being in favor of AMD and Nvidia means that he's doing a pretty good job of not really being in either camp. If anything I'd simply suggest his expectations on performance are limited and when the cards actually do better.. he tends to point that out. Not really a bad thing considering how underwhelming hardware leaps are these days in most segments. Smaller jumps not the leaps and bounds we were all once used to.
  • OrphanageExplosion - Tuesday, January 27, 2015 - link

    Oh do behave. When Anandtech had the AMD News Center sponsorship all we ever heard from the commentariat was that the site, and Ryan specifically, were AMD biased. I think we all know where the bias is on Anandtech - and it's in the comments, not the editorial.
  • HisDivineOrder - Tuesday, January 27, 2015 - link

    You're talking about the same guy that just took it on AMD's word that Mantle was going to be "virtually identical" to the same low level access API as the Xbox One and that subsequently Mantle was AMD bringing the Xbox One's low level access language to PC gaming.

    Seriously. If the guy is biased toward anything, he's biased toward believing more of AMD's statements than he really ought to, but I've had a hard time really blaming him since AMD had JUST paid for him (and his buddy journalists) to go to Hawaii on a beach trip and vacation under the excuse that it was to present the GPU part called "Hawaii." I mean, if I was taken to Hawaii, I'd probably be willing to believe anything they told me, too.

    Still, don't mistake the man for an nVidia fanboy. He's clearly not. Lots of other people questioned that AMD party line far more than Anandtech did back in the day and it took a long time before they acknowledged that AMD had hoodwinked them and they never REALLY admitted it wholeheartedly.

    Because AMD suggesting that Mantle was anything but a completely proprietary and locked-in API was a lie and no hardware company has yet to sign up in spite of the fact Intel tried very hard to research the subject and was rebuffed by AMD for months.

    Intel likes to do anything they can do for free and they read all the press (like Ryan's) that suggested Mantle was going to be free and freely available, but as it turned out, that was more hyperbole on the part of AMD.

    Yet I saw nothing of that on Anandtech. No, I don't think there's much evidence of his being "a staunch and unrelenting supporter of camp green" unless you're recalling the heady days of AMD's time as a "green" company.
  • Gothmoth - Tuesday, January 27, 2015 - link

    as if you read or even UNDERSTAND what ROP´s mean before you buy a card.....

    you and all the others are just trolls who have too much time on their hands....
  • Kutark - Tuesday, January 27, 2015 - link

    I honestly don't understand why people are so up in arms over this. At the end of the day the performance figures still stand. The situations in which this news could actually arise and cause any problems are so limited its not even funny. At the resolutions and settings most games operate at don't use anywhere close to 4gb of vram.

    Honestly if i didn't have SLI'd 760's i'd go out and buy a 970 tonight, regardless of any of this information.

    That being said, this is another article that proves why anandtech is easily the best tech website out there. Thorough and honest, unbiased, just, amazing, love it. Sorry for all the commas.
  • Kutark - Tuesday, January 27, 2015 - link

    Meant to say gamers, not games. Regardless.
