Segmented Memory Allocation in Software

Having explained the hardware basis of segmented memory, we can now turn to the role software plays and how it allocates memory between the two segments.

From a low-level perspective, video memory management under Windows is the joint domain of the operating system and the video drivers. Strictly speaking, Windows controls video memory management (this being one of the big changes of Windows Vista and the Windows Display Driver Model), while the video drivers get a significant amount of input in hinting at how things should be laid out.

Meanwhile from an application’s perspective all video memory and its address space is virtual. This means that applications are writing to their own private space, blissfully unaware of what else is in video memory and where it may be, or for that matter where in memory (or even which memory) they are writing. As a result of this memory virtualization it falls to the OS and video drivers to decide where in physical VRAM to allocate memory requests, and for the GTX 970 in particular, whether to put a request in the 3.5GB segment, the 512MB segment, or in the worst case scenario system memory over PCIe.


Virtual Address Space (Image Courtesy Dysprosia)

Without going so far as to rehash the entire theory of memory management and caching, the goal of memory management in the case of the GTX 970 is to allocate resources over the entire 4GB of VRAM such that high-priority items end up in the fast segment and low-priority items end up in the slow segment. To do this NVIDIA directs the first 3.5GB of memory allocations to the faster 3.5GB segment, and only turns to the 512MB segment for allocations beyond 3.5GB, as there’s no benefit to using the slower segment so long as there’s available space in the faster segment.
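To make the fast-first policy concrete, here is a minimal Python sketch of it. This is our own simplified model, not NVIDIA's driver logic: the function name `place_requests` is hypothetical, requests are placed strictly in arrival order, and we assume a single request is never split across segments.

```python
# Hypothetical model of fast-first placement; segment sizes (in MB) are from the article.
FAST_MB = 3584   # the 3.5GB segment
SLOW_MB = 512    # the 512MB segment


def place_requests(request_sizes_mb):
    """Assign each request to 'fast', 'slow', or 'system' in arrival order."""
    free = {"fast": FAST_MB, "slow": SLOW_MB}
    placements = []
    for size in request_sizes_mb:
        # Try the fast segment first, then the slow one.
        for segment in ("fast", "slow"):
            if size <= free[segment]:
                free[segment] -= size
                placements.append(segment)
                break
        else:
            # Neither segment has room: worst case, system memory over PCIe.
            placements.append("system")
    return placements


print(place_requests([2048, 1024, 512, 256, 256, 256]))
# ['fast', 'fast', 'fast', 'slow', 'slow', 'system']
```

In this toy run the first three requests exactly fill the 3.5GB segment, the next two fill the 512MB segment, and the last one has nowhere left to go but system memory.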

The complex part of this process occurs once both memory segments are in use, at which point NVIDIA’s heuristics come into play to try to best determine which resources to allocate to which segments. How NVIDIA does this is very much a “secret sauce” scenario for the company, but from a high level, identifying the type of resource and when it was last used are good ways to figure out where to send a resource. Frame buffers, render targets, UAVs, and other intermediate buffers, for example, are the last things you want to send to the slow segment; meanwhile textures, resources not in active use (e.g. cached), and resources belonging to inactive applications would be great candidates to send off to the slower segment. From the way NVIDIA describes the process, we suspect there are even per-application optimizations in use, though NVIDIA can clearly handle generic cases as well.
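Since NVIDIA's actual heuristics are not public, the following Python sketch is purely illustrative of the two signals mentioned above: resource type and time since last use. Every name here (`Resource`, `slow_segment_score`, `pick_for_slow_segment`) and every weight is our own assumption, not anything NVIDIA has described.

```python
from dataclasses import dataclass

# Hypothetical classes of latency-sensitive resources, loosely following the
# examples in the text. The real driver heuristics are not public.
LATENCY_SENSITIVE = {"frame_buffer", "render_target", "uav"}


@dataclass
class Resource:
    name: str
    kind: str               # e.g. "frame_buffer", "texture"
    size_mb: int
    frames_since_use: int   # how long ago the resource was last touched
    app_active: bool = True


def slow_segment_score(res):
    """Higher score = better candidate for the slow segment (made-up weights)."""
    if res.kind in LATENCY_SENSITIVE:
        return -1           # never a candidate in this sketch
    score = res.frames_since_use
    if not res.app_active:
        score += 1000       # resources of inactive applications go first
    return score


def pick_for_slow_segment(resources, slow_capacity_mb=512):
    """Greedily choose the best-scoring candidates that fit in the slow segment."""
    chosen, used = [], 0
    candidates = [r for r in resources if slow_segment_score(r) >= 0]
    for res in sorted(candidates, key=slow_segment_score, reverse=True):
        if used + res.size_mb <= slow_capacity_mb:
            chosen.append(res.name)
            used += res.size_mb
    return chosen


resources = [
    Resource("backbuffer", "frame_buffer", 32, 0),
    Resource("shadow_map", "render_target", 64, 0),
    Resource("terrain_tex", "texture", 256, 1),
    Resource("cached_tex", "texture", 200, 120),
    Resource("browser_surface", "texture", 100, 5, app_active=False),
]
print(pick_for_slow_segment(resources))
# ['browser_surface', 'cached_tex']
```

The recently used terrain texture stays in the fast segment here only because it no longer fits in the remaining 512MB; a real driver would presumably weigh many more factors than this.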

From an API perspective this applies to both graphics and compute, though it’s a safe bet that graphics is the more easily and accurately handled of the two thanks to the rigid nature of graphics rendering. Direct3D, OpenGL, CUDA, and OpenCL all see and have access to the full 4GB of memory available on the GTX 970, and from the perspective of the applications using these APIs the 4GB of memory is identical, the segments being abstracted. This is also why applications attempting to benchmark the memory in a piecemeal fashion will not find slow memory areas until the end of their run: their earlier allocations land in the fast segment and only spill over to the slow segment once the fast segment is full.

GeForce GTX 970 Addressable VRAM
API         Memory
Direct3D    4GB
OpenGL      4GB
CUDA        4GB
OpenCL      4GB
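The piecemeal benchmarking behavior described above can be illustrated with a little arithmetic. The Python sketch below assumes a benchmark that allocates the full 4GB in sequential 128MB chunks (our choice of chunk size), that nothing else occupies VRAM, and that placement is strictly fast-first; the function name `chunk_segments` is hypothetical.

```python
FAST_MB, SLOW_MB = 3584, 512   # segment sizes from the article


def chunk_segments(chunk_mb=128, total_mb=FAST_MB + SLOW_MB):
    """Return the segment each sequential chunk would land in (fast-first model)."""
    segments = []
    for start in range(0, total_mb, chunk_mb):
        end = start + chunk_mb
        if end <= FAST_MB:
            segments.append("fast")
        elif start >= FAST_MB:
            segments.append("slow")
        else:
            segments.append("mixed")   # chunk straddles the segment boundary
    return segments


segments = chunk_segments()
first_slow = segments.index("slow")
print(len(segments), first_slow)
# 32 28
```

Under these assumptions the first 28 of 32 chunks sit entirely in the fast segment, so a benchmark walking through memory in order would only see a slowdown in its last 4 chunks.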

The one remaining unknown element here (and something NVIDIA is still investigating) is why some users have been seeing total VRAM allocation top out at 3.5GB on a GTX 970, but go to 4GB on a GTX 980. Again from a high-level perspective all of this segmentation is abstracted, so games should not be aware of what’s going on under the hood.

Overall then, the role of software in memory allocation is relatively straightforward since it’s layered on top of the segments. Applications have access to the full 4GB, and because application memory space is virtualized, the existence and usage of the memory segments is abstracted from the application, with the physical memory allocation handled by the OS and driver. Only after 3.5GB is requested (enough to fill the entire 3.5GB segment) does the 512MB segment get used, at which point NVIDIA attempts to place the least sensitive/important data in the slower segment.


398 Comments


  • HisDivineOrder - Tuesday, January 27, 2015 - link

    I think the theory laid out here for why nVidia would be a fool to lie assumes the lie was out the gate intended to be a lie OR that they could have just been the victim of a terrible mixup. I think the answer is somewhere in between.

    I think the far more likely scenario is they did not set out to lie to the press, but when the mixup happened and they discovered it (almost right away), they realized that they could wait a few months and let the thing play out through the holiday season. They would make a ton of sales, they could focus the press entirely on the performance given rather than the specs and when the truth was discovered they could shrug it off as unimportant because really performance was all that mattered. Not specs.

    The fact that they knew for months would mean little because ultimately the performance and benchmarks would still be (mostly) applicable and people who bought in got exactly what they were promised even if they didn't know to ask the precise question that would have illustrated greater weaknesses than they expected in the long run.

    So the deception carries on for months and then when pressed about it, delaying talking about it for a month (Dec-Jan, big sales month), they admit it after all the sales and virtually all the return periods are up. Then they shrug and say, "But the performance is the same anyway, so hey."

    That's the way they went. Imagine if they had not. Imagine instead if they had announced it as soon as they realized it after the initial reviews went out. Suddenly, the big story is not the amazing performance of the card, the value of the card compared to AMD's pricing at the time, or the percentage of performance you get compared to the nVidia high end. The story is how the press were misled and had to change the specs. The story becomes what it is now, except without all the sales in front of it.

    Suddenly, the 970 has a stink of failure on it and people avoid it even though the performance is just as good as it seems. "nVidia tried to pull a fast one," people would say (like they are now). Except BEFORE all those sales happened. Now, the card won't sell and all because of a mixup in the marketing department. Now nVidia's got the stink of fail on them from being brave and admitting what they'd done by mistake, leading to story after story of how nVidia mistakenly mislabeled the card's technical specs.

    Tanking sales through the holiday season by a decent margin and costing nVidia tons of money.

    That's the lie, people. The lie is not the mixup as though they don't happen. They absolutely happen. The lie is nVidia not knowing almost immediately they'd mixed things up. You know they did. And unlike the writer of this article, I see a clear and easy motive for why they'd continue the lie. They wanted to stall and shrug and gesture and act like they were figuring out what happened right up until the cards they'd sold between November and December were all universally securely at home in buyer's possession.

    Once the holiday return periods were up and once the cards were mostly bought as much as they were going to be in the mad rush, that's when they fess up.

    It's the old adage: It's easier to be forgiven than ask permission.

    There's your motive for deceit. I'm not saying it's right. I'm just saying that's the motive and that's why they did it and that's the timeline for how they did it. The sad part is the article here is not wrong that if nVidia had made no mistake in the first place, the story would have been squarely on how great a value the 970 was.

    But after the mistake, nVidia had the choice of fessing up and losing a ton of sales to bad press surrounding a non-issue or stall for a few months until purchases were settled and unreturnable (mostly), then fess up instead and grin and say, "Whoops."
  • SunnyNW - Tuesday, January 27, 2015 - link

    Except the people in the forums are the ones that brought this up, not nvidia on their own... Just so many cards had been sold and a larger percentage of people started noticing issues. But of course they knew, I agree, not initially but pretty soon after (within days for sure). Just everything played out (time-wise) as best as it could for nvidia, considering the circumstances.
    The issue here is The Performance of the card contrary to what most keep saying, the performance of the memory. The card simply does not act the same way as a "traditional" 4GB would. Yes the extra .5GB is better than system memory but that does not change the latter fact.
  • Expressionistix - Tuesday, January 27, 2015 - link

    Most of the people buying these things just use them to play video games on the computer - does anyone really care?
  • R. Hunt - Wednesday, January 28, 2015 - link

    Gamers pay good money for these things, so I don't see why not.
  • nos024 - Tuesday, January 27, 2015 - link

    wow...as if knowing this info changes all the benchmarks. i am more disappointed with the 128bit memory bus on 960gtx.
  • nos024 - Tuesday, January 27, 2015 - link

    Oh and i bought a brand spanking new 970gtx today despite after reading this article. Msi version.
  • Dr.Neale - Wednesday, January 28, 2015 - link

    Under the circumstances, I strongly believe that NVidia should be forced to accept the return of any 970 the customer no longer wants to own, on the grounds that it does NOT MEET THE PUBLISHED SPECIFICATIONS and is therefore DEFECTIVE in that it was NOT AS DESCRIBED.

    For example, AMAZON has exactly this policy, giving the customer (at least) 90 days to return any such product sold through Amazon Marketplace, for a full refund of all costs.

    Now that NVidia has admitted that the original published specs are NOT MET by EVERY SINGLE 970 card, they would have no way to deny any customer claim.

    I believe that Consumer Protection Laws would also dictate that a full refund must be issued within a reasonable time after the defect is "found".

    So, to those who are unhappy with their 970 purchase, use this as a means to get a full refund, and buy something else instead.

    To those who aren't willing to give up their wonderful 970, simply accept the fact that this memory defect is the main reason the 970 is so much cheaper than the defect-free 980, and move on.

    I further believe it would be in NVidia's long-term interests to facilitate the return of any unwanted cards, and to offer some freebie to compensate those willing to keep their 970 cards, despite the defect.

    Anything less is unacceptable.
  • GGlover - Wednesday, January 28, 2015 - link

    Early adopter here. I paid for 2 970's over 1 980 because I was led to believe that the specs were extremely close and that the 2 970's were slightly cheaper than a single 980. I had believed that they would perform better than a single 980 (extra ram etc.). I would have probably gotten a 980 had I known that there was in fact a much larger difference in specs. Real world performance or not. The numbers weren't really in at that time. So I was misled by a bait and switch.
  • Oxford Guy - Thursday, January 29, 2015 - link

    SLI is definitely the biggest problem Nvidia is facing.

    This article's author said he couldn't think of a reason why Nvidia would benefit from misleading consumers, but SLI purchasing decisions are heavily influenced by the VRAM amount on a card. Having the 980 be ostensibly the same in terms of VRAM was a very significant factor as well as the claimed amount for the 970 by itself.
  • jbluzb - Wednesday, January 28, 2015 - link

    I do not like their unlawful business practice of false advertising. They waited after the Christmas season is over before acknowledging that there was indeed a problem in the reported specs.

    That is what really turned me off from the company. This will be last NVIDIA card that I will ever buy because I do not want to support a company who does such things to its customer.

    Also, it made me wary of review websites. It is a big eye opener for me: they are just different websites handled by a marketing team. They cannot talk negatively about a company because they are a major sponsor. There is no such thing as truth in journalism. :(
