Back in 2014, NVIDIA released the GeForce 900 series of cards. One of them, the GTX 970, received fairly positive reviews at launch. But then someone noticed something odd: if you push the VRAM usage past 3.5GB, performance starts to suffer. The card was advertised as having 4GB, and you could verify that it had 4GB based on the memory chips that were installed.
This caused a massive controversy that demanded a response from NVIDIA. And they did respond. Their explanation, in a nutshell, was that there was some miscommunication between the engineering team and marketing. On the GTX 980, there are four main memory controller blocks, each containing two 32-bit channels for accessing the VRAM chips and two 256KiB blocks of L2 cache. On the GTX 970, however, one of these L2 cache blocks is marked as faulty and thus fused off. This left one of the 32-bit channels sharing an L2 block with its neighbor, which meant that parallel operations could not happen if the GPU was trying to access data in the VRAM chips behind those two channels.
Needless to say, the public reaction was largely negative. People felt they had bought a 3.5GB card, not the 4GB card advertised on the box. NVIDIA promised a driver fix to ensure that if memory in that upper region was accessed, it would be done in a way that minimized the performance impact; as far as I can tell, this fix likely never happened. The backlash was enough to earn NVIDIA a class action lawsuit, which eventually settled with everyone affected getting something like $25.
Its legacy lives on, however, as a 3.5GB card, and that reputation, along with the decision NVIDIA made, are the points I want to touch on.
About the decision to partition the VRAM
This is what I like to call an “engineering decision.” An engineering decision figures out the options available and weighs the pros and cons of each based on the engineering trifecta: cost, performance, and reliability. In this case, though, it was likely a balance between cost and performance.
Note that in the Maxwell 2 GPU of the GeForce 900 series, each memory controller has two 32-bit channels and two 256KiB blocks of L2 cache. The issue was mostly with the L2 cache (memory of any kind is a big eater of die area), so NVIDIA seemed to have two options:
- Mark the entire memory controller as faulty if any of its L2 cache blocks is faulty. This would reduce not only the VRAM down to 3GB, but also the cache, which could’ve impacted the card’s performance even more than just running out of VRAM.
- Amusingly, Wikipedia’s list of NVIDIA GPUs includes a GTX 960 for OEMs with 3GB that uses the GM204 GPU, the same chip as the 970 and 980.
- Require the memory controller to be fully functional, discarding any die with a faulty L2 block. But this would lower the yield of usable GPUs.
So NVIDIA could either accept even worse performance or a lower yield. Instead, they went with a middle ground.
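To make the trade-off concrete, here’s a toy model of how fusing off one L2 slice splits the 4GB into a fast and a slow segment. The capacities match the figures above; the allocator function is my own hypothetical sketch, not NVIDIA’s actual driver logic.

```python
# Toy model of the GTX 970's memory partitioning. Capacities follow the
# article; the allocator below is a hypothetical sketch, not NVIDIA's
# real driver behavior.

CHIP_MB = 512   # eight 512MB GDDR5 chips -> 4GB total
CHANNELS = 8    # eight 32-bit channels across four controllers

# One 256KiB L2 slice is fused off, so one channel must share a neighbor's
# slice. The seven channels with dedicated L2 form the fast segment; the
# remaining channel forms the slow segment that can't be accessed in
# parallel with its partner.
FAST_MB = 7 * CHIP_MB   # 3584 MB = 3.5GB
SLOW_MB = 1 * CHIP_MB   # 512 MB  = 0.5GB

def place_allocation(size_mb: int, fast_used_mb: int) -> str:
    """Prefer the fast segment; spill to the slow one only when full."""
    if fast_used_mb + size_mb <= FAST_MB:
        return "fast"
    return "slow"

print(FAST_MB + SLOW_MB)             # 4096 -> the full 4GB is addressable
print(place_allocation(1024, 3000))  # "slow": the fast segment is nearly full
```

The point of the sketch is the last line: all 4GB is addressable, but once usage passes 3.5GB, further allocations land in the segment that contends for a shared L2 slice.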
About the amount of memory on the card
While much of the tech community has resigned itself to saying that the GTX 970 has 3.5GB of VRAM no matter how you slice it, I’m in the camp that doesn’t believe that. It does have 4GB of VRAM available, on the simple notion that it’s all accessible. Does improper use of it hamper performance? Sure. But as far as it stands in the memory hierarchy, there’s 4GB accessible.
Something that supports this idea is that on Intel systems with multi-channel memory, the RAM sticks in a channel actually aren’t required to be the same size to run in a multi-channel mode. Dubbed Flex Mode, it interleaves memory up to the capacity of the smaller stick across both channels, while the remainder is accessed in single-channel mode. So in a 4GB + 8GB setup, the 4GB stick and 4GB of the 8GB stick run in dual-channel, while the remaining 4GB runs in single-channel mode. If we claim the GTX 970 only has 3.5GB of VRAM because that’s the faster portion, would it be right to say the 4GB + 8GB Flex Mode setup only has 8GB of RAM?
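The Flex Mode arithmetic above can be sketched as a small function. The function name and interface are mine, purely for illustration, not any real firmware logic:

```python
# Toy calculation of Intel Flex Mode's capacity split for a two-stick setup.
# Illustrative only; not actual memory-controller firmware behavior.

def flex_mode_split(stick_a_gb: int, stick_b_gb: int) -> tuple[int, int]:
    """Return (dual_channel_gb, single_channel_gb)."""
    # Both sticks interleave up to the capacity of the smaller one...
    dual = 2 * min(stick_a_gb, stick_b_gb)
    # ...and the larger stick's remainder runs single-channel.
    single = abs(stick_a_gb - stick_b_gb)
    return dual, single

dual, single = flex_mode_split(4, 8)
print(f"{dual}GB dual-channel + {single}GB single-channel = {dual + single}GB")
# -> 8GB dual-channel + 4GB single-channel = 12GB
```

Nobody describes that machine as having 8GB of RAM; the slower 4GB still counts toward the total.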
To push this to absurd levels, let’s look at storage. I have a 250GB NVMe SSD, a 1TB SATA SSD, and a 1TB HDD. All of these perform at different levels. Would it be proper to claim my computer’s storage capacity is only 250GB since that’s the fastest drive?
Ultimately, did it matter?
If anything, I feel that in hindsight, the only lesson from this controversy was that NVIDIA should’ve made sure marketing was clear about such things. That, or never attempt such compromises again. Such compromises can confuse people: seeing something like “3.5GB + 0.5GB” doesn’t make sense at first glance (“wouldn’t it just be 4GB?”). But at the same time, this limits what NVIDIA can do in the future. Manufacturing can be a fickle mistress, and parts don’t always come out the way they want. Though maybe this explains the oddball cards that are only released to OEMs, like the GTX 1060 5GB.
There’s also the question of VRAM utilization in the first place. Were games back then using anywhere close to 3.5GB of VRAM in the GTX 970’s performance category? And even if they were, and the drivers capped usage at 3.5GB, what was the impact? A handful of games, notably Final Fantasy XV and recent Call of Duty titles with a “Fill available VRAM” option, will happily gobble up whatever VRAM you have available, yet performance doesn’t seem to be affected. On top of this, the launch reviews of the GTX 970 knew nothing about the partitioning, and as far as I recall, nothing strange was reported. So for the games of the time, the GTX 970 was adequate.
And while people would like to buy a part for the future, the future is unpredictable. Games may not head in the direction you expect. You may not even play the same types of games. It may well be that GTX 970s lasted quite a while in the hands of owners who, in the end, probably enjoyed their small settlement check and thought nothing of the VRAM issue until their next upgrade.