Thursday 27 August 2015

The Future of the GPU

It could be the perfect time to upgrade your current graphics card, but what do you need to prepare for the GPU future?

The graphics card is the component most responsible for PC gaming performance, above everything else in your PC. You could have the most powerful, £800 octo-core Haswell-E CPU in your rig, but if you’ve got a weedy GPU backing it up, you’re never going to be hitting the graphical heights that today’s PC games deserve.


And it’s a great time to buy a new graphics card right now. Both the leading lights of GPU development – Nvidia and AMD – have finally released their graphics card lineups for this year, with high-end, ultra-enthusiast GPUs and super-efficient, lower-end offerings. And by the way, for ultra-enthusiast, read: eye-wateringly ‘wtf-how-much?!’ expensive.

While Nvidia has had it pretty much all its own way since the GTX 980 was released almost a year ago, AMD has finally responded with a slew of new – and some not so new – GPUs to try and put it back in the game. Correspondingly, Nvidia has updated its range and dropped the prices here and there. Who wins? We all do, of course. You can now go and bag yourself a quality, high-end graphics card for some pretty reasonable sums of money. Which is why this month we’ve got them all into one room for a GPU battle royale.

If this is the state of play in the graphics card market right now though, what does the future hold for this pedigree racehorse of components? Are we likely to have genuinely affordable, genuinely capable GPUs delivering the 4K feels on the next generation of high-resolution gaming monitors? And is the end nigh for the classic peripheral component interconnect express?

Both Nvidia and AMD are set for big new GPU architectures on incredibly tiny production processes in the next year, having both missed out on the bonanza that 20nm lithography was meant to offer. It’s set to be a very intriguing time for the not-so-humble GPU then, and with the rise in screen resolution and the burgeoning VR industry’s insatiable thirst for GPU power, it needs to be. Let’s do some digging and see if we can figure out what’s going on…

Before we go too far into a future filled with high-bandwidth memory (HBM), new component interconnects and new GPU architectures, there are still a few holes to be plugged in AMD and Nvidia’s respective graphics card lineups.

Scheduled to arrive very soon, possibly by the time you read this, is Nvidia’s replacement for the GTX 750 Ti – inevitably named the GTX 950 Ti. The GTX 750 Ti was the first Maxwell-powered graphics card and it makes sense for it to now be refreshed with new silicon. It’s more than likely to be sporting a slightly cropped version of the GM206 GPU found in the current GTX 960.

Like the GTX 750 Ti before it, the GTX 950 Ti (probably set to release alongside the GTX 950) should offer impressive levels of power efficiency combined with decent 1080p gaming performance, too.

To counter it, AMD is looking to try and spoil the low-end GPU party with its own Radeon 370X, a Trinidad GPU-powered card aiming squarely at the same price point as the GTX 950 Ti. Trinidad is essentially the same Pitcairn GPU that AMD used in the R9 270X, and it will be interesting to see who comes out on top in the battle at the bottom of the market.

There are also rumours that AMD is hard at work putting together a full range of HBM-equipped graphics cards to follow the Fiji model used in the Fury cards. Whether that will come as part of an interim refresh of the current chips isn’t known, but it seems unlikely; we expect the current lineup to last until the next AMD GPU architecture drops next year.

BUT WHAT’S NEXT?


With the Maxwell GPU architecture having been around for a good long while now – since early 2014 with the GTX 750 Ti and from last September with the full-fat GTX 980 cores – it’s time to start thinking about what’s coming next.

The next generation of graphics cards from both Nvidia and AMD is going to see a major cut in the production process. This is the big news from the next round of GPU architecture updates, and also the reason for this current generation being something slightly different to what we originally expected.

When the two companies first started talking about their Maxwell and Pirate Islands GPU ranges, it was largely expected that these would be the first chips to tip up rocking the new 20nm production process. And it wasn’t just us expecting that either – both the GPU makers thought they’d be making the move.

However, the 20nm process turned out to be a nightmare for the silicon makers: they couldn’t produce chips at a consistent yield without losing a bunch to defective parts in the baking process. That made the whole 20nm lithography seriously expensive. Add to that the fact that it wasn’t actually delivering much in the way of performance or efficiency gains, and it’s unsurprising that the switch wasn’t deemed worth it.

So Nvidia and AMD have been stuck on the existing 28nm process for at least one generation longer than either really expected. Nvidia, however, seemed to see the writing on the wall, and, with the already-efficient Maxwell architecture, it was still able to deliver improved GPUs. AMD, on the other hand, has stuck with its existing architecture and simply piled more and more silicon into the design to boost performance.

But the new 2016 GPU architectures from AMD and Nvidia won’t be on the 20nm process either. That ship has sailed and now we’re expecting both companies to move their chip production process to the new 16nm FinFET (similar to Intel’s Tri-Gate 3D transistors) lithography. This will allow far more transistors to be packed into the same, or smaller, die size and yield greater efficiency gains, too.

BLAISE OF GLORY


On the Nvidia side, we’re looking at an architecture called Pascal – named after the French physicist Blaise Pascal – and the rumour is that the successor to the full-fat GM200 GPU could have as much as double the transistor count. That would give it somewhere upwards of 16 billion transistors. That figure needs to be read in your head one more time in Carl Sagan’s wondrous tones.

The Pascal GPU will be the first of Nvidia’s cards to offer 3D memory and is set to use the second-generation HBM 2.0 to achieve the purported 32GB maximum frame buffer. One of the struggles with the current HBM tech used on AMD’s Fiji cards is that it has a limit of 2Gb per DRAM die and four dies per stack, making a maximum of 1GB in a stack, and only four memory stacks per GPU interposer. That’s why the Fury cards have a slightly miserly, though speedy, 4GB frame buffer.

HBM 2.0, though, is designed to massively raise that ceiling, with a limit of 8Gb per die and stacks offering either four or eight dies piled on top of each other. That will give each stack a maximum of either 4GB or 8GB in capacity. With four of those HBM 2.0 stacks arrayed on the interposer around the GPU itself, you’re looking at either 16GB or 32GB frame buffers, depending on the SKU.
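If you want to sanity-check that stack maths, here’s a minimal Python sketch built purely from the per-die and per-stack figures quoted above – the function names are ours, for illustration only, not anything from an official HBM spec.

```python
# Rough frame-buffer arithmetic for first-gen HBM vs HBM 2.0, using the
# per-die and per-stack limits quoted in the text (reported figures, not specs).

def stack_capacity_gb(gbit_per_die: int, dies_per_stack: int) -> float:
    """Capacity of one HBM stack in gigabytes (8 gigabits = 1 gigabyte)."""
    return gbit_per_die * dies_per_stack / 8

def frame_buffer_gb(gbit_per_die: int, dies_per_stack: int, stacks: int = 4) -> float:
    """Total frame buffer with the given number of stacks on the interposer."""
    return stack_capacity_gb(gbit_per_die, dies_per_stack) * stacks

# First-gen HBM as used on Fiji: 2Gb dies, four per stack, four stacks.
print(frame_buffer_gb(2, 4))   # 4.0 GB - the Fury cards' frame buffer

# HBM 2.0: 8Gb dies, four or eight per stack, four stacks.
print(frame_buffer_gb(8, 4))   # 16.0 GB
print(frame_buffer_gb(8, 8))   # 32.0 GB - the purported Pascal maximum
```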

Pascal is looking to unify its memory, too, making it available to both CPU and GPU. In traditional interfaces, that would introduce latency issues across the PCIe connection when communicating between CPU and GPU. But with Pascal, Nvidia is introducing a new interface called NVLink. On our PCs, however, NVLink-proper looks a while off (see “Is NVLink the end for PCIe?” for more on that).

AMD ADVANCES


AMD’s Arctic Islands architecture – also due in 2016 – could be its first genuinely new GPU architecture since the inception of the Graphics Core Next design at the beginning of 2012. AMD has already talked about a doubling of the performance-per-watt efficiency of its high-performance range of GPUs.

It’s unlikely to be too radical a departure from the current GCN architecture though, especially given that mixing a new production process with a brand new architecture can be a recipe for disaster. Though that is also the route Nvidia is taking with Pascal…

What we do know is that the successor to the top Fiji GPU of today will have the Greenland codename and will sport the same second-gen memory architecture as the Nvidia cards – HBM 2.0. That will mean huge potential frame buffers all round. The Arctic Islands range will also utilise the 16nm FinFET technology, which is arguably how it’s going to be able to nail the 2x perf-per-watt target that AMD has set itself.

With the introduction of the new lithography and the promise of high-bandwidth memory being used throughout the GPU stack, we’re pretty confident that Arctic Islands won’t suffer from the same rebrand-a-thon woes that have somewhat blighted the current Southern Islands/R9 300 series release.

All in all, 2016 is looking like a seriously exciting year in terms of graphics cards. The efficiency gains from the 16nm lithography will keep things highly chilled in the mid-range, but also allow for some absolute monster GPUs at the top end. Hell, we could be looking towards 8K gaming by then, guys and gals.

Is NVLink the end for PCIe?


ALONG WITH the announcement of the Pascal architecture, Nvidia CEO Jen-Hsun Huang also introduced the world to NVLink, an interconnect for its GPUs that could potentially offer between five and 12 times the bandwidth of the current PCIe 3.0 connection.

Nvidia’s talking about NVLink offering DRAM-class speed and latency, which will allow for the use of Pascal’s unified memory across the entire PC. It will also improve performance between GPUs, so multi-GPU systems could end up getting a far more linear scaling in terms of gaming speed.

As well as the NVLink connection on the GPU itself, it will also require dedicated silicon in the CPU if it wants to bypass the PCIe interface completely. From the outset though, that looks likely to be restricted to supercomputer-class high performance computing (HPC); Intel is unlikely to start dropping Nvidia silicon into its designs.

But if there’s no path to the CPU, NVLink can just dedicate all its available bandwidth to GPU-to-GPU connections, which is what will potentially enable it to bear fruit in our gaming PCs.

Right now we’re a fair way off saturating the available PCIe bandwidth on our rigs. The current interconnect is fine for our present-day needs, but boosting SLI scaling could be a real bonus. In terms of HPC applications, however, there are times when programs are doing large pro-level processing on the GPU – such as image processing for astronomy or seismic processing – and the PCIe interface becomes a serious bottleneck.
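To put some rough numbers on that bottleneck, here’s a back-of-the-envelope Python sketch. The PCIe 3.0 x16 figure of roughly 15.75GB/s per direction is the standard theoretical maximum; the NVLink figures simply apply the five-times and 12-times multipliers quoted above, and the 64GB dataset is an arbitrary size we’ve picked purely for illustration.

```python
# Rough transfer-time comparison between PCIe 3.0 x16 and the claimed
# NVLink bandwidth range. All figures are theoretical maximums, not benchmarks.

PCIE3_X16_GBPS = 15.75                    # PCIe 3.0 x16, GB/s per direction
NVLINK_LOW_GBPS = 5 * PCIE3_X16_GBPS      # lower end of the claimed range (5x)
NVLINK_HIGH_GBPS = 12 * PCIE3_X16_GBPS    # upper end of the claimed range (12x)

def transfer_time_s(dataset_gb: float, bandwidth_gbps: float) -> float:
    """Seconds spent just moving a dataset across the interconnect."""
    return dataset_gb / bandwidth_gbps

# A hypothetical 64GB astronomy or seismic dataset headed for the GPU:
dataset_gb = 64
for name, bw in [("PCIe 3.0 x16", PCIE3_X16_GBPS),
                 ("NVLink (5x)", NVLINK_LOW_GBPS),
                 ("NVLink (12x)", NVLINK_HIGH_GBPS)]:
    print(f"{name:<13} {transfer_time_s(dataset_gb, bw):5.2f}s")
```

The ratio is the point here: a copy that takes several seconds over PCIe drops to a fraction of that over the claimed NVLink figures, which is exactly the sort of gain HPC users care about and gamers, for now, largely don’t.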

For our machines, that’s not going to be a problem, and AMD shows no sign of wanting to shift interfaces either. We can’t see PCIe going the way of the dodo any time soon, at least not in the next couple of years.