Tuesday 17 March 2015

Broadwell: It’s Business Time

Intel Broadwell

An awfully long time coming, Intel’s new 14nm CPU family is finally here. By Jeremy Laird

All systems are go with 14nm technology. If there’s a single message that Intel wants to hammer into your head with its new family of Broadwell CPUs, it’s this: 14nm is on line. Moore’s Law lives on.

The reality isn’t quite so simple. The reality is that Intel has fallen behind its self-imposed schedule of bringing out a new CPU architecture and then a new production process or die shrink in successive years.

As we’ll see, as chip features get ever smaller, the production process is becoming one hell of a challenge. As Intel’s CPU designs become ever more refined, the new architecture bit ain’t all that easy, either. Not if you’re expecting substantial gains in performance. That’s the context for the new Broadwell family. It’s running late. And it’s a lot to ask for it to be dramatically better than Intel’s already-excellent CPUs. And yet there are reasons to be very glad it’s finally arrived.


For starters, thanks to that tricky 14nm tech, Broadwell is by far the most energy efficient architecture Intel has ever created. No, that doesn’t automatically translate into epic increases in performance. But it does mean you can have proper PC performance in form factors where it wasn’t previously possible. Broadwell also represents yet another big leap in integrated graphics performance. And that begs the age-old question of whether great gaming performance on integrated graphics is finally possible.

On the other hand, Broadwell also reboots our ongoing concerns with Intel’s strategy regarding the pure-CPU portion of its chips. For some time now, it’s felt like Intel has been sandbagging on the CPU side, something that’s been made possible thanks to the failure of its main rival, AMD, to offer serious competition.

Broadwell’s lateness is also creating confusion in the context of the anticipated arrival of its successor, known as Skylake. As we go to press, Intel hasn’t actually announced full details of the desktop Broadwell range. But we’re expecting Skylake, including desktop derivatives, to arrive in the second half of this year. It’s all a bit baffling. So let’s start to make sense of Intel’s 14nm mess.

Something has gone wrong at Intel. It was back in June 2013 that we first got our hands on desktop variants of Intel’s Haswell family of CPUs. Now here we are roughly two years later and Intel has yet to announce desktop Broadwell processors.

Meanwhile the successor to Broadwell, an all-new architecture known as Skylake, is due later this year. For a company famed for its well-oiled and accurate execution of new product launches, of late at least, it’s all a bit untidy.

It also makes a mockery of protestations from Intel that it’s “on track for a healthy 14nm ramp”. Undoubtedly and undeniably, 14nm has been problematic for Intel. But then the entire chip industry is increasingly struggling to keep Moore’s Law on line. Not that we should be surprised by that – current chip technology is fast approaching a tiny thing that’s going to be a big problem. The size of the atom. (See “No more Moore?”)

Anyway, Broadwell really ought to have been available in full back in 2014. As it is, Intel only managed a trickle of dual-core mobile models late last year. At the same time, Intel says it’s sticking to its original plan of rolling out Skylake later this year. All of which begs two questions. What does Broadwell have to offer and is there any point to it for PC enthusiasts? Cue the Intel marketing machine.

If you want the classic elevator pitch for Broadwell, it goes something like this: Broadwell is Intel’s fifth-generation Core processor. It’s the first Intel chip to be built on 14nm process technology and offers the smallest chip features ever. It’s much more efficient, with better battery life, and you get much faster graphics. Conspicuously absent from the pitch are big claims regarding the CPU subsystem of the new chips. Hold that thought though, we’ll come back to it shortly.

The core of the problem


So went the Intel PR pitch at multiple installments of its IDF techfest. Today two cores, tomorrow four, in years to come more than you could count.

Recently? Not so much. That’s reflected in the fact that Intel’s mainstream processors still only have four cores, nearly nine years after its first quad-core CPUs were announced. So, what the hell happened?

There are two ways to look at this. Firstly, Intel is making CPUs with loads of cores, it’s just not putting them into PCs. Want a CPU with 18 cores? Intel makes one for servers, albeit costing thousands. At best you can have an eight-core CPU in the high-end LGA2011 platform, which is really just rebadged server technology.

You could also argue that those cores have migrated to graphics. All of Intel’s mainstream CPUs now have integrated graphics and if you take a peek at, say, the die shot of a dual-core chip with the big graphics core, it’s at least two-thirds graphics. A quad-core model would be roughly 50/50. And, of course, that big graphics ‘core’ has 48 execution units, which you could style as cores – add them to the four CPU cores of a quad-core model and suddenly you’ve got a 52-core chip.

Indeed, that graphics part of the chip can be used for some workloads that are traditionally CPU fare, such as video encoding. The problem with that view is that if you care about gaming, it’s pretty much irrelevant. You’ll have a proper discrete GPU and the gunk Intel integrates onto its CPU simply doesn’t factor. That includes Broadwell’s beefed-up graphics, as good as it is by integrated standards.

No, however you slice it, Intel’s mainstream CPUs are still four cores. Nearly a decade on, that’s disappointing.

Productivity Plus?


First, let’s knock about a few numbers that get to the heart of what Broadwell is about. Unfortunately, Intel has yet to release proper numbers or even model names for any desktop Broadwells, or even quad-core mobile Broadwells. So the best data available relates to dual-core mobile Broadwell, specifically the 5600U set against the old Haswell-based 4600U. Transistor count is up at 1.9 billion, and yet the new 14nm process means it’s also 37 per cent smaller in terms of die size.

Its graphics is claimed to be 22 per cent faster than the 4600U, hardware accelerated video encoding is a remarkable 50 per cent faster and battery life goes up by around 1.5 hours. Impressive. Less wonderful are the numbers relating to what you might call pure CPU performance. Intel is only claiming a 4 per cent improvement in productivity performance.

There are several reasons for that. Firstly, Intel’s x86 CPU cores are now very finely honed indeed. The low-hanging fruit in terms of performance optimisations were ripped from the tree long ago. If there’s anything left dangling, it’s just some very small stragglers on the hardest-to-reach branches.

What’s more, Intel has reduced the maximum Turbo frequency of the 5600U by 100MHz to 3.2GHz. So that modest 4 per cent gain is actually achieved at a lower peak operating speed. However, while the top Turbo speed is down a little, the base frequency is up by a hefty 500MHz to 2.6GHz. That should help in terms of sustained performance in thin and light laptops where heat can build up and prevent you from accessing the Turbo function. In that scenario, we reckon the 5600U might well be 20 per cent quicker for CPU tasks than the old 4600U.
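For a rough sanity check on that estimate, the clock speeds quoted above can be turned into percentages. This is a back-of-an-envelope Python sketch rather than a benchmark: the 4600U clocks are inferred from the deltas in the text, not taken from a spec sheet, and the `percent_change` helper is our own invention.

```python
def percent_change(old_mhz: float, new_mhz: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new_mhz - old_mhz) / old_mhz * 100.0

# 5600U clocks as quoted in the text, in MHz.
BASE_5600U = 2600
TURBO_5600U = 3200

# 4600U clocks inferred from the quoted deltas (an assumption, not a quoted spec).
BASE_4600U = BASE_5600U - 500    # base clock went up by 500MHz
TURBO_4600U = TURBO_5600U + 100  # Turbo clock went down by 100MHz

base_gain = percent_change(BASE_4600U, BASE_5600U)
turbo_change = percent_change(TURBO_4600U, TURBO_5600U)

print(f"Base clock change:  {base_gain:+.1f}%")
print(f"Turbo clock change: {turbo_change:+.1f}%")
```

On those figures the base clock rises by roughly 24 per cent while the Turbo clock dips by about 3 per cent, which is why a gain in the region of 20 per cent looks plausible when a thin laptop is stuck at its base clock. Real-world results will depend on cooling and workload.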

Of course, we also need to remember that Broadwell is a Tick in Intel’s Tick-Tock chip development parlance. That means it’s all about the die shrink from 22nm to 14nm; it’s not meant to bring a dramatically new architecture. That said, Broadwell is being styled as a Tick-plus, and thus more than a simple die shrink. But the ‘plus’ bit is more about graphics and other features. In most regards, the CPU cores are carried over.

Anyway, that extra 1.5 hours of battery life is a big deal. It reflects an overall efficiency boost that will allow proper PC performance to be squeezed into yet smaller form factors. At this point in the Broadwell product cycle, it’s early days. Just keep your scanners peeled for ridiculously thin and light notebooks as well as just possibly some interesting tablets. We’d certainly love to see a Microsoft Surface with the most powerful Broadwell graphics options.

No more Moore?


Moore’s Law is the observation, first noted by Intel co-founder Gordon Moore way back in the 1960s, that transistor density in computer chips tends to double every couple of years.

Put another way, it means the complexity and power of a computer chip of a given size doubles every two years. That, of course, gives rise to an exponential explosion in computing power over an extended time frame. Two becomes four and then eight, 16, 32, 64, 128 and so on. That’s why today’s desktop computers sport teraflops of raw compute power and why smartphones are faster than yesteryear’s PCs.
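To put numbers on that exponential curve, here is a minimal Python sketch. It assumes a clean doubling every two years, which is the idealised version of Moore’s Law rather than a record of what the industry has actually delivered, and the function name is ours.

```python
def moores_law_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Growth factor after `years`, assuming one doubling per period."""
    return 2.0 ** (years / doubling_period_years)

for years in (2, 4, 10, 20):
    print(f"After {years:>2} years: {moores_law_factor(years):,.0f}x")
```

Twenty years of idealised doubling works out to a factor of 1,024, which is how modest-sounding two-year steps add up to enormous long-term gains.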

But for how much longer? A bit like constant prognostications that fossil fuels will run out any day now, you could say reports of the demise of Moore’s Law have been greatly exaggerated over the years. And yet with both oil and Moore’s Law, it can’t go on forever. One day the oil will run out. And one day Moore’s Law will hit the wall.

With oil, it’s hard to say exactly when. Nobody really knows how much oil is left in the ground. For existing chip technology, it’s a little easier. The width of the atoms used in current chip production is roughly 0.2nm. The smallest atoms are hydrogen and they rock in at about 0.1nm. Today, Intel is selling processors with 14nm features. Immediately, then, there are hard, physical limits to how small you can make the transistors or gates within chips.
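The arithmetic behind that limit is simple enough to sketch in Python. It uses the rough figures quoted above (14nm features, atoms about 0.2nm wide) and treats a feature as a simple row of atoms, which real transistor geometry is not, so take it as an illustration only. The helper name is hypothetical.

```python
def atoms_across(feature_nm: float, atom_nm: float) -> float:
    """Roughly how many atoms fit side by side across one feature."""
    return feature_nm / atom_nm

ATOM_WIDTH_NM = 0.2  # rough atom width quoted in the text

for feature_nm in (22, 14, 10):
    count = atoms_across(feature_nm, ATOM_WIDTH_NM)
    print(f"{feature_nm}nm feature: about {count:.0f} atoms wide")
```

On those numbers a 14nm feature is only about 70 atoms across, which leaves room for a handful more shrinks but not an open-ended series of them.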

That doesn’t necessarily limit the progression of computing per se. But at the very least we’ll need a new paradigm, be that quantum computing or something else. Indeed, there are clear signs that the computer chip industry is already struggling to keep Moore’s Law on track as feature sizes bear down on that final limitation – the size of an atom.

IBM recently gave up making chips, for instance, and the cost of the facilities or foundries for chip making is rising in an exponential fashion that almost matches the increasing power of the computers they enable. There’s at least a decade of further progress available. But after that, it really can’t go on too much longer.

Graphics Grunt


While we’re on the subject, what of the new graphics technology in Broadwell? Power efficiency aside, it’s got to be the most interesting aspect of the new architecture. In fact, it’s the latest in a long line of improved integrated graphics from Intel in recent years. Intel claims Broadwell represents a 100-fold increase in integrated graphics performance since 2006.

That sounds like an astonishing leap. However, the starting point in 2006 was so very low that the 100x metric doesn’t actually mean much. No, what really matters isn’t how much faster Broadwell graphics is compared to something ancient and absolutely awful. What matters is whether you can really game on it.

On paper, the changes for Broadwell do not seem hugely dramatic. The basic graphics core gets a bump from 20 execution units to 24 and the fastest configuration goes from 40 to 48 units. Then you note that Broadwell graphics are a continuation of Intel’s so-called Gen7 graphics architecture from Haswell and previous families of chips. And you might conclude we’re looking at worthwhile improvements, no doubt, but hardly the stuff of 100-fold leaps in gaming performance.

But that wouldn’t be entirely fair. Intel has given Gen7 a pretty hefty overhaul for Broadwell. It goes well beyond simply throwing a few more execution units at the problem of faster frame rates. For starters, Broadwell has been updated to support the latest graphics APIs. That specifically means Direct3D 11.2 now and support for Direct3D 12 (often simply referred to via the multimedia superset API that is DirectX 12) when it arrives with the final retail build of Windows 10.

Intel has also confirmed that Broadwell graphics is OpenCL 2.0 compliant, including shared virtual memory. So it really is a thoroughly modern graphics design and, for what it’s worth, has all the general-purpose computing bells and whistles, too.

But it’s the structure of the rendering pipeline where things get more interesting for us gaming addicts. Intel’s graphics cores are subdivided into what you might call ‘slices’. For Haswell, each slice had 10 execution units. For Broadwell, this actually drops to eight.

If that sounds like a step backwards, the point is that each slice has shared resources in terms of cache memory and sampler units. With fewer execution units per slice, you get more cache and more sampling for each execution unit. Individual sampler performance has also been increased, the upshot of which is a 50 per cent bump in overall sampler performance. Intel has also tweaked the ROPs, or render outputs, and rasterisers to increase fill rates.
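A quick Python sketch shows why fewer execution units per slice can be a good thing. It assumes, as a simplification of our own, that a slice’s shared sampler and cache resources are fixed and split evenly between its execution units. The function name is hypothetical.

```python
def per_unit_share_gain(old_units: int, new_units: int) -> float:
    """Percentage gain in each unit's share of a fixed, evenly split resource."""
    return (old_units / new_units - 1.0) * 100.0

HASWELL_UNITS_PER_SLICE = 10    # from the text
BROADWELL_UNITS_PER_SLICE = 8   # from the text

gain = per_unit_share_gain(HASWELL_UNITS_PER_SLICE, BROADWELL_UNITS_PER_SLICE)
print(f"Each execution unit's share of slice resources: {gain:+.0f}%")
```

Under that assumption each execution unit gets a 25 per cent larger share of its slice’s resources, before any improvement to the samplers themselves is counted.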

Luck of the Iris


The upshot of all of this is that the new mid-sized graphics core with 24 execution units, known as GT2, has nearly closed the gap with the old big graphics core, known as GT3, in Haswell. You can read the full details in our review of the new Gigabyte Brix S, with its Broadwell-U Core i5 5500-U chip. But in both Bioshock and Grid 2, Broadwell GT2 is remarkably close to Haswell GT3.

What it’s not, however, is what we’d really call fully gameable. In both games, average frame rates at 1080p and full detail with 4x antialiasing enabled are well below 30 frames per second. Even if you double the frame rates for the 48-unit Broadwell GT3, which we haven’t tested yet, and add a bit for the eDRAM memory that comes with the very fastest version, known as Iris Pro, we’re still talking marginal frame rates at 1080p in what are not exactly the most demanding games out there.

Actually, while we’re talking Iris Pro, it’s worth noting that both Intel’s baffling branding and its awkward positioning of its GPU technology look set to remain with Broadwell. On the branding side, Intel is keeping the HD Graphics and Iris split. What’s confusing is that Broadwell CPUs with the fastest HD Graphics 6000 solutions will get the 48-unit core, just like Iris 6100 and Iris Pro 6200 graphics. To us it would be easier for everyone to understand what they’re buying if the 48-unit graphics was exclusively branded Iris.

As for the positioning, the problem has been that Intel has tended to restrict its fastest graphics to its higher-end CPU configurations. And that doesn’t actually make any sense, especially on the desktop, but even in laptops. That’s because any system with an expensive, relatively high-end quad-core CPU is going to have a discrete or plug-in video card of some kind, making the specification of the integrated graphics essentially irrelevant.

Instead, it would make much more sense to have a cheap dual-core CPU paired with the quickest graphics configuration and end up with something capable of a bit of light gaming for not all that much money. At least, it would make much more sense for us poor gamers and computing junkies. The problem is that big graphics means big computer chip.

Intel does do a dual-core chip with the big 48-unit graphics core for laptops, but you only have to glance at the die shot to understand the problem. The graphics portion of the chip is huge. It’s at least two-thirds of the chip, maybe more. The consequence is a mere dual-core PC processor with nearly two billion transistors.

Even taking into account the savings that come with a die shrink to 14nm, that pushes up the production cost and makes it tricky to sell a low-end model with the highest performing graphics. Eventually, graphics and CPU functionality will probably cease to be in any way distinct and the problem will disappear. For now, it means we can’t quite have the precise configurations we’d prefer.

Realsense and sensibility


If efficiency and graphics are the big deals, is there anything else to get excited about? Intel is talking up “enhanced capabilities” for Broadwell and they fall into a few categories. The first you might call more “natural” interactions.

According to the Intel spiel, with the PC we’ve gone from plain old text input to graphical user interfaces and then touchscreens and voice control. What’s next? Adding senses to the PC, namely the ability to recognise gestures and facial expressions along with more natural language recognition. Now this isn’t really a feature of a Broadwell CPU, but it does require processing power. And if you’re going to have that processing power available in today’s increasingly mobile form factors, you need the improved efficiency that Broadwell brings.

Intel is actually getting into this game itself with its own technology, known as Intel RealSense. The feature set is based around 3D-capable sensors. There’s a conventional colour camera in the centre, flanked on one side by an IR laser projector and on the other by an IR camera. Together, these two features allow objects to be scanned for depth and dimensions. Finally, there’s a stereo microphone array.

Put all of these sensors together and you have a system that supports gesture control using hands and fingers, facial expression detection, head tracking and object scanning, the latter enabling model creation for 3D printing. It also includes a Voice Assistant that has both basic offline and more sophisticated online components similar to Apple’s Siri and Microsoft’s Cortana voice control systems.

If that sounds like the kind of high-concept technology Intel often touts but fails to actually bring to market, well, RealSense is actually shipping today in Dell’s Venue 8 tablet along with notebook PCs from Asus and Lenovo. That said, we’ve yet to see RealSense in action, so we certainly can’t judge whether it’ll catch on. What we can say for sure is that Intel is up against some seriously stiff competition from Apple, Google and Microsoft in this area.

Skylake looming


So, that’s Broadwell itself covered off. The obvious question is what’s next? Actually, that’s a more critical question than usual since Intel’s 14nm derailment means by the time desktop Broadwells arrive along with quad-core mobile models, the follow-up family of chips, known as Skylake, will be imminent too.

Full details of Skylake haven’t been released, but if we combine what Intel has revealed with the best of the leaks and rumours, we can get a pretty good idea of what to expect. Firstly, Skylake is an Intel ‘Tock’ and that means a new architecture on the prevailing production process, which is now transitioning to 14nm.

As ever, graphics is a major part of the Skylake package, both literally and in terms of importance. Rumour has it that Skylake will introduce a new ‘GT4’ spec graphics configuration with no fewer than 72 execution units. That means nine of those eight-strong execution-unit slices instead of six, or a straight 50 per cent boost in computational complexity.
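That slice arithmetic is easy to verify with a few lines of Python. The eight-units-per-slice figure comes from the text, the nine-slice GT4 configuration is only a rumour, and the function name is our own.

```python
UNITS_PER_SLICE = 8  # Broadwell-style slice size, from the text

def execution_units(slices: int, units_per_slice: int = UNITS_PER_SLICE) -> int:
    """Total execution units for a given number of slices."""
    return slices * units_per_slice

gt3_units = execution_units(6)  # today's 48-unit configuration
gt4_units = execution_units(9)  # rumoured Skylake configuration (unconfirmed)
increase = (gt4_units - gt3_units) / gt3_units * 100

print(f"GT3: {gt3_units} units, rumoured GT4: {gt4_units} units ({increase:.0f}% more)")
```

Six slices give 48 units and nine give 72, so the rumoured part would carry exactly half as many execution units again. How much of that turns into frame rate is another matter.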

Normally, we wouldn’t be hugely excited at the prospect of some far-off integrated graphics core. But Skylake is just around the corner and if it really does have 72 units, it’s going to be massively quicker than today’s best Haswell graphics.

That said, arguably more interesting are the platform developments that come with Skylake. First up, there’s a new socket, LGA1151, which breaks backwards compatibility. Where desktop Broadwells will drop into existing Intel 9 Series motherboards, Skylake chips will require a new motherboard.

If that’s the bad news, the good news is that the new socket means a properly new desktop platform along with new 100 Series chipsets topped out with the upcoming Z170. Reportedly, both Skylake and the new chipsets will support both existing DDR3 memory and the newfangled DDR4 stuff that’s so far been restricted to Intel’s high-end LGA2011-v3 socket and CPUs. However, any given motherboard will implement only one memory type. We won’t see boards that can take both.

DDR4 probably isn’t a big deal for mainstream desktop gaming rigs. But the boosted bandwidth should help keep those mooted 72 execution units in the new graphics engine fed with frame data. The other headline grabber is a bump for PCI Express lane availability, up to 20 lanes from 16. That matters because storage subsystems – specifically SSDs – are moving to the PCI Express-based M.2 interface. For dual-card multi-GPU SLI or Crossfire graphics, you want at least eight lanes per graphics card. With an M.2 drive pinching a few of your 16 lanes, there’s a problem. With 20 lanes, you’ve got four spare for a couple of fast SSDs. Perfect.
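The lane budget is easy to check with a few lines of Python. This sketch counts only the lanes discussed above and uses the eight-lanes-per-card figure from the text. The function and its defaults are ours, and real motherboards may divide lanes differently.

```python
def spare_lanes(total_lanes: int, graphics_cards: int = 2, lanes_per_card: int = 8) -> int:
    """Lanes left over once every graphics card has its allocation."""
    return total_lanes - graphics_cards * lanes_per_card

for total in (16, 20):
    print(f"{total} lanes: {spare_lanes(total)} spare after two x8 graphics cards")
```

With 16 lanes a dual-card setup leaves nothing for storage, while 20 lanes leaves four for M.2 drives.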

However, where things get really complicated is the relationship between Broadwell and Skylake on the desktop. Broadwell’s lateness combined with Intel’s plan to stick with its schedule for Skylake has created some distinct weirdness.

According to leaked Intel roadmaps, this is what it’s going to look like. Broadwell desktop CPUs will be the unlocked K-series models only. There won’t be any mainstream locked Broadwells on the desktop. Instead, those slots, which represent the bulk of Intel’s desktop offerings, will come from the new Skylake family.

It’s a really unusual situation and leaves a lot of questions unanswered. For instance, will there be unlocked Core i5 Broadwells with no Hyperthreading, as there have been for several generations? And what will Intel call all these chips? Will it launch what it calls fifth and sixth-generation Core processors? Or will it simply brand them all as fifth generation?

Then there’s the not-so-minor matter of multiple platforms and multiple sockets that lack cross compatibility. It’s bad enough that Intel now splits its desktop offerings into separate mainstream and high-end offerings. It now seems that, for a while at least, you’ll have to choose between two mainstream platforms: the existing LGA1150, which Broadwell CPUs require, and Skylake’s new LGA1151 socket.

Double trouble


What’s more, the way Intel has done this is probably the opposite of the ideal. If you’ve forked out for an expensive unlocked chip, you’re likely going to want the latest platform technology with those extra PCI Express lanes and DDR4 support. But those will only be available with locked Skylake chips.

Whatever, it will be very odd to have what is an older architecture in Broadwell sitting above the newer Skylake models in Intel’s hierarchy. At the very least, it possibly means that Skylake’s CPU cores won’t be anything to get excited about. After all, unlocked Broadwells selling as range toppers will need to be faster than their cheaper and locked Skylake siblings or the whole thing really will come tumbling down. All will become clearer in the coming months, but it’s odds-on that Intel will have to pull some marketing tricks to make sense of this weird mishmash of new products.

As for what happens after Skylake, we really are into crystal ball territory. There are a few known facts. Skylake will be followed by a family of 10nm die shrink chips called Cannonlake. Due in 2016, they’ll come with another new platform, the 200 Series chipset, and pair with DDR4 only.

All of which means we’ve mixed feelings about Broadwell. We’re disappointed that it likely won’t push on Intel’s game for pure CPU core technology. But the power efficiency and graphics look like a big step forward and it’s nice that it should drop straight into existing 9 Series motherboards, albeit with a BIOS update required. How it will all work with Skylake arriving at nearly the same time, however, is far from clear and could determine whether Broadwell makes a mark or turns out to be too little too late.