The Next Platform

Naples Opterons Give AMD A Second Chance In Servers

There are not a lot of second chances in the IT racket. AMD wants one and, we think, has earned one.

Such second chances are hard to come by, and we can rattle off a few of them because they are so rare. Intel pivoted from a memory maker to a processor maker in the mid-1980s, and has come to dominate compute in everything but handheld devices. In the mid-1990s, IBM failed to understand the RISC/Unix and X86 server waves swamping the datacenter, nearly went bankrupt, and salvaged itself as a software and services provider to glass houses. A decade later, Apple, which gave IBM a run for the money in the early days of the PC business and which was a true innovator of that platform, was able to resurrect itself with iPods and iPhones and wafer-thin MacBooks backed by online services. Microsoft has jumped from desktop to Internet to datacenter and is making the leap to cloud; none of those transitions has been easy, and it has made many costly mistakes along the way.

Back in the early 2000s, AMD was presented with a great opportunity, which it correctly saw and then exploited, much to the chagrin of long-time archrival Intel. At the time, Intel was steadfast in its strategy of keeping Xeon processors with 32-bit addressing and making its shiny new Itanium architecture the only 64-bit option. Itanium was developed in conjunction with Hewlett Packard and supported by a slew of server makers, like Sun Microsystems and IBM, that promised to port their platforms to that chip. With the “Hammer” line of chips, AMD not only defined 64-bit addressing for processors compatible with the Xeon’s X86 instruction set, but also created an architecture with a sophisticated point-to-point interconnect, called HyperTransport, to link multiple sockets together and scale up the compute, and laid the foundation for multicore processors, too. When the first “SledgeHammer” Opteron processors debuted 18 months late in April 2003 for servers with one, two, four, or eight processor sockets, there was much fanfare. But more importantly, there was huge pent-up demand for a 64-bit chip that looked more like a Xeon with the familiar X86 instruction set and not at all like the quasi-compatible Itanium, which looked as much like an HP PA-RISC chip with VLIW extensions as anything else.

Intel stubbornly left a gaping hole in its product line, and AMD drove the Hammer family of Opteron chips right through it, and within the next few years had carved out a server market share that peaked at around 25 percent for its Opterons. AMD had some bugs in the “Barcelona” Opterons in September 2007, and unfortunately the delay in rolling out the chips and the rattling of confidence in AMD coincided with two other events: Intel’s own “Nehalem” Xeon redesign, which brought it many core capabilities as well as a new QuickPath Interconnect that functioned a lot like AMD’s HyperTransport and freed it from the bottlenecked FrontSide Bus of prior Xeons, and the Great Recession. With the economy getting increasingly unstable, AMD not inspiring confidence with its roadmap, and Intel effectively cloning the Opteron design (a funny twist, that one), Intel was set up to vanquish AMD from the datacenter. And as the recession rolled on and server makers got behind a radically improved Xeon server platform, that is precisely what Intel did. And thus, Intel’s Data Center Group can boast that the X86 architecture accounted for 99.2 percent shipment share and 83.5 percent revenue share in the final quarter of 2016, the last quarter for which such data is available, and that despite the costs of the “Skylake” Xeon ramp it was able to bring 43.6 percent of its $17.24 billion in revenues down to the middle line as operating income.

That $7.52 billion that Data Center Group contributes to Intel’s profitability is a ripe, juicy target. Because the impending “Naples” line of X86 server processors is compatible with the Xeons, we think AMD has a better chance than either the OpenPower or ARM collectives to make a dent in Intel’s server revenues and therefore its profits. (We think AMD will not use the Opteron brand, but we will for the sake of these stories until the real brand is announced.)

There is certainly a strong desire among Intel’s largest customers to see some competition in compute, which is why Google joined the OpenPower Foundation three years ago and why its top techie, Urs Hölzle, told The Next Platform back in April 2015 that Google would do anything to beat Moore’s Law improvements in compute, including shifting to a new architecture such as Power. Google was, of course, a big user of the original Opterons, as were a slew of cloud builders and service providers, including Rackspace Hosting and Microsoft Azure. With the ascendancy of Linux and features in Power and ARM chips that make it easier to port code from X86 architectures to them, it is easier to make the case for these alternatives, and work is being done to advance Power and ARM in the datacenter. Google and Rackspace have worked together to build the “Zaius” Power9 server platform, which is notable, and Supermicro is also now building Power-based machines. There are several server designs based on Cavium ThunderX and Applied Micro X-Gene 1 and X-Gene 2 processors as well. But thus far, Power and ARM are noise in the datacenter data.

The easiest and most obvious kind of competition is a processor that runs X86 code out of the box – or rather, in the box – and does not require any porting at all. (There is always tuning required, of course, if you want to squeeze the most performance out of any chip.) The high profit margin that Intel is enjoying with its Xeon products, the X86 compatibility of the future Naples chip, and the expectation that AMD and its chip manufacturing partner, GlobalFoundries, can get a bug-free processor out the door are why we believe AMD actually deserves a second chance in the datacenter. If the Naples chip comes out the door sometime in the second quarter, as expected and without issues, we expect plenty of hyperscalers and cloud builders to endorse it, and the server OEMs and ODMs will have to follow suit, perhaps with a much quicker ramp than we saw with the server OEMs back in 2003 through 2005 when the Opterons were trying to get a toehold in the datacenter.

AMD has just lifted the veil a little bit more on the Naples processors, and we have covered the technical details that it revealed in a separate story this week, adding to some of the earlier information we learned about the Zen core at Hot Chips last August. But for now, we wanted to set the stage for the comeback that AMD is trying to engineer, and to do that, we chatted with Forrest Norrod, senior vice president and general manager of the Enterprise, Embedded, and Semi-Custom group at AMD.

Our contention is that this is as much AMD’s year to make it back into the datacenter as it is for the Power and ARM camps, and our position is one that Norrod shares.

“We talked about ARM a few years back, and in some ways, the push for both ARM and Power is our fault,” Norrod tells us. “AMD stopped having credible, high performance, alternative processors to Intel. If we show up again with a high performance, high quality proc, there is such a lower barrier to adoption of an alternative X86 than to a new architecture that I think it will be welcomed with open arms. I think this has been borne out as we have really started to engage with OEM and cloud builders and, more recently and selectively, the end user customers. The receptivity to Naples and our re-entry into the server market has been extraordinarily good. I mean, it is servers, so it is not going to ramp instantaneously. But I do think we have a great opportunity to pick up market share rapidly and reintroduce relevant competition.”

With the Intel market share even higher than it was back in 2003 through 2007 and the server market even larger, the chance for AMD to sell a lot of chips with even 5 percent to 10 percent market share is huge. Let’s have some fun with math. The X86 server market, according to IDC, brought in $45.6 billion in revenue in 2016, up a mere 1 percent, against shipments that fell by 1.6 percent to 9.39 million units. Intel’s Data Center Group posted that $17.24 billion in revenues, and let’s say for the sake of argument that 90 percent of it was for server chips, chipsets, and motherboards. That is $15.5 billion in chip and related server revenues for Intel, and maybe it had 47 percent gross margins for a total of $7.3 billion in gross profit. That gave Intel a little more than a third of the revenues for all servers sold in the world (not including its flash storage, which is booked outside of Data Center Group), and we think something crazy like the lion’s share of the hardware gross profits. OEM and ODM server makers buy in high volume from Intel and sell with a markup based on volume to their customers, and the smaller the customer, the higher the markup. Of the 9.39 million X86 server units, the vast majority are two-socket machines. Call it 18 million Xeon processors. That works out to an average of $862 per Xeon chip, and a gross profit of $405 each.

That’s some good money right there.
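The per-chip figures above can be reproduced with a few lines of Python. This is only a sketch of the same back-of-envelope estimate: the 90 percent server share of Data Center Group revenue, the 47 percent gross margin, and the 18 million chip count are the assumptions made in the text, not reported numbers, and the function and variable names are ours.

```python
# Figures quoted in the article for 2016. The 90 percent and 47 percent
# values are for-the-sake-of-argument assumptions, not reported data.
DCG_REVENUE = 17.24e9        # Intel Data Center Group revenue, dollars
SERVER_SHARE_OF_DCG = 0.90   # assumed share from server chips, chipsets, boards
GROSS_MARGIN = 0.47          # assumed gross margin on that revenue
X86_SERVER_REVENUE = 45.6e9  # IDC figure for all X86 server revenue
XEON_UNITS = 18e6            # "call it 18 million" Xeon processors


def xeon_estimates():
    """Return the back-of-envelope Xeon revenue and profit estimates."""
    server_revenue = DCG_REVENUE * SERVER_SHARE_OF_DCG
    gross_profit = server_revenue * GROSS_MARGIN
    return {
        "server_revenue": server_revenue,
        "gross_profit": gross_profit,
        "share_of_server_market": server_revenue / X86_SERVER_REVENUE,
        "revenue_per_chip": server_revenue / XEON_UNITS,
        "gross_profit_per_chip": gross_profit / XEON_UNITS,
    }


if __name__ == "__main__":
    for name, value in xeon_estimates().items():
        print(f"{name}: {value:,.2f}")
```

Changing any one of the assumed constants shows how sensitive the per-chip numbers are to the guess behind them.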

We think it is reasonable to assume that server units do not grow much this year, and that revenues will not, either. So if AMD can get a 5 percent share of server sales in 2017, which is reasonable given that it will only be selling them for maybe six to eight months, that would be 900,000 units. If you assume prices will be stable, on average, that would be $776 million to AMD and $14.74 billion to Intel, which works out to a 5 percent decline in revenue for Intel. We think AMD will have to charge less than 2016’s prices to get business and Intel will have to respond in kind, and maybe the whole cost of compute goes down just 10 percent by itself because there is actual competition in the X86 server market. Moore’s Law will be used to expand performance, but all other things being equal, Intel would have tried to keep the price of compute either flat or rising. (As we showed last April, the cost per unit of compute between the Nehalem and Broadwell Xeon generations has been flattening and sometimes rising, particularly for high core count and high clock speed parts.) If the price of compute drops by 10 percent, and AMD takes 5 percent share, Intel’s Xeon revenues (as we estimate them) will drop to $13.3 billion, but AMD is only going to bring in $698 million. The more probable scenario, perhaps, is that the price of compute drops by 10 percent and AMD gets 10 percent share because of the pent up demand, and if that happens, Intel’s revenues drop by 19 percent to $12.57 billion and AMD brings in $1.4 billion. They bring to the middle and bottom lines what they can, depending on what they have to do to sell the chips.
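The three scenarios in the paragraph above follow from one small calculation, sketched here in Python. It assumes, as the text does, that unit volumes stay flat and that AMD and Intel charge the same average price per chip; the `split_revenue` helper is a hypothetical name introduced only for illustration.

```python
BASE_XEON_REVENUE = 15.516e9  # estimate from the text: 90 percent of $17.24 billion


def split_revenue(amd_share, price_drop=0.0, base=BASE_XEON_REVENUE):
    """Split the X86 server chip revenue pool between Intel and AMD.

    Assumes flat unit volumes and identical average pricing for both
    vendors; both are simplifications taken from the article.
    """
    pool = base * (1.0 - price_drop)   # total dollars after any price cut
    amd = pool * amd_share             # AMD's slice of the pool
    intel = pool - amd                 # Intel keeps the rest
    return intel, amd


# (AMD share, price drop) pairs discussed in the text.
SCENARIOS = [(0.05, 0.0), (0.05, 0.10), (0.10, 0.10)]

if __name__ == "__main__":
    for share, drop in SCENARIOS:
        intel, amd = split_revenue(share, drop)
        decline = 1.0 - intel / BASE_XEON_REVENUE
        print(f"AMD share {share:.0%}, price drop {drop:.0%}: "
              f"Intel ${intel / 1e9:.2f}B, AMD ${amd / 1e9:.3f}B, "
              f"Intel decline {decline:.1%}")
```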

Assume there is a 10 percent price war decrease in the average unit of compute, that AMD can get 10 percent shipment share, and that it can charge about the same for compute as Intel. For Intel to hold revenues for Xeon chips constant in 2017 at what we estimate to be around $15.52 billion, X86 server shipments would have to increase by about 18 percent from 2016’s 9.39 million to 11.1 million machines. And X86 server chip revenues would have to expand by 11.1 percent to $17.24 billion, and that would imply, at X86 server prices that maybe come down by 5 percent on average, 13 percent revenue growth for all X86 servers to $51.62 billion.

This is simply not going to happen. And if it did, AMD would take in $1.7 billion in revenues for server chips with its 10 percent share of shipments.
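The break-even case can be sketched the same way. The Python below works backward from the revenue Intel wants to keep to the size of the market that would be required, measured against the 9.39 million servers shipped in 2016. It assumes every server has two sockets and that both vendors charge the same discounted price; `break_even` and its field names are illustrative, not anything AMD or Intel has published.

```python
BASE_XEON_REVENUE = 15.516e9  # article's 2016 Xeon revenue estimate, dollars
PRICE_2016 = 862.0            # article's average revenue per Xeon chip
SOCKETS_PER_SERVER = 2        # simplification: an all two-socket market
UNITS_2016 = 9.39e6           # IDC X86 server shipments in 2016


def break_even(amd_share=0.10, price_drop=0.10):
    """Market size needed for Intel's Xeon revenue to stay flat."""
    # Intel keeps (1 - amd_share) of the chip revenue pool, so the pool
    # must be large enough that this slice still equals the base revenue.
    chip_pool = BASE_XEON_REVENUE / (1.0 - amd_share)
    amd_revenue = chip_pool * amd_share
    # Every chip now sells for the discounted price.
    chips = chip_pool / (PRICE_2016 * (1.0 - price_drop))
    servers = chips / SOCKETS_PER_SERVER
    return {
        "chip_pool": chip_pool,
        "amd_revenue": amd_revenue,
        "servers": servers,
        "unit_growth_vs_2016": servers / UNITS_2016 - 1.0,
    }


if __name__ == "__main__":
    result = break_even()
    print(f"chip revenue pool: ${result['chip_pool'] / 1e9:.2f}B")
    print(f"AMD revenue:       ${result['amd_revenue'] / 1e9:.2f}B")
    print(f"servers needed:    {result['servers'] / 1e6:.1f}M")
    print(f"unit growth:       {result['unit_growth_vs_2016']:.1%}")
```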

Data Center Group may make up the revenues in other areas, like Omni-Path interconnect and Knights processors and coprocessors, and it may even merge its flash, 3D XPoint, and other enterprise businesses into the Data Center Group to make the revenue decline look less dramatic. But unless the GlobalFoundries factories in Malta, New York and Dresden, Germany burn to the ground, Intel is going to get competition and it is going to hurt.

If the 32-core Naples chip comes out, and works, it will be competitive with the 28-core Skylake Xeon due around the middle of the year. Everyone who might buy lots of either chip has seen both, has had them running in their labs for a long time, and has already made their purchasing decisions. All we are arguing about now is the price that the rest of us might be charged if we can get either processor.

The die is already cast, even if it is too hot to touch.

Performance will matter just as much as price, of course, and Norrod says AMD can bring it, although he does not say anything about floating point performance, where Intel will be delivering AVX-512 vector math on standard Xeons for the first time with the Skylakes. But we know from the presentation that AMD made at Hot Chips last summer that the Zen core has a floating point unit that has a two-level scheduling queue, with two pipes each with one multiplier and one adder. This floating point unit supports two 128-bit vectors, which means it can do four double precision operations per clock or eight single precision operations per clock; we do not think FP16 half precision floating point is supported on the Zen core. By way of comparison, Intel is able to do twice as many floating point operations per clock with the current “Broadwell” cores used in the Xeon line, and four times as many with the “Knights Landing” Xeon Phi cores and the future Skylake cores that have two AVX-512 units. The Zen floating point block has two units for accelerating AES encryption and supports SSE, AVX1 and AVX2 floating point operations as well as SHA hashing and is compliant with legacy MMX and X87 instructions, too. Intel will have the floating point advantage, but then again, AMD has “Vega” GPUs for offloading math from its CPUs.
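The per-clock figures in that comparison come from simple lane counting, which the following Python sketch makes explicit. It follows the accounting used in the text, counting one operation per vector lane per clock and not treating a fused multiply-add as two operations; the relative multipliers for Broadwell and Skylake are the ones stated above, not independent measurements, and the names are ours.

```python
def ops_per_clock(vector_bits, vectors_per_clock, element_bits):
    """Count one operation per vector lane per clock (no FMA double-counting)."""
    lanes = vector_bits // element_bits
    return lanes * vectors_per_clock


# Zen core as described at Hot Chips: two 128-bit vectors per clock.
ZEN_DP = ops_per_clock(128, 2, 64)  # double precision
ZEN_SP = ops_per_clock(128, 2, 32)  # single precision

# Multipliers relative to Zen, as stated in the article.
RELATIVE = {
    "Zen": 1,
    "Broadwell": 2,
    "Knights Landing / Skylake (two AVX-512 units)": 4,
}

if __name__ == "__main__":
    for chip, factor in RELATIVE.items():
        print(f"{chip}: {ZEN_DP * factor} DP, {ZEN_SP * factor} SP ops per clock")
```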

“If you take a look at the performance that we will be able to span in the Intel two-socket space, and if you look at general integer performance levels, particularly on throughput, we think that on a SPECint_rate we will be able to match Intel on the highest bin Skylake part,” Norrod says. “Which I do not think they are expecting.”

As for our estimates of market share possibilities, Norrod says AMD is not sharing its thoughts on this, but adds that AMD needs to be “in that realm” of 5 percent to 10 percent, as we pointed out, to be relevant, and adds that AMD “is in this for the long run.” That means taking each iteration of the Zen core as it comes out and adapting it to servers on the cadence that the market expects, which he says is around every 15 to 18 months. “We understand that is what the market demands, and we have invested in staff and resource to do that.”

We will know more when Naples ships towards the end of the second quarter, most likely with a fancy new brand, not Opteron.