Intel still has one more processor to get out the door to complete the “Haswell” generation, with the impending Xeon E5-4600 v3 for low-end, four-socket machines. But with the higher-end Xeon E7-4800 and E7-8800 v3 chips launched and representing the bulk of sales for four-socket and larger systems, most IT shops will focus their attention on the Xeon E7s for jobs that need more scalability than the Xeon E5-2600 v3 processors, used in workhorse two-socket systems, can provide.
But the thing to remember is that more cores and more memory alone are not going to drive the upgrade cycle for Xeon E7 systems. There are a lot of different forces in effect in the datacenter right now, and these will help push Xeon E7 system sales – perhaps even more than Intel and its server partners expect.
At this point in history, after ten years of selling credible four-socket Xeon E7-class machines into the datacenter, it turns out that Intel’s server partners have a very large installed base of systems that are attractive upgrade targets for shiny new Xeon E7 v3 iron. But the performance increases between machines sold five or ten years ago and those that can be put into the field today are so large that, thanks to virtualization and consolidation, a large number of footprints might be removed from the datacenter, shrinking the Xeon E7 installed base considerably as machines are updated.
This is precisely what happened to the RISC and Itanium server bases over the past decade and a half since virtualization was introduced on Unix and proprietary systems back in the late 1990s. The database and transaction processing workloads on these mission critical, back-end systems do not grow like some of the web infrastructure workloads do and tend to follow a more modest curve that is reflective of the economy at large and normal growth. It is neither easy nor safe to generalize, but enterprises see something more akin to 25 percent or 30 percent growth per year in transaction processing instead of the more explosive growth seen for certain kinds of distributed computing.
This is wonderful for customers because they can put their big iron machines on a five or six year upgrade cycle (instead of the two, three, or four year cycle that is typical for distributed workloads running on Xeon E5 machines) and know at the end of that time they will not only get much more capable machines, but need fewer of them and less expensive ones (in terms of the price per unit of computing) when they do get around to upgrading some years hence.
But this fact has been disastrous for the Unix and proprietary systems businesses of IBM, Hewlett-Packard, Oracle, Fujitsu, Bull, and others. There is nothing they can do but ride the revenue curve down for those customers who stick with their platforms and lament the losses of customers who decide to go one step further and port their applications to Windows or Linux platforms running atop Xeon E7 machines.
The data in the chart above comes from IDC, and just under 210,000 machines were shipped in 2014 that had four sockets or more. About 9,000 machines had RISC or proprietary mainframe processors in them, and the remaining 94 percent were X86 architecture boxes, with the overwhelming majority of them being Xeon machines, not Opterons. The tiny fraction that was left is presumably the remaining few Itanium-based systems that get sold by HP, Bull, NEC, Inspur, and a few others. This high-end server market is not as big as many think – and it includes large shared memory systems like those made by SGI – but a couple of forces are driving its growth even though Moore’s Law allows a massive consolidation of machines.
Ed Goldman, who is CTO for the enterprise segment within Intel’s Data Center Group, says that 2009 was the only year in the past decade that the Xeon E7-class of machines had a revenue and shipment decline, and this was understandable given that this was the belly of the Great Recession and high-end machines are exactly the kind of boxes that are not replaced quickly during tough economic times.
In this case, whatever big iron machines were in the middle of being upgraded when the recession started in late 2007 and early 2008 were completed because of the long procurement cycle, and then sales stalled across all types of big iron in 2009 and 2010. Clusters based on two-socket machines were pushed out first and came back first, as is normal, because they have a faster procurement cycle and are often being used for greenfield applications, in this case Hadoop and other analytics workloads or virtual desktop infrastructure, which drive new revenues or cut costs.
With the broader adoption of in-memory processing (not just SAP HANA, but other technologies such as Pivotal GemFire and Apache Spark) and the Xeon E7 having twice the memory footprint of a Xeon two-socket E5-2600 or four-socket Xeon E5-4600 system, it stands to reason that as companies put more in-memory databases in the field, they will want to get the right balance of compute to memory for their workloads and this will drive Xeon E7 system sales. But don’t get too enthusiastic about a Xeon E7 explosion.
“Our expectation is that technologies like in-memory will help it to grow,” Goldman tells The Next Platform concerning the revenue stream from such big systems. “It has been pretty flat over the past few years, with slight ups and downs. But with in-memory, it opens up a whole new set of possibilities in the datacenter and we see a lot of people testing in-memory to find the best way to take advantage of it. Not every application is geared for scale out architecture. It takes a long time to switch the architecture, and certain problems need cache coherency and they need all of the features and functions of a scale up server. We expect to see single-digit growth in this environment – it is not something that is going to grow at 50 percent.”
While the machine count for scale-up machines is low, these systems still account for around 57 percent of revenues across all architectures, according to data cited by Intel in its briefings about the Xeon E7 v3 processors. (This seems a bit high to us.) The lowest hanging fruit for Xeon E7 v3 system sales is to get customers who are already using four-socket and larger Xeon machines and who haven’t upgraded. Customers who just bought an “Ivy Bridge-EX” Xeon E7 v2 system, which came to market fourteen months ago, are not very likely to upgrade, but customers using Xeon 7400 processors from 2010 or Xeon 7500s from 2011 are better prospects. The performance improvement, which can be expressed as a scaling factor, is quite impressive, as you can see:
It is hard to remember a time when we did not have multicore processors, but back with the Xeon MP in 2005, the chip only had a single core running at 3.66 GHz. Generally speaking, clock speeds have trended down a bit over time for the top-bin SKUs in the Xeon processors because you can get more throughput out of a chip by adding cores than you can by cranking clocks within a set thermal envelope. (Which hasn’t changed all that much in a decade, to be fair.)
Over the past decade, a four-socket Xeon machine running online transaction processing workloads has increased its throughput by a factor of 44X, which means you can have a single machine do the work of five racks of machines from a decade ago – as hard as that might be to believe. Because there are more devices hitting applications more frequently, and generating ever-larger amounts of data to be processed, the workloads keep growing and that is why there is even modest single-digit revenue growth for Intel for four-socket Xeon E5 and Xeon E7 processors. (The performance of the machines in the chart above is gauged using Intel’s own Warehouse OLTP benchmark, which simulates the data processing of a wholesale company managing its warehouses – the ones with forklifts, not parallel databases. We gave you OLTP Warehouse benchmark ratings for all of the past three generations of Xeon E7 processors in our coverage of the launch of the Haswell-EX chips.) Back in 2005, a four-socket machine could process 150.7 transactions per minute on the test, and using the top-bin E7-8890 v3, a four-socket machine can handle 6,602 TPM today.
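As a quick check on that arithmetic, the scaling factor is simply the ratio of the two benchmark ratings quoted above. Here is a minimal sketch in Python that uses only those two figures; the variable and function names are ours, chosen for illustration:

```python
# Four-socket OLTP Warehouse ratings quoted in the text (transactions per minute).
TPM_2005_XEON_MP = 150.7
TPM_E7_8890_V3 = 6602.0


def scaling_factor(old_rating: float, new_rating: float) -> float:
    """Return how many times more throughput the new machine delivers."""
    return new_rating / old_rating


factor = scaling_factor(TPM_2005_XEON_MP, TPM_E7_8890_V3)
print(f"Throughput scaling factor: {factor:.1f}X")
```

The ratio comes out a shade under 44, which is the figure Intel rounds to.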
No one is suggesting that customers even have 44 database servers and will consolidate them down to one machine. But what they might do is allow those databases to house more information on a new Xeon E7 box and then add in-memory processing to those databases to do more real-time analytics against production data without adversely affecting the performance of OLTP workloads running on the machine. You don’t have to move to SAP HANA to see benefits of more processing and memory capacity.
The thing to remember about upgrading big iron is that it is not just the processor that is changing. Customers are upgrading their operating systems and other systems software like databases and middleware, as well as other elements of the system, including faster and more capacious main memory and the addition of (relatively) affordable flash storage. All of these together can radically improve the performance of the system, more than the relative performance of the Xeon E7 processors alone implies.
Take server virtualization, for example, and start out with a Xeon X7460 four-socket machine from 2010, as shown in this comparison below:
The initial machine above had four six-core Xeon X7460 processors with 192 GB of memory plus 36 15K RPM SAS disks on an external JBOD disk enclosure; it cost $31,000, according to Intel. This system could support six virtual machines and had a relative performance of 94.28 on Intel’s own virtualization benchmark suite, which runs a set of infrastructure workloads on the VMs. Just upgrading that machine from VMware’s ESXi 4.1 hypervisor to the more current ESXi 5.5 release gets about a 20 percent performance boost in terms of throughput while nearly doubling the VM count to eleven; this drops the cost per VM by nearly half.
Upgrading this system to a four-socket machine based on E7-8890 v3 processors and loading it up with 1 TB of main memory and some 15K RPM disks yields a machine that can support 21 VMs at a cost of $50,500. Moving from 1 Gb/sec to 10 Gb/sec networking allows for more VMs to be added to the box, in this case doubling up to 42 VMs with an aggregate of 678.8 units of performance on the virtualization throughput test. To double the performance again (and about double the price of the system) Intel added four dozen 64 GB and two dozen 100 GB SATA SSDs to the JBOD enclosures, yielding 74 VMs with an aggregate throughput rating of 1,327 on the VM benchmark. Finally, Intel activated Single Root I/O Virtualization (SR-IOV) on the 10 Gb/sec Ethernet cards, and it could put 90 VMs on the box and push performance up to 1,603 on the test. That is a 17X improvement in performance with the cost per VM dropping by nearly 80 percent.
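The steps in that upgrade path can be tallied with a few lines of Python. This is a rough sketch using the VM counts, ratings, and prices quoted above. The article gives no exact price for the final configuration, so the sketch assumes it is twice the $50,500 figure, reading "about double" literally and ignoring the cost of the 10 Gb/sec networking. That number is our assumption, not Intel’s, and the names are illustrative.

```python
# (stage, VM count, benchmark rating) as quoted in the text. A rating of None
# means the article gives no exact figure for that step.
STAGES = [
    ("Xeon X7460, ESXi 4.1", 6, 94.28),
    ("ESXi 5.5 upgrade", 11, None),
    ("E7-8890 v3, 1 TB memory", 21, None),
    ("10 Gb/sec networking", 42, 678.8),
    ("SATA SSDs added", 74, 1327.0),
    ("SR-IOV enabled", 90, 1603.0),
]

BASE_PRICE = 31_000  # baseline system price, according to Intel
E7_PRICE = 50_500    # E7-8890 v3 system with 1 TB of memory and disks
# ASSUMPTION: "about double the price" is taken literally as 2 x $50,500.
ASSUMED_FINAL_PRICE = 2 * E7_PRICE

base_vms, base_rating = STAGES[0][1], STAGES[0][2]
final_vms, final_rating = STAGES[-1][1], STAGES[-1][2]

# Per-stage VM density and, where a rating exists, performance versus baseline.
for name, vms, rating in STAGES:
    perf = f"{rating / base_rating:.1f}X" if rating is not None else "n/a"
    print(f"{name:<26} {vms:>3} VMs  {vms / base_vms:>5.1f}X VMs  perf {perf}")

final_speedup = final_rating / base_rating
base_cost_per_vm = BASE_PRICE / base_vms
final_cost_per_vm = ASSUMED_FINAL_PRICE / final_vms
cost_drop = 1 - final_cost_per_vm / base_cost_per_vm

print(f"Overall speedup: {final_speedup:.1f}X")
print(f"Cost per VM: ${base_cost_per_vm:,.0f} -> ${final_cost_per_vm:,.0f} "
      f"({cost_drop:.0%} lower, under the assumed final price)")
```

Under that pricing assumption, the cost per VM falls by a little under 80 percent, which is in line with the figure Intel cites.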
For in-memory workloads, a similar change can be accomplished using a mix of software and hardware upgrades. Take SAP HANA, for instance:
During its Xeon E7 v3 announcement, Intel said SAP HANA could be accelerated by 6X, but the chart above shows you how this was actually accomplished. The upgrade from HANA SP8 to SP9 accounts for an 80 percent boost all by itself. Upgrading from a 15-core Ivy Bridge Xeon E7 to an 18-core Haswell Xeon E7 boosts the performance by another 50 percent on top of that, and turning on the TSX transactional memory feature more than doubles the performance on top of that again to get to the 6X overall increase. The processor cores are the smallest part of the performance bump in that overall system upgrade.
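Because the three gains compound, they multiply rather than add, and the size of the TSX contribution can be backed out of the other numbers. Here is a small sketch in Python that assumes the 80 percent, 50 percent, and 6X figures are exact, when they are more likely rounded; the variable names are ours:

```python
# Gains quoted in the text, expressed as multipliers.
SP9_GAIN = 1.8      # HANA SP8 -> SP9: an 80 percent boost
CORES_GAIN = 1.5    # 15-core Ivy Bridge -> 18-core Haswell: another 50 percent
OVERALL_GAIN = 6.0  # Intel's headline figure for the full upgrade

# The stacked gains multiply, so the TSX multiplier is whatever is left over.
tsx_gain = OVERALL_GAIN / (SP9_GAIN * CORES_GAIN)

print(f"Software and cores combined: {SP9_GAIN * CORES_GAIN:.1f}X")
print(f"Implied TSX gain: {tsx_gain:.2f}X")
```

The implied TSX multiplier works out to a bit more than 2X, which squares with the "more than doubles" description and leaves the core-count bump as the smallest of the three factors.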
Up next, we will take a look at the relative performance and price/performance of Xeon E7 systems compared to the Power and Sparc alternatives in the big iron arena.