While the Power8 processor has been available from IBM since April 2014, the chips were only rolled out in Big Blue’s biggest iron in October of that year and the company only ramped up its largest Power8 machines, the Power Systems E880, to their fully extended NUMA configurations last May. Now, IBM is rolling out new processing options for its biggest Power Systems iron, and at the same time it is doubling up the maximum memory on the top-end machines.
The changes are designed to give IBM’s largest server customers some different price/performance options on its Power E880 systems. They should also help the company chase more opportunities with both the smaller Power E870 and the Power E880 running SAP HANA and Oracle 12c in-memory databases, as well as its own DB2 BLU in-memory extensions, on both AIX and Linux versions of the systems.
The Power E870 and E880 machines occupy the same upper-echelon niche of the NUMA systems market as various machines based on Intel’s “Haswell” Xeon E7 v3 processors – including the “DragonHawk” Superdome X from Hewlett-Packard Enterprise and the UV 300 from SGI – as well as the Sparc M6 and M7 series from Oracle and the Sparc64 machines from Fujitsu. These are all NUMA machines that offer more scalability than is possible with standard two-socket and four-socket Xeon E5 and Xeon E7 machines based on Intel chipsets, which utterly dominate the datacenters of the world but which are not necessarily the best way to scale up performance – particularly among enterprise customers who do not control their own system software stacks, but rather buy such software.
We detailed the high-end Power E870 and E880 machines, which are the largest Power-based systems that IBM sells and which arguably far exceed the raw performance of its System z mainframes, back in May last year, when the Power E880 machine had its NUMA clustering extended to four nodes and a twelve-core Power8 chip running at just a hair over 4 GHz. IBM had previously launched a Power8 chip with eight cores running at 4.35 GHz for the Power E880 machines, with the initial scalability capped at two nodes when they shipped at the end of 2014, with the three-node and four-node options available in June 2015. We also revealed the feeds and speeds and market opportunity for IBM’s four-socket Power E850 servers, which are aimed squarely at stock four-socket Xeon E7 machines based on Intel chipsets. Here is how the various scale-up Power Systems machines stacked up before the new processor was added to the Power E880:
These high-end Power Systems are designed to span the wide range of performance that IBM offers from its two-socket Power 750 midrange machines, which supported both Power7 and Power7+ processors, all the way up to its 32-socket Power 795 machine, which was getting a little long in the tooth only supporting the Power7 chips, which debuted in February 2010. (The Power7+ chips came out in October 2012 and are socket compatible with the Power7 chips.)
All of the Power8 scale-up machines from IBM have the same basic node construction, with four of the twelve-core Power8 single chip modules (SCMs) providing the compute. Each of these sockets has NUMA clustering on the die, and IBM can link up to sixteen sockets together gluelessly – four of them linked on the local node through the backplane and the nodes get cross-coupled with fiber optic ports. IBM has its own buffered memory modules, based on its own “Centaur” memory buffer chip and employing DDR3 memory technology with the Power8 machines. (IBM could upgrade the main memory to DDR4 chips with the Power8+ processors coming out later this year, but we would guess that IBM will save that transition for the Power9 chip in its midrange and high-end servers.)
The Power8 E-Class server node also has eight PCI-Express 3.0 x16 peripheral slots, a number of which can be configured with its Coherent Accelerator Processor Interface (CAPI) for speeding up accesses between the Power8 processor complex and external flash or compute peripherals. All of this iron gets crammed into a 5U chassis, including power supplies and cooling fans.
In general, IBM offers a much skinnier processor lineup for its Power Systems customers than Intel offers with its Xeon E5s and Xeon E7s – usually just a handful of options with different core counts, clock speeds, and cache sizes. The Power8 chips with the most cores activated have lower clock speeds and are designed for overall system throughput, while the ones with the highest clock speeds are designed for higher single-threaded performance and to minimize software license costs for customers whose software suppliers use core-based pricing.
“What we were seeing from clients is an interest in a balanced performance and capacity point at a reasonable price,” Steve Sibley, director of worldwide product management for IBM’s Power Systems line, explains to The Next Platform. “If you think about the levels we now have with the Power E880, we have a very high performance 4.35 GHz chip to lower software costs with great throughput on the system, or if they were sitting on a Power 780 or Power 795 with fewer than 128 cores, it is just a great platform for them. The twelve-core Power8 running at 4 GHz goes all the way up to 192 cores, and is really there for the companies that are pushing the envelope in terms of scale. We have a certain number of clients who need everything we can give them. This new ten-core Power8 running at 4.19 GHz scales to 160 cores in a Power E880, gives an in-between point that is still more than our biggest Power 795 – about 14 percent more – but is at a little lower cost than the 192-core system.” That 192-core Power E880 offers about 50 percent more oomph than the top-end Power 795, previously the most capacious shared memory system offered by Big Blue.
The ten-core Power8 chip running at 4.19 GHz is available on the smaller, two-node Power E870 system, and some customers have been asking for this option.
The way IBM sells the Power E880, you buy each node fully populated with four processor cards and then you use capacity-on-demand features to activate individual cores. The four processor cards using the ten-core 4.19 GHz Power8 chips cost $195,049, plus another $9,057 per core for activations. Do the math, and activating all of those 40 cores on the node costs $557,329, or $13,933 per core. On the Power E880 using the eight-core Power8 chips running at 4.35 GHz, four cards with a total of 32 cores cost $156,039 and processor activations cost $9,057 each. That works out to $445,863 for the processor cards and core activations, for a cost of $13,933 again per core. (The faster cores do more work, obviously, so they are in a sense a better deal and offer lower software costs for per-core priced software.) With the twelve-core Power8 chips running at 4 GHz, the processor cards for the Power E880 cost $245,966 and activating cores costs the same $9,057 per core, for a total cost of $680,702, or $14,181 per core. What doesn’t make sense to us is why the core activation fee does not scale roughly with performance. But hardware pricing for all vendors can be a bit of a mystery.
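For readers who want to check the per-node math, here is a minimal Python sketch that recomputes it from the list prices quoted above. The names (`CARD_SETS`, `node_cost`, `NODE_COSTS`) are our own illustrative choices and not anything from IBM's price list, and the sketch assumes every core on the node is activated.

```python
# List prices quoted in the article; every core on the node is assumed active.
ACTIVATION_PER_CORE = 9_057  # dollars per activated core, same for all three chips

# Hypothetical lookup table: label -> (cores per four-card node, price of the four cards)
CARD_SETS = {
    "8-core 4.35 GHz": (32, 156_039),
    "10-core 4.19 GHz": (40, 195_049),
    "12-core 4 GHz": (48, 245_966),
}


def node_cost(cores, cards_price, activation=ACTIVATION_PER_CORE):
    """Return (total dollars, dollars per core) for a fully activated node."""
    total = cards_price + cores * activation
    return total, total / cores


NODE_COSTS = {
    label: node_cost(cores, price) for label, (cores, price) in CARD_SETS.items()
}

for label, (total, per_core) in NODE_COSTS.items():
    print(f"{label}: ${total:,} per node, ${per_core:,.0f} per core")
```

Because the activation fee is flat, the only thing that moves the per-core figure is the price of the cards divided by the core count, which is why the eight-core and ten-core options land on the same number.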
For all of its E-Class Power8 machines, IBM offers hourly, daily, and monthly activations for processing capacity that is prorated based on the acquisition cost, so customers can rent some extra compute for a nominal fee when expected spikes happen, such as during end of week, end of month, and end of year processing for big ERP or sales systems.
The extra memory bump that IBM is also announcing is aimed squarely at enterprise workloads running on databases with in-memory features, and is designed to give the Power E870s and Power E880s an advantage over their Xeon E7, Sparc M, and Sparc64 competition. The new memory cards use the same Centaur memory buffer chip, but the card now has a whopping 256 GB capacity compared to the 128 GB capacity of the original Power8 high end machines. That yields 2 TB of memory per socket, which means 16 TB maximum for the Power E870 and 32 TB maximum for the Power E880. IBM could move to DDR4 memory with the Power8+ chip coming out later this year, but given the high cost of main memory and the desire by customers to protect their investment and stretch it out over a couple of years, we don’t think that will happen on the Power E850, Power E870, and Power E880 machines. (IBM could transition to DDR4 for its entry, scale-out systems, and the generic memory controller on the Power8 chip and the Centaur memory buffer chip are both capable of supporting DDR4 memory.)
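The capacity ceilings follow directly from the card size and the node counts, as this rough Python sketch shows. The eight-cards-per-socket figure is our inference from 2 TB per socket divided by 256 GB per card, not a number IBM states here, and the constant and function names are purely illustrative.

```python
CDIMM_GB = 256          # capacity of the new memory card
CDIMMS_PER_SOCKET = 8   # assumption: inferred from 2 TB per socket / 256 GB per card
SOCKETS_PER_NODE = 4    # four Power8 SCMs per node
MAX_NODES = {"Power E870": 2, "Power E880": 4}


def max_memory_gb(nodes):
    """Maximum main memory in GB for a machine with the given number of nodes."""
    return nodes * SOCKETS_PER_NODE * CDIMMS_PER_SOCKET * CDIMM_GB


# Convert to TB using 1 TB = 1,024 GB.
MAX_MEMORY_TB = {
    machine: max_memory_gb(nodes) // 1024 for machine, nodes in MAX_NODES.items()
}

for machine, terabytes in MAX_MEMORY_TB.items():
    print(f"{machine}: {terabytes} TB maximum")
```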
The new 256 GB CDIMM memory card from IBM is sold in bundles of four for $64,240, and eight of these bundles (which is enough to fully populate one four-socket node) cost $513,920. As with processors, once you buy the memory cards, you have to pay per gigabyte for activating the memory on the machines. (You can scale it up and down as you see fit, permanently or temporarily activating it.) Do the math, and turning on all of the memory in a Power8 node with four processors costs $156 per GB, which at 8 TB works out to $1.28 million at list price. Per node. You can see why customers might want to preserve that investment.
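The all-in memory figure is easy to sanity-check. This short Python sketch takes the $156 per GB number quoted above as given, treats 1 TB as 1,024 GB (which is what makes the $1.28 million figure come out), and confirms that eight four-card bundles do fill a node. The variable names are ours, for illustration only.

```python
CDIMM_GB = 256           # capacity of one memory card
CARDS_PER_BUNDLE = 4     # cards are sold in bundles of four
BUNDLES_PER_NODE = 8     # enough to fully populate one four-socket node
ALL_IN_PER_GB = 156      # dollars per GB with all memory turned on, as quoted above

# Capacity of a fully populated node in GB.
node_gb = CDIMM_GB * CARDS_PER_BUNDLE * BUNDLES_PER_NODE

# All-in list price of that memory, in dollars.
node_memory_cost = node_gb * ALL_IN_PER_GB

print(f"{node_gb:,} GB per node ({node_gb // 1024} TB)")
print(f"${node_memory_cost:,} per node, or about ${node_memory_cost / 1e6:.2f} million")
```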
By the way, IBM is charging the same price per capacity for its 256 GB CDIMM memory cards for the Power8 machines as it is charging for the 128 GB CDIMMs. In the broader server memory market, the cost of 64 GB and now 128 GB DDR4 DIMMs is considerably higher than for 16 GB and 32 GB DIMMs. Moreover, the memory prices are lower than what IBM is charging, of course, because the DDR4 memory sticks sold in most Xeon servers are much higher volume products. This is one reason why the OpenPower partners have tweaked their designs to put the Centaur memory buffers on the motherboard rather than the memory stick, allowing them to use more standard – and less expensive if less dense – main memory in their systems.
IBM does not provide financial results for its Power Systems division separately from its System z mainframes, so we don’t have much of a handle on how large this business is. But when we suggested that the Power E870 and Power E880 machines might represent something between 5 percent and 10 percent of revenues, depending on the quarter and where IBM is at with its Power Systems product cycle, Sibley said this was not a bad estimate. Ditto for our guess that revenues for these large systems were somewhere north of a third and somewhere south of a half of overall Power Systems revenues.
The point is, these big iron machines represent a pretty big portion of overall Power Systems sales, and an even larger portion of profits from this product line. Scale-out machines like the new low-end Power Systems LC models announced last October may be growing fast, but the top-end boxes are growing, too, Sibley did confirm, and largely because of in-memory processing for HANA, Oracle, and DB2 databases. There are some customers who are consolidating various AIX and IBM i workloads, too, who have four or five machines, each with a few hundred logical partitions, so don’t think all of the growth for big Power iron is coming from big databases, though.