How IBM Stacks Up Power8 Against Xeon Servers
October 13, 2015 Timothy Prickett Morgan
Since putting together the OpenPower Foundation two years ago, IBM and its partners have been working to get lower-cost Power8 machines into the field to better compete against the ubiquitous Xeon server platform. With the Power Systems LC machines announced last week, the gap is closing, at least according to the configurations that IBM is stacking up and the performance benchmarks that it has run.
IBM has thus far kept its competitive analysis to itself and its reseller partners, but The Next Platform has gotten its hands on what Big Blue is saying. We have reached out to Intel to get its thoughts on the new Power Systems LC machines and its own competitive analysis and will let you know what we learn.
It would be good to see both “Broadwell” Xeon E3 v4 and “Haswell” Xeon E5 v3 machines tested and audited on a variety of benchmarks alongside Power8 machines by third parties. Customers should, as always, do precisely what Google is doing with Power8 machines, and that is run their own benchmarks using their own code. If IBM wants the Power platform to get 10 percent to 20 percent market share in the datacenter, as it has indicated is its goal, it is going to have to be a lot more aggressive about selling the architectural advantages and price competitiveness of its machines. The important thing is that IBM’s own competitive analysis shows that for certain kinds of workloads that are sensitive to memory bandwidth, a Power8-based machine is worthy of the hassle of a bake off, provided IBM will make machinery available for such tests.
As we detailed last week, IBM has launched a pair of new Power8 machines, code-named “Habanero” and “Firestone,” that have one or two Power8 sockets, respectively, and are positioned against Intel’s two-socket Xeon servers for data analytics, database, and HPC workloads, depending on the configuration. The Habanero system is called the Power Systems S812LC and was designed in conjunction with Tyan while the Firestone system is called the Power Systems S822LC and was made with the assistance of Wistron. (Both are server ODMs and both are hoping to get some leverage and potentially higher margins with Power-based servers.)
IBM has already been talking up the relative performance of Power8 systems compared to Xeon E5 machines with regard to Spark in-memory analytics workloads, as we discussed a month ago when IBM put out some benchmarks on the SparkBench suite of tests, which strain systems with a mix of streaming, SQL, machine learning, and graph analytics jobs. Back in June during the ISC 2015 supercomputing conference, IBM released some relative performance figures on a variety of HPC workloads pitting Haswells against the Power8s.
In its presentations, IBM says that the single-socket Habanero Power8 machine, with a maximum of 1 TB, has twice the main memory of a single-socket Xeon E5 machine in the Haswell generation and sixteen times that of a Broadwell Xeon E3. (IBM is using DDR3 memory, while the Xeons are using DDR4, which runs faster and cooler.) By IBM’s tests, it says that a Habanero machine with a single ten-core Power8 chip running at 2.92 GHz can deliver the same Spark performance at less than half the cost of a two-socket Xeon E5-2690 v3 machine, which has 24 cores running at 2.6 GHz. IBM compared the Power S812LC and a Hewlett-Packard DL380 system, and says that the Habanero delivers 2.3X better bang for the buck on Spark work.
The presentations that IBM put together for reseller partners say that this comparison was based on the ten tests in the SparkBench suite, and that across those tests, the IBM machine delivers 1.94X the performance of the Xeon E5 machine. Both machines were running Ubuntu Server 15.04, OpenJDK 1.8, and Spark 1.4.
To give a more general sense of how the Power S812LC stacks up against the Xeon E5 machines and its predecessor Power S812L systems, IBM put together this comparison based on SPECint_Rate integer benchmark tests:
As you can see, IBM has dialed back the performance on the LC variant of the single-socket Power8 machine, but it has dialed back the prices even further because of the change in processor and the move to standard DDR3 memory from its own custom memory. In the comparison above, IBM is assuming a 15 percent discount on the Power Systems LC hardware and a 20 percent discount on the software to get that total cost of acquisition (TCA) street price. The earlier Power Systems L and HP ProLiant systems have a 20 percent discount on both hardware and software to get that street price.
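The discounting behind those street prices is simple enough to sketch in a few lines of Python. This is only an illustration of the arithmetic described above: the discount percentages come from IBM's stated assumptions, but the function name and the list prices are hypothetical placeholders, not figures from IBM's presentations.

```python
def street_price(hw_list, sw_list, hw_discount_pct, sw_discount_pct):
    """Apply separate percentage discounts to hardware and software list prices."""
    hw = hw_list * (100 - hw_discount_pct) / 100
    sw = sw_list * (100 - sw_discount_pct) / 100
    return hw + sw

# Discount assumptions as described in the article.
LC_DISCOUNTS = {"hw_discount_pct": 15, "sw_discount_pct": 20}
L_AND_PROLIANT_DISCOUNTS = {"hw_discount_pct": 20, "sw_discount_pct": 20}

# HYPOTHETICAL list prices, chosen only to make the arithmetic visible.
EXAMPLE_HW_LIST = 10_000
EXAMPLE_SW_LIST = 2_000

lc_street = street_price(EXAMPLE_HW_LIST, EXAMPLE_SW_LIST, **LC_DISCOUNTS)
other_street = street_price(EXAMPLE_HW_LIST, EXAMPLE_SW_LIST, **L_AND_PROLIANT_DISCOUNTS)

print(f"Power Systems LC discounting:      ${lc_street:,.0f}")
print(f"Power Systems L / HP discounting:  ${other_street:,.0f}")
```

With identical list prices, the LC discounting leaves a somewhat higher street price than the 20 percent across-the-board discounting, so the lower cost of the LC machines has to come from lower list prices, not from a more generous discount.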
Workhorse To Workhorse
The real comparison that most companies will want to make is for two-socket machines, the workhorses of the datacenter.
On the Firestone Power S822LC machine, which has two sockets, IBM focused on relational database workloads. Big Blue tested a Firestone machine with two eight-core Power8 chips running at 3.6 GHz and with 256 GB of main memory against an HP ProLiant DL380 with two of the eighteen-core Xeon E5-2699 v3 chips. On the pgbench Postgres database test, the Power S822LC did around 33,000 transactions per second (TPS) per core, while the Xeon E5 did around 12,000 TPS per core. IBM wanted to talk about per core performance because that is where the spread is greatest, but if you work the math backwards, the Power8 Firestone machine could do about 528,000 TPS across the system and the ProLiant DL380 could do about 432,000 TPS.
Both of these machines had 256 GB of memory and ran Red Hat Enterprise Linux 7.1 on top of the KVM hypervisor and PostgreSQL 9.5 Alpha2. The way IBM prices the machines up, based on list prices, the Firestone machine offered about 40 percent better bang for the buck than the HP machine.
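For those who want to check the math, here is a rough Python sketch of how the per-core numbers roll up to system throughput. The per-core rates and core counts are the approximate figures quoted above. The implied price ratio at the end is our own back-of-the-envelope inference from the roughly 40 percent price/performance claim, not a number IBM supplied.

```python
# Approximate per-core pgbench results quoted in the article.
POWER8_TPS_PER_CORE = 33_000
XEON_TPS_PER_CORE = 12_000

POWER8_CORES = 2 * 8    # two eight-core Power8 chips
XEON_CORES = 2 * 18     # two eighteen-core Xeon E5-2699 v3 chips

def system_tps(tps_per_core, cores):
    """Scale a per-core transaction rate up to the whole machine."""
    return tps_per_core * cores

power8_total = system_tps(POWER8_TPS_PER_CORE, POWER8_CORES)
xeon_total = system_tps(XEON_TPS_PER_CORE, XEON_CORES)
throughput_ratio = power8_total / xeon_total

# ASSUMPTION: "about 40 percent better bang for the buck" is read as
# 1.4X the transactions per dollar. Dividing that out of the throughput
# ratio gives the implied ratio of Firestone price to ProLiant price.
PRICE_PERF_ADVANTAGE = 1.40
implied_price_ratio = throughput_ratio / PRICE_PERF_ADVANTAGE

print(f"Power S822LC: {power8_total:,} TPS")
print(f"ProLiant DL380: {xeon_total:,} TPS")
print(f"Throughput ratio: {throughput_ratio:.2f}X")
print(f"Implied price ratio: {implied_price_ratio:.2f}")
```

The per-core gap is nearly 3X, but the Xeon machine has more than twice the cores, so the system-level advantage shrinks to a little over 20 percent. Under the stated assumption, the rest of the price/performance gap would have to come from a lower list price on the Firestone configuration.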
We would point out that this eighteen-core chip is very expensive compared to other Xeon E5 chips and PostgreSQL may or may not be able to take advantage of all of those threads. It would be illustrative to see if a pair of twelve-core Power8 chips, with 192 threads, would do much better than the Power8 machine IBM tested with a total of sixteen cores and 128 threads. The Intel machine had 36 cores and 72 threads. As is the case with all software, performance on any given hardware configuration will be dependent on how the software is architected. Sometimes it can take advantage of the threads, cache, and memory, and sometimes it can’t. The trick is to figure out the optimal configuration for your own workload, and that takes time and money.
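The thread counts above follow from the eight threads per core on Power8 and the two threads per core on a Xeon with HyperThreading. A small Python sketch, using the system-level pgbench totals worked out earlier, shows why thread scaling matters to the comparison. The configuration labels and the per-thread breakdown are ours, added only for illustration.

```python
THREADS_PER_CORE_POWER8 = 8   # SMT8
THREADS_PER_CORE_XEON = 2     # HyperThreading

# (total cores, threads per core) for the two-socket configurations discussed.
configs = {
    "Power8 2 x 8-core (tested)": (16, THREADS_PER_CORE_POWER8),
    "Power8 2 x 12-core (not tested)": (24, THREADS_PER_CORE_POWER8),
    "Xeon E5-2699 v3 2 x 18-core (tested)": (36, THREADS_PER_CORE_XEON),
}

def hardware_threads(cores, threads_per_core):
    """Total hardware threads exposed to the operating system."""
    return cores * threads_per_core

threads = {name: hardware_threads(*cfg) for name, cfg in configs.items()}

# Approximate system-level pgbench throughput derived from the per-core figures.
system_tps = {
    "Power8 2 x 8-core (tested)": 528_000,
    "Xeon E5-2699 v3 2 x 18-core (tested)": 432_000,
}
tps_per_thread = {name: tps / threads[name] for name, tps in system_tps.items()}

for name, count in threads.items():
    print(f"{name}: {count} threads")
for name, rate in tps_per_thread.items():
    print(f"{name}: {rate:,.0f} TPS per thread")
```

Looked at per thread instead of per core, the Xeon comes out ahead, which is one more reason to test with your own workload and thread counts before drawing conclusions from either metric.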
It would be fun to see how a “Broadwell” Xeon D chip aimed at hyperscalers would do against a Power8 LC machine with the same workload. (Give a mouse a cookie, he wants a glass of milk. . . .)
IBM also ginned up some SPECint_Rate benchmark comparisons for its old and new two-socket Power8 machines and the HP ProLiant DL380. Take a look:
Again, IBM is gearing back the memory bandwidth by half on the Power Systems LC machine compared to the Power L machine, and the clock speeds on the Power8 chips are also dialed back a bit. Therefore, the SPEC integer performance is also lower. But the cost of the Firestone Power Systems S822LC machine is so much lower that it matches the bang for the buck of a Xeon E5 machine with two of those eighteen-core Xeon E5-2699 v3 processors. The discounting to get the street price is the same as on the comparisons for the Power Systems S812LC machine.