Even though the Xeon processor has become the default engine for most kinds of compute in the datacenter, it is by no means the only option available to large enterprises. Unlike hyperscalers, which must homogenize their systems to keep their IT costs in check, these enterprises can afford to indulge in different kinds of systems.
Sometimes, there are benefits to being smaller, and the ability to pick point solutions that are good for a specific job is one of them. This has been the hallmark of the high-end of computing since data processing systems were first developed many decades ago, and it continues to be the case with supercomputing and other exotic types of infrastructure at large and sophisticated enterprises.
It is with this in mind that we contemplate the new “Sonoma” S7 processor, which Oracle unveiled last summer and which is at the heart of the new Sparc S7 systems that made their initial debut in late June. Like other alternatives to the Xeon, the Sparc S7 processor has to demonstrate performance and value advantages compared to the Xeon – it is not sufficient to be compatible with prior Sparc processors and show price/performance improvements against those earlier generations of Sparcs. The Xeon processor so utterly dominates the modern datacenter and is such a safe choice that Sparc, Power, or ARM processors have to meet or beat it if they have any hope of getting traction.
According to the benchmarks that Oracle has put together for the Sparc S7 systems, these machines can compete effectively against modern Xeon E5 processors, particularly for workloads that require relatively brawny cores and high clock speeds and perhaps especially for software that is priced per core, as Oracle’s own database and middleware software is.
Given Oracle’s key business of peddling relational database software – it has over 310,000 customers worldwide using its eponymous database software – you would expect that the Sparc S7 processors aimed at two-socket machines and the M7 processors aimed at larger NUMA machines would be tricked out to accelerate databases and to offer very competitive performance. And according to the benchmark tests that Oracle has run, this is the case. But Oracle is also interested in running other workloads on the S7 systems, and has run benchmarks that show the machines to be competitive on Java application, analytics, and NoSQL tests.
Oracle’s desire is to position the S7 systems directly against Xeon systems for database workloads and to do more work with fewer cores, which plays into its strategy of lowering the cost of its software to help promote its hardware.
Oracle uses a processor core scaling factor to adjust its database pricing based on core counts and architecture. IBM Power and Intel Itanium processors pay full price per core, but modern Xeon E5 chips as well as the most recent Sparc T, S, and M series chips from Oracle have a 0.5 scaling factor on the core counts, which means they get a 50 percent discount for software licenses. (You can see the scaling factors, which were first introduced in 2009, at this link.) With the scaling factors being the same on the Xeon and Sparc S7 processors, the odds are even here, but compared to chips with brawnier cores, like the Power8 chip from IBM, Oracle gives its own S7 and M7 platforms a software pricing advantage because of the core scaling factor.
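To make the scaling factor concrete, here is a minimal Python sketch of how it turns core counts into license counts. The factors are the ones described above; the core counts are for the two-socket machines discussed in this article (two eight-core S7s, two 22-core Xeon E5-2699 v4s, and a two-socket Power8 box with twelve cores per socket). The function name and the rounding-up of fractional licenses are our own assumptions for illustration, not Oracle's published formula.

```python
import math

# Core scaling factors as described in the article text.
CORE_FACTOR = {
    "sparc_s7": 0.5,   # recent Sparc T, S, and M series
    "xeon_e5": 0.5,    # modern Xeon E5
    "power8": 1.0,     # IBM Power pays full price per core
}

def licenses_required(cores, chip):
    """Hypothetical helper: licensable units = cores x scaling factor.

    Rounding any fraction up is an assumption made for this sketch.
    """
    return math.ceil(cores * CORE_FACTOR[chip])

s7_licenses = licenses_required(16, "sparc_s7")    # 2 sockets x 8 cores
xeon_licenses = licenses_required(44, "xeon_e5")   # 2 sockets x 22 cores
power_licenses = licenses_required(24, "power8")   # 2 sockets x 12 cores

print(s7_licenses, xeon_licenses, power_licenses)
```

With equal factors, the license count tracks the core count directly, which is why a machine that does the same work with fewer cores carries a smaller database bill.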
In any event, Oracle’s tests show that its S7 cores can significantly surpass the performance of Intel’s “Broadwell” Xeon E5 cores, which made their debut at the end of March. In the comparison above, Oracle is pitting its two-socket Sparc S7-2 server, which has two eight-core S7 processors running at 4.27 GHz, against a two-socket ProLiant DL360 G9 server that has two of the top-bin Broadwell Xeon E5-2699 v4 processors, with 22 cores running at 2.2 GHz each. The Sparc S7 system was able to process 173,493 transactions per minute (TPM) per core using a mix of online transaction processing and analytics workloads, compared to 110,342 TPM per core for the ProLiant DL360 G9 machine.
This chart is a little subtle: you might think it shows a per-system benchmark result, but the figures are per core, and you have to multiply them out by the core count to get the overall OLTP throughput of the machine. (It also says “tpmC/core,” which suggests that it is an internally run variant of the TPC-C OLTP test, but you are not supposed to publish TPC tests that are not audited by the Transaction Processing Performance Council, even though vendors do it all the time.)
Whatever the OLTP test is, the Sparc S7-2 machine is able to do 2.77 million TPM compared to 4.85 million TPM for the ProLiant DL360 G9 system. We were not able to get official pricing on these configurations from Oracle, but assume these are heavily configured machines. A Sparc S7-2 machine with 1 TB of main memory and two 600 GB disks costs $49,282; this includes licenses to the Solaris variant of the Unix operating system. HPE does not allow configurations to be tweaked online anymore (you have to go through channel partners online and take the configurations they are offering unless you are an enterprise customer requiring special hand-holding, as far as we can tell), but if you price up a Dell PowerEdge R730 server with two of the top-bin Broadwell Xeons and 1 TB of memory, the machine has a list price of $38,398 not including an operating system. If you add in the most expensive license for Red Hat Enterprise Linux with unlimited virtualization to this machine, it would cost $43,117. If you do the math on that, it costs $17,800 per million TPM for the Sparc S7-2 machine and $8,900 per million TPM for the Xeon E5 machine. It is a bit of a wonder why Oracle brought up these results, but the idea is that the machines are comparably priced and that certain aspects of the workloads can be accelerated and that customers can run analytics and OLTP on the same systems, according to Marshall Choy, vice president of product management for systems at Oracle.
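The arithmetic behind those figures can be checked with a short Python sketch. The per-core throughput, core counts, and prices are the ones quoted above; note that the Xeon price is for the Dell PowerEdge R730 plus Red Hat Enterprise Linux, standing in for the ProLiant configuration that could not be priced. The dictionary layout and function names are ours.

```python
# Figures quoted in the article; structure and names are illustrative only.
systems = {
    "Sparc S7-2": {"tpm_per_core": 173_493, "cores": 16, "price": 49_282},
    "Xeon E5-2699 v4 x2": {"tpm_per_core": 110_342, "cores": 44, "price": 43_117},
}

def system_tpm(s):
    """Whole-machine throughput: per-core TPM times the core count."""
    return s["tpm_per_core"] * s["cores"]

def dollars_per_million_tpm(s):
    """System price divided by throughput expressed in millions of TPM."""
    return s["price"] / (system_tpm(s) / 1_000_000)

for name, s in systems.items():
    print(f"{name}: {system_tpm(s):,} TPM, "
          f"${dollars_per_million_tpm(s):,.0f} per million TPM")

# Per-core advantage of the S7 over the Broadwell Xeon core.
per_core_ratio = (systems["Sparc S7-2"]["tpm_per_core"]
                  / systems["Xeon E5-2699 v4 x2"]["tpm_per_core"])
print(f"S7 per-core advantage: {per_core_ratio:.2f}x")
```

Using the unrounded throughput, the results land slightly below the rounded figures in the text, but they round to the same $17,800 and $8,900 per million TPM. This covers hardware and operating system only; database licenses priced per core are not included.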
When it comes to Java applications, which are a key aspect of Oracle’s business, the company is again stressing the per-core performance of the S7 processor compared to the latest Xeon E5s, and it is using the SPECjbb2015 test to line its systems up against Intel’s partners. Take a look:
The numbers in this chart have been slightly tweaked and you can see a comparison of the SPECjbb2015 results for Sparc S7 and various Xeon machines at this link. The interesting bit is comparing the maximum throughput on the SPECjbb2015 test to the critical performance, which is when service level agreements and security are imposed on the benchmark. Look at this table:
This table shows the performance of the SPECjbb2015 test with multiple JVMs on a single system (there are other ways to run the test). The important thing here is how the performance of the S7-2 system compares to the Power8, M7, and Xeon E5 v4 machines. The Xeon machines still beat out the S7, but the S7s cost a lot less than the M7s, too. It would be useful if all vendors had to submit pricing for the SPEC tests, but they don’t, so making price/performance comparisons is difficult. Suffice it to say, with the S7s, Oracle can reduce the core count compared to its prior generations of one-socket or two-socket systems and with aggressive pricing it can compete against other RISC iron and even X86 machinery. Choy tells The Next Platform that customers right now are doing proofs of concept to replace Xeon iron with S7 systems.
The following chart explains why better than the OLTP and Java benchmarks outlined above:
In this comparison, Choy is showing how many Xeon servers using twelve-core processors are needed to deliver 2.8 million Java operations per second on a SPECjbb2015 test compared to delivering the same performance using the Sparc S7-2 systems with eight-core S7 chips. The S7 setup has fewer servers and fewer cores to deliver the same workload, and by Oracle’s calculations, including operating systems and three years of support, costs a lot less. To be specific, the S7 setup has 42 percent fewer cores and 47 percent lower hardware costs, and would also have 42 percent lower software costs using the Oracle software stack, too, since it is priced on a per-core basis. Switching to higher bin Xeons would cut the system count on the X86 stack, but it would not change the core count appreciably.
“We are really driving the efficiency message here, obviously, in that customers can reduce that footprint,” says Choy. “They can reduce that total cost of ownership for the hardware, but really, it is the licensing at the end of the day that is going to make a big difference in the customer’s wallet.”
It is not clear what Java application test is used in the comparison above, but it is not the SPECjbb2015 test given the performance that Oracle is showing for the Sparc iron.
On the SPECjEnterprise2010 benchmark, the S7s also hold their own against alternatives on a per-core basis, but again, Intel is cramming more cores into a box with its top-bin parts to boost the aggregate throughput per machine:
In the table above, the IBM Power Systems S824 has only two processors, at twelve cores each, not the four processors listed by Oracle. (Although technically, the S824 is using two six-core chips in a single socket. The point is that it is a two-socket machine, not a four-socket one as you might think from the labeling.) You can see why Oracle is emphasizing per core performance with the Sparc S7 machines.
For in-memory processing, which makes use of the database acceleration (DAX) units on the Sparc S7 cores to speed up certain functions, Oracle is showing off a dramatic reduction in footprint. Running a 1 TB database in memory using the Oracle 12c database, which can compress it down to 120 GB of physical memory spread across a cluster of machines, requires ten Sparc S7 server nodes and a total of 160 cores. To get the same level of performance using a Xeon cluster would take 40 machines using the twelve-core Xeon E5 processors, according to Oracle. (Using top-bin Xeon parts would cut the node count in half, but would not change the core count appreciably.) Oracle did not reveal pricing in this comparison, but assuming it is consistent with the pricing on the Java machines above, the HPE DL360 G9 rack shown above would cost around $1.43 million and the Sparc S7 machinery would cost around $220,000. So the price/performance advantage would go to Oracle in this case, although by how much would depend on the discounting on the Oracle database running on the clusters above.
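The footprint comparison comes down to a few multiplications, sketched here in Python. The node counts and estimated prices are from the paragraph above; the cores per node assume two-socket machines throughout (two eight-core S7s, two twelve-core Xeons, or two 22-core top-bin Xeons), which is our reading of the configurations and not something Oracle spells out.

```python
# (nodes, cores per node); two-socket nodes are an assumption of this sketch.
clusters = {
    "Sparc S7": (10, 2 * 8),
    "Xeon, twelve-core": (40, 2 * 12),
    "Xeon, top-bin 22-core": (20, 2 * 22),
}

total_cores = {name: nodes * cores for name, (nodes, cores) in clusters.items()}

# Share of cores eliminated by moving from the twelve-core Xeon cluster to S7.
core_reduction = 1 - total_cores["Sparc S7"] / total_cores["Xeon, twelve-core"]

# Estimated hardware prices quoted in the article.
xeon_price, s7_price = 1_430_000, 220_000
price_ratio = xeon_price / s7_price

for name, cores in total_cores.items():
    print(f"{name}: {cores} cores")
print(f"Core reduction with S7: {core_reduction:.0%}")
print(f"Xeon rack costs {price_ratio:.1f}x the S7 hardware")
```

On these assumptions, halving the Xeon node count with top-bin parts trims the core count only modestly, so the per-core software bill stays roughly where it was.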
Oracle is also showing off the performance of the S7 machines on AES encryption, Yahoo cloud serving with a NoSQL datastore, and real-time analytics and OLTP. The latter benchmark puts the Sparc S7 machine within spitting distance of a Xeon system using two eighteen-core “Haswell” E5-2699 v3 processors on OLTP workloads – 195,790 transactions per second for the S7-2 machine versus 216,302 for the Xeon E5 machine. But the Sparc S7-2 is also able to process 107 queries per minute at the same time as running the OLTP workload, compared to 47 queries per minute for the Xeon E5 system. Oracle is, predictably, going to focus on the ability to do transaction processing and analytics on the same machine at the same time as a key differentiator.
The message from us when looking at all of these numbers is the same as you will always hear from us: These metrics and measures are a good place to start, but you always have to do your own benchmarking and your own economic analysis based on your own applications and the configurations needed to run them well. None of this is ever as cut and dried as it looks in the benchmarks.
Regarding your comment: “It is a bit of a wonder why Oracle brought up these results” related to the tpmC/core chart (which I believe is based on the open source HammerDB test, which uses a TPC-C workload), I think the subtle point is that regardless of the cost comparison of the two HW systems (which is quite negligible actually), a 1.6x advantage in tpmC/core translates to a 1.6x advantage in performance per Oracle Database license. Consider that the list price of an Oracle license is $47,500, or ~$23,750 per license even with a 50% discount. A SPARC S7-2 system with 16 cores requires an Intel system with 67% more cores (~26 Xeon E5-2699 v4 cores @ 2.2 GHz) to compete, and therefore an additional 10 cores, which comes to 5 licenses x $23,750, or $118,750 in license cost savings PER SYSTEM, which easily pays off when multiple systems are needed. So the x86 system could be given to the customer for free and would still cost more to run Oracle DB than the equivalent SPARC S7-2 system.
To be more precise in my language, it is a wonder that Oracle doesn’t show this in its charts, that it doesn’t add in the software costs and the hardware costs to show the cost per unit of work is actually lower once software costs are fully burdened. Per core performance is interesting in its own right, mind you, but people buy systems, not cores, and databases can use threads very well so more cores is not a bad thing unless the vendor prices based on a core. If the pricing were based on a socket, or if we are talking about open source code, this is a different story entirely. And, the extra hardware cost Oracle is charging is, in effect, a software tax. The point is, show the full cost of the systems at a detailed level, including system throughput and latency on the tests.
I wonder why Oracle is not comparing 8-core s7-2 with 8-core E5-2667V4..
Oracle published a broad range of benchmark results on the SPARC S7 that can be seen here: https://blogs.oracle.com/BestPerf/ however, there are only SPEC CPU results on the E5-2667 V4, so it is not really possible to do any real comparisons. SPEC CPU doesn’t test database, MW, Java, in-memory analytics, or big data, nor does it really represent any realistic, current workload.
Igor you are right. It is the main point, and it makes this comparison useless. Most databases don’t use so many cores, and few companies have the necessary cash to buy this amount of licenses.
Just to clarify this sentence – >
“(Although technically, the S824 is using two six-core chips in a single socket. The point is that it is a two-socket machine, not a four-socket one as you might think from the labeling.)”
The 4, which you are taking as a label for four sockets, does not refer to sockets; it refers to the height of the server. So in this case 4 means that it is a 4U server. For example, Model S822 is a 2-socket 2U server, and S824 is a 2-socket 4U server.
With the per core price of the Oracle database license, few people would use the E5-2699v4 CPU @ 2.2 GHz, versus the E5-2667v4 CPU @ 3.2 GHz. Rather ironic how the benchmarks are normalized to per core measures, while they chose a low-frequency, max core count CPU.
The S7 CPU setup is 2×8 core, and should be compared to 2×8 core E5-v4.
Most database systems are not good at spreading a single query over more than one core. Often in real life I have seen systems where users are waiting minutes for results that require sequential running of a large database query, yet there are only tens of power users on the system. Therefore per-core speed is important.
So, from all the results it seems that the POWER8 is the best performing, right?
It looks that way to me, especially on the per core basis in the last chart. The Power8 also has NVLink and the new Power9 is 14 nm, which should increase speed and reduce electrical load.