Making HCI Hay While the Sun Shines

Whenever enterprise server buyers are shopping, they are not just comparing the possible options on the market today against each other. That is important, and we will consider this for a particular workload in the enterprise as we pit AMD EPYC processors against Intel Xeon SP processors. But equally important is the comparison of the new machinery to the older iron in the server fleet that is getting long in the tooth after four or five years of faithful service – sometimes even longer.

With compute capacity needs always on the rise, the strongest economic arguments are often those that show how much server consolidation can be done now on machines that were installed way back then with their own good economic arguments in their own time. Times change, and so do the price/performance curves. And this time, the curves for cloud, HPC, and now enterprise workloads are most definitely bending in favor of the AMD “Rome” second generation EPYC processors.

Before getting into the feeds and speeds, slots and watts, and dollars and sense of the AMD EPYC processors in the enterprise, it is helpful to recall the natural progression of any new – and renewed – architecture in the datacenter.

Those organizations with the most pressing needs are the ones who go first, and back in the early 2000s, it was HPC centers and then the early hyperscalers who were the strongest buyers of AMD Opteron server processors. Both tended to use what was essentially custom machinery for their workloads, and having control of their hardware manufacturing (to a certain extent for HPC centers and to a large extent for hyperscalers and early public clouds) and their software stacks allowed them to be first movers. No one would call the enterprise sector as a whole a fast follower, but enterprises with large scale and continual budget pressures tend to watch what the hyperscalers and HPC centers do and, these days, what the cloud builders do as well.

Like it or not, it always takes time for the original equipment manufacturers to case out the much more diverse needs of their hundreds of thousands of customers and their hundreds to thousands of workloads. This support matrix across the enterprise is broader and deeper, and if anything slows down the adoption of any new technology, it is the need to pick vendors and server configurations and then qualify a cornucopia of applications for the new platforms. This is not something that takes weeks, but many months at many large organizations. Once the task is done, then buying more of the same technology or a follow-on is easier – particularly, in the case of processors, if they are socket compatible with existing platforms. This is why we see X86 servers that can have two or sometimes three generations of processors in them. No enterprise can afford to go through this qualification process too many times in terms of either time or money.

But at some point, the risks and the rewards of moving to a new technology tip out of balance: the risks become relatively small, the rewards become tremendous, and the enterprise market quickly pivots to adopt the new technology broadly. This is how the Opteron attained around 25 percent shipment market share in the HPC centers of the world and then in the enterprise in the early to middle 2000s, and it is also how AMD is getting traction among enterprises right now against current “Cascade Lake” Xeon SP systems and how it will continue to compete against future “Ice Lake” Xeon SPs coming down the pike.

“If you go to that notion of risk in the enterprise, I think our take is that it is risky not to deploy EPYC,” Ram Peddibhotla, corporate vice president of product management for datacenter products at AMD, tells The Next Platform. “When you look at the competitive dynamics of performance per dollar and performance per user value that EPYC brings, it is just an incredible tail wind that enterprises can give themselves by adopting EPYC – and not doing so is where the risk actually is.”

Peddibhotla is no stranger to the enterprise, or to the rise of new technologies. He spent 18 years at Intel, managing its Linux vendor relationships during the dot-com boom and then became director of Intel’s entire open source business until the Great Recession, when he began to focus on hyperscalers and cloud builders for six years – years of their most aggressive expansion, we would point out. After that, in 2015, Peddibhotla moved to Qualcomm to drive its Arm server CPU effort (which it spiked because it had other, much bigger issues) and came to AMD last year to drive its datacenter products and ecosystem.

That brings us to the killer enterprise application of hyperconverged infrastructure, or HCI. With hyperconverged infrastructure, a cluster of systems has both virtualized compute and virtualized storage area networking (SAN) block storage running in concert on the same cluster, obviating the need for distinct, separate, and more expensive physical SAN appliances that were popular for two decades in the enterprise. HCI eliminates all kinds of costs compared to buying big iron servers and big SANs linked to each other by Fibre Channel switches. HCI delivers comparable performance while reducing complexity and providing a lot more flexibility in scaling both compute and storage. It is no wonder, then, that HCI is a key workload for the enterprise and is also a place where AMD and its server partners providing HCI solutions or appliances are seeing solid uptake.

AMD and its HCI partners are chasing two different classes of enterprise customers. To show how AMD EPYC stacks up against Intel Xeon SPs, they are using VMware’s vSphere server virtualization stack with its ESXi hypervisor, its vSAN virtual storage, and VMware’s VMmark 3.1 benchmark test. One comparison optimizes for the absolute highest density of VMs, using the high-end EPYC 7742, and the other optimizes for the lowest possible total cost of ownership per VM.

Let’s look at the low TCO optimization first:

In the comparison above, each Dell PowerEdge R6525 in the VMmark test, which you can see here, is equipped with a pair of EPYC 7F72 processors, each with 24 cores running at a base clock speed of 3.2 GHz, and with 1 TB of main memory. This baby cluster, which is equipped with switches and a management server, supported 266 VMs and achieved a VMmark score of 13.7 across 14 tiles.

The Intel system that AMD compares against is a Unified Compute Platform HC system from Hitachi Vantara, composed of four Advanced Server DS120 systems, each with a pair of Intel “Cascade Lake” Xeon SP 8276L Platinum processors, each with 28 cores running at a base 2.2 GHz. Each node has 384 GB of main memory, which seems a bit skinny. In any event, the four-node cluster of Cascade Lake servers achieves a VMmark 3.1 score of 9.0 across 9 tiles with a total of 171 VMs.

So the AMD EPYC setup above had a 47 percent higher VMmark 3.1 score and supported 56 percent more VMs than the Intel Xeon SP setup. What we don’t know is the relative price/performance of the two stacks of iron. We do know that the AMD machines were heavier on memory, and we also know that, at list price in 1,000-unit trays, the EPYC 7F72 costs $2,450 a pop and the Xeon SP 8276L costs $8,179 a pop. The odds favor a very substantial price/performance advantage for AMD here, but VMware does not require pricing for VMmark submissions and it is onerous to try to get pricing for these precise configurations. So, we have to infer this based on the substantial difference in processor prices and performance between AMD and Intel. (We should not have to infer anything. Pricing information should always be a part of a system benchmark, and VMware can do a better job by adding pricing information to VMmark.)
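To put a rough number on that inference, here is a minimal Python sketch that uses only the two list prices and the 47 percent score uplift quoted above. It assumes, hypothetically, that both clusters use the same number of CPU sockets, and it ignores memory, storage, networking, and software licensing entirely, so treat the result as a CPU-only indication rather than a system-level price/performance figure. The variable names are ours, not AMD's or VMware's.

```python
# CPU list prices in 1,000-unit trays, as quoted in the article (US dollars).
EPYC_7F72_PRICE = 2_450
XEON_8276L_PRICE = 8_179

# VMmark 3.1 score uplift of the EPYC cluster over the Xeon SP cluster,
# taken as the 47 percent figure quoted in the article.
QUOTED_SCORE_UPLIFT = 1.47

# How much more one Xeon SP 8276L costs than one EPYC 7F72.
price_ratio = XEON_8276L_PRICE / EPYC_7F72_PRICE

# ASSUMPTION: equal socket counts in both clusters. Under that assumption,
# CPU dollars per unit of VMmark score differ by price ratio times uplift.
cpu_price_perf_gap = price_ratio * QUOTED_SCORE_UPLIFT

print(f"Xeon SP 8276L costs {price_ratio:.2f}X the EPYC 7F72 at list price")
print(f"CPU-only price/performance gap: about {cpu_price_perf_gap:.1f}X")
```

Even if real street pricing and the extra memory in the AMD nodes close part of that gap, the processor line item alone leaves a lot of headroom.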

The next comparison that AMD makes takes density up another notch. Presumably because these are much beefier machines, they are also more costly, and the cost per VM is slightly higher. The processors are more expensive in this comparison, but look at how a four-node EPYC machine can stand up to a four-node Xeon SP machine – and an eight-node one as well.

In this VM density comparison – density means the number of VMs on the clustered system, not how many physical servers are packed into a space in a rack as we typically think of that term – the Dell PowerEdge R6525 systems were configured as a quad of two-socket server sleds, each equipped with a pair of 64-core EPYC 7742 processors spinning at 2.25 GHz and 2 TB of main memory. This set of four nodes running the VMware stack plus vSAN had a VMmark rating of 24.08 and ran across 28 tiles, supporting 532 VMs. (You can see the benchmark test result at this link.)
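Because density here is counted in VMs rather than rack units, it can help to break the headline number down. The short Python sketch below does that arithmetic for the EPYC 7742 cluster using only the figures in the paragraph above; the function and key names are illustrative, not part of VMmark.

```python
def vm_density(vms, tiles, nodes, sockets_per_node, cores_per_socket):
    """Break a cluster-level VM count down per tile, per node, and per core."""
    total_cores = nodes * sockets_per_node * cores_per_socket
    return {
        "vms_per_tile": vms / tiles,
        "vms_per_node": vms / nodes,
        "vms_per_core": vms / total_cores,
    }

# Four two-socket Dell PowerEdge R6525 nodes with 64-core EPYC 7742 processors,
# running 532 VMs across 28 tiles, as reported in the article.
epyc_7742_cluster = vm_density(
    vms=532, tiles=28, nodes=4, sockets_per_node=2, cores_per_socket=64
)

for name, value in epyc_7742_cluster.items():
    print(f"{name}: {value:.2f}")
```

That works out to 19 VMs per tile, 133 VMs per node, and slightly more than one VM per physical core.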

This system was put up against a Synergy 480 Gen10 machine tested by Hewlett Packard Enterprise running the VMware HCI stack. It was tested way back in November 2018, admittedly, using 18-core Xeon SP 6140 Gold processors, which spin at 2.3 GHz. This machine, which you can see the results for here, had eight nodes and was able to get a VMmark of 15.25, supporting 16 tiles and 304 VMs. What is interesting is that they also tested a 16-node configuration of VMware’s HCI stack, which you can see here, that was able to double up the VMmark performance to 30.13 across 32 tiles, and a 24-node cluster of Synergy 480 Gen10 systems could push up to a VMmark of 46.10 across 48 tiles, which you can see here. This only makes sense if you can use the less expensive Xeon SP 6000 series processors and if you don’t care about server sprawl, which customers definitely do care about because of the high capex and opex costs associated with it.

If you went with the high-end (but not top bin) Cascade Lake Xeon SP 8276L Platinum processors used in the first comparison that AMD put together, we estimate that you would need at least 11 nodes to match the performance of the four nodes that AMD needed. That is 2.75X as many nodes, which is to say a factor of 2.75X less density than AMD EPYC processors can offer. And the future “Ice Lake” Xeon SPs, looming for launch later this year, are expected to stay stuck at 28 cores for the top-bin part (we shall see). Even with expected IPC improvements on the order of 20 percent compared to Cascade Lake machines, that still only drops it down to a little more than nine nodes compared to the four nodes for the Dell R6525 setup.
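The node-count estimate above can be reproduced with a few lines of arithmetic. The Python sketch below is one way to get there, assuming VMmark scores scale linearly with node count (the three Synergy results suggest that is roughly true) and using the four-node Xeon SP 8276L score of 9.0 from the first comparison. The 20 percent IPC uplift for Ice Lake is the article's expectation, not a measured result, and the helper names are ours.

```python
import math

def score_per_node(score, nodes):
    """VMmark score contributed by each node, assuming linear scaling."""
    return score / nodes

def nodes_to_match(target_score, ref_score, ref_nodes):
    """Reference-cluster nodes needed to reach target_score (linear scaling)."""
    return target_score / score_per_node(ref_score, ref_nodes)

# (VMmark score, node count) pairs as reported in the article.
EPYC_7742 = (24.08, 4)
XEON_8276L = (9.0, 4)
SYNERGY_6140 = [(15.25, 8), (30.13, 16), (46.10, 24)]

# Sanity check on the linear-scaling assumption: the per-node score stays
# close to constant as the Synergy cluster grows from 8 to 24 nodes.
synergy_per_node = [round(score_per_node(s, n), 2) for s, n in SYNERGY_6140]

# Cascade Lake nodes needed to match the four-node EPYC 7742 cluster.
raw_nodes = nodes_to_match(EPYC_7742[0], *XEON_8276L)
whole_nodes = math.ceil(raw_nodes)           # you cannot buy a partial node
node_ratio = whole_nodes / EPYC_7742[1]

# ASSUMPTION: Ice Lake keeps 28 cores and gains 20 percent IPC.
ICE_LAKE_IPC_UPLIFT = 1.20
ice_lake_nodes = whole_nodes / ICE_LAKE_IPC_UPLIFT

print(f"Synergy per-node scores: {synergy_per_node}")
print(f"Cascade Lake nodes needed: {raw_nodes:.2f}, rounded up to {whole_nodes}")
print(f"Node ratio versus EPYC: {node_ratio:.2f}X")
print(f"Ice Lake estimate: {ice_lake_nodes:.2f} nodes")
```

Rounding up to whole nodes before applying the IPC uplift is what yields the "little more than nine nodes" figure; applying the uplift to the unrounded 10.7 nodes would land just under nine.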

This, of course, compares current servers with current servers, more or less. If you go back to the “Haswell” and “Broadwell” Xeon E5 and maybe even some “Skylake” Xeon SP machines, the consolidation ratio advantage that the AMD “Rome” EPYC processor systems can offer to HCI customers is even more compelling. The datacenter will have lots of room for future expansion once this old iron is taken out – that’s for sure.

The sun is shining in “Rome,” and this is a great time for AMD’s partners to make some HCI hay. And based on the strength of AMD’s performance in “Naples” and its expected consistent performance in “Milan” and in “Genoa,” the HCI sun will be shining there, too.
