AMD Says EPYC Is A More Universal Hybrid Cloud Substrate Than Arm

SPONSORED FEATURE  The hyperscalers and big cloud builders have their own technical and political reasons for designing proprietary Arm server chips and having them built by Taiwan Semiconductor Manufacturing Co. But that does not necessarily mean these Arm CPUs are the best choice for cloud customers, according to AMD, which says this is particularly true for those pursuing a hybrid cloud strategy to mitigate risk and foment competition while also avoiding vendor lock-in.

We sat down with Madhu Rangarajan, corporate vice president of product management, planning, and marketing for the Server Solutions Group at AMD, to take on what the company sees as some of the myths and misconceptions about Arm-based processors in the cloud.

Rangarajan is no stranger to the server market, having spent more than a decade at server maker Dell Technologies as a firmware engineer, more than a decade after that as an architect and senior principal engineer at Intel for server chipsets and cloud platforms, and then as vice president of products at Arm server CPU upstart Ampere Computing, which is in the process of being acquired by Japanese conglomerate SoftBank for $6.5 billion.

It has taken the better part of a decade and a half for Arm to get an appreciable share of the server CPU market, but ironically the way it has happened has created as many problems as it has solved, says Rangarajan.

The first problem with Arm instances, according to Rangarajan, is one of fragmentation. Arm implementations vary significantly across the cloud builders, and that creates a fragmentation challenge that doesn't exist with cloud instances based on AMD EPYC CPUs. Different Arm processors use different ISA versions and different Neoverse core types (plain vanilla N series or vector-enhanced V series), and they also differ in cache hierarchies and frequency targets.

This diversity forces customers to optimize applications for specific Arm implementations, eliminating the universal substrate advantage that X86 provides and that Arm could provide if there were just one server CPU design.
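To make the fragmentation point concrete, here is a minimal sketch of what a build pipeline ends up maintaining when each cloud's Arm CPU wants its own tuning target, versus a single x86-64 microarchitecture level. The instance-to-core mapping is illustrative rather than exhaustive, though the GCC flags shown are real; whether a given fleet actually needs this granularity depends on its workloads.

```python
# Illustrative sketch of the per-implementation tuning burden described above.
# The instance families listed are examples, not a complete inventory; the
# GCC -mcpu/-march flags are real flags for the cores named.

# Cloud Arm instances are built on different Neoverse cores, so a build that
# wants peak performance carries one set of tuning flags per target:
ARM_TUNING = {
    "graviton2": "-mcpu=neoverse-n1",  # N-series core
    "graviton3": "-mcpu=neoverse-v1",  # V-series core, SVE
    "graviton4": "-mcpu=neoverse-v2",  # V-series core, SVE2
}

# An x86-64 fleet can instead target a single microarchitecture level
# (for example x86-64-v3) that runs unchanged across clouds and on-prem:
X86_TUNING = "-march=x86-64-v3"

def cflags(instance_family: str) -> str:
    """Return the tuning flag for a given instance family (illustrative)."""
    return ARM_TUNING.get(instance_family, X86_TUNING)
```

Every entry in the Arm table is another binary to build, test, and validate; the x86 path collapses them into one.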

“What’s the cost of fragmenting the fleet?” Rangarajan asks rhetorically. “Do you have a multi-cloud strategy, and if you go down that path – if you don’t have a multi-cloud strategy – what does that do to your pricing and costs over the long term? What about on-prem hybrid cloud strategies? Does this enable a hybrid strategy? So essentially, those are the other factors to think about cost beyond just instance cost. Instance cost ends up being a fairly small percentage of the overall budget.”

Moreover, aside from Ampere Computing, there really is no enterprise-grade Arm CPU for the on-premises datacenter – although Nvidia is certainly selling its "Grace" CG100 CPUs as host processors in AI clusters – and those Grace chips are very different from the homegrown Arm CPUs that the cloud builders have created.

The net result, says Rangarajan, is that application and systems software porting, testing, and validation expenses can be a lot higher in multi-cloud and hybrid cloud environments, and these factors often outweigh initial pricing advantages that Arm instances have compared to AMD EPYC processor instances.

Another big issue is that enterprises still have a slew of applications running on Microsoft's Windows Server platform, and the Arm instances do not support Windows Server, only various flavors of Linux.

By contrast, says AMD, all of the big clouds as well as the hyperscalers have AMD EPYC processors in their fleets, in similar configurations, and the major OEMs and ODMs can build machines using AMD EPYC CPUs for on-premises use. That allows for a universal computing substrate spanning all hybrid cloud architectures while reducing the complexity of compute across on-premises and cloud infrastructure.

The issue is more nuanced than costs, of course, and to learn more about how AMD thinks EPYC CPUs are better than a cadre of Arm chips, you have to listen in. We hope you enjoy the conversation.


5 Comments

  1. Interesting interview! I guess the Phoronix benchmarks referred to were these (from the end of last year), for EPYC vs Neoverse N and Neoverse V:

    https://www.phoronix.com/review/amd-epyc-9965-ampereone
    https://www.phoronix.com/review/nvidia-grace-epyc-turin

    The work that AMD has done to improve performance per watt is impressive and so is reaching 5 GHz clocks. As noted in the interview, this was surely motivated by increasing competition from ARM and possibly others, which is how we hope tech keeps evolving, away from monopolies and cartels.

    I think the AMD/x86 advantage (at present) comes from its broader software ecosystem and better standardization that (among others) helps prevent lock-in into one cloud environment or another, making workload migration easier when needed, with predictable performance. I guess ARM will get there too, in time, but AMD/x86 has the advantage there atm, imho.

  2. I love El Capitan’s MI300As (Top500 #1) with their modular chiplet design and ability to execute both complex branching kung-fu code and linear-algebraic matrix-vector karate. I hope we get to see more displays of this sort of exciting action-packed computing going forward (and rodeos!)!

  3. I understand why you have sponsored articles, but please include more than marketing fluff. Maybe some cost & performance numbers? All this porting / testing / validation FUD is not relevant to most programmers these days – they’re using Python / Java / Javascript. I’ve had no issues moving between x86 and ARM, even the C++ extensions to Python compile and run on MacOS or Linux.
