Arm-based servers have had a somewhat checkered history, with many abortive attempts to challenge the X86 processor hegemony, but Arm appears bullish about its chances in the high performance computing (HPC) sector, where it believes its licensing model and the energy efficiency of its architecture give it an edge.
Speaking at an HPC community event hosted by Dell, Arm’s senior director for the HPC business, Brent Gorda, said that the company is “really driving hard in the HPC community” and highlighted its partnerships with companies such as Nvidia, SiPearl, and Fujitsu to develop Arm-based silicon for HPC and AI applications.
In fact, Gorda pointed out, Arm has already secured a place at the very top of the HPC industry with the “Fugaku” supercomputer at RIKEN in Japan, which has been ranked the fastest in the world with its 152,064 48-core Fujitsu A64FX processors.
However, Fujitsu followed the path of acquiring an architecture license from Arm, which meant that it was able to design and manufacture its own custom Arm-compatible processor pretty much to suit its own requirements. This meant the addition of 512-bit Scalable Vector Extension (SVE) units to support the kind of calculations Fujitsu had in mind, plus its own Tofu D high-speed interconnect.
But few HPC sites can design their own chip from scratch. Fortunately, Arm’s business model also lets partners take a ready-made core design and add custom modules to it, Gorda explained.
“There’s something called a core license whereby you can license Arm Neoverse, which is our IP. And that gives you the core building blocks, the logic itself, around which you customize and build the chip that you want to build,” he said.
Surrounding all this is the Arm ServerReady compliance program, which certifies that a specific chip meets compatibility requirements for the Arm server ecosystem.
“Once you pass this certification, the software world is available to you. It guarantees functionality for the software, and you can then pay for supported OS releases like Red Hat.”
This ability to customize the chip for a specific application or set of applications is where Arm has an advantage, Gorda claimed, especially given where HPC and AI appear to be heading. Customers can take the Arm core engine plus the on-chip network, and add custom accelerators for their target workload.
“Bill Dally from Nvidia will say you can get three orders of magnitude performance improvement by putting custom gates down on your silicon chip. That plays exactly to where Arm is going,” he said. “Everybody’s got an idea for an accelerator. And if you know your workload well enough, you can optimize that and just get crazy good performance. And in fact, that’s the reason why the A64FX is so good. They took ten years, they studied the ten or twelve applications that they had, and they nailed it. The processor came out and it just completely nailed the applications that the Japanese wanted on their system.”
Arm launched its Neoverse effort back in 2018 to target datacenter infrastructure rather than the mobile device market. The Neoverse lineup was expanded last year and now comprises three families of processor designs: the V series, which emphasizes performance; the N series, which is focused on scale-out applications such as cloud infrastructure; and the E series, which is targeted more at edge applications.
SiPearl, the company involved with the European Processor Initiative (EPI) project, is using the Neoverse V1 design, Gorda disclosed. Meanwhile, the N1 design has been used in the “Quicksilver” and “Mystique” Altra server chips from Ampere Computing, the startup founded by former Intel executive Renée James. Amazon’s Graviton2 chip that powers some AWS EC2 instances uses the N1 core, and the Graviton3 uses the V1 core. Neoverse V series cores also apparently feature in Nvidia’s planned “Grace” chip aimed at supercomputing, and in a server chip being developed by South Korea’s Electronics and Telecommunications Research Institute (ETRI).
One of the issues that has hindered Arm in the server market is software support, with many key software packages developed for X86 processor platforms. When asked if all the pieces are now in place to deploy HPC on Arm, Gorda said that in general, the answer is yes.
“The place where you will find some softness is that, while I believe it’s accurate to say all of the ISVs have an Arm port in progress, not all of the ISVs are publicly supporting Arm in silicon just yet. So if you’re dependent on ISV licenses and software, you will have to poll your ISV to understand the status of things,” he explained.
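For sites weighing that advice, a few quick checks can reveal how Arm-ready a given stack is. The commands below are a generic sketch, not something from the article; they assume a Linux host with the standard `file` utility, and the cross-compiler shown is only relevant if the `aarch64-linux-gnu-gcc` toolchain happens to be installed.

```shell
# Illustrative checks when assessing Arm (aarch64) readiness of a stack.

# Report the host architecture: aarch64 on a 64-bit Arm server, x86_64 otherwise
uname -m

# Inspect an installed binary to see which architecture it was built for
file /bin/ls

# On an x86 host, a C source can be cross-compiled for aarch64 if the
# cross toolchain is present (commented out since it may not be installed):
# aarch64-linux-gnu-gcc -O2 -o hello_arm hello.c
```

The same `file` check is a fast way to audit a directory of vendor-supplied binaries before committing to an Arm deployment.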
However, Gorda cited the example of the Lustre parallel file system, widely used in HPC environments. There has been Arm support for the Lustre client for many years, but “there are very few Arm-based storage servers,” Gorda said, and so Lustre server components for Arm are not currently supported by Whamcloud, the division within DDN that oversees Lustre development. This is something Arm is trying to address, he added.
Gorda also pointed out that Arm acquired Allinea Software, a leading provider of software tools for HPC, about five years ago, in order to bolster Arm’s HPC software ecosystem support.
Another factor in Arm’s favor is greater power efficiency, according to Gorda. This is something that may become more important as supercomputers expand into exascale territory and ongoing energy costs become a greater concern for HPC operators. Although Arm’s Neoverse V architecture emphasizes performance rather than power efficiency, the chips based on it still consume less energy than rival X86 processors, according to Gorda.
“The X86 guarantee is that you can run a 286 binary on it, and all of that historical legacy of being a CISC architecture with a RISC underlying it calls for a whole lot of logic up front in decode, reordering, fixing up instructions. All of that is overhead that goes into the chip and consumes energy,” he said. In contrast, you can think of Arm as a clean sheet of paper, to some extent.
Gorda also claimed that end users no longer care what silicon their software is running on, citing the adoption of Arm-powered cloud servers by the likes of AWS.
“There’s new big players in town that get to control the architecture. And the things they care about are different than what historically has been cared about. They care about the cost, they care about the energy consumption, they care about turnaround time, and the software stack running on top of things,” he said.
“If you take a look at what Amazon’s doing with the Graviton2, they talk about it being 40 per cent cheaper. From an end user’s perspective, they don’t care what the silicon is, they care that it’s 40 per cent cheaper, and that the turnaround time is on par with what they’re used to.”
Earl Joseph, CEO of HPC analyst firm Hyperion Research, said that he expects to see high growth of HPC servers based on Arm processors over the next several years.
“We expect to see a five-year growth rate of over 31 percent, while the base market moves at around 7 percent to 8 percent,” Joseph said. That would equate to Arm-based systems accounting for about 10 percent of the HPC market by 2025, he added.
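As a sanity check on those figures, the compound growth works out as follows. The starting share below is an assumption chosen for illustration, not a Hyperion Research number: if Arm-based HPC revenue grows at 31 percent a year while the overall market grows at roughly 7.5 percent, Arm's share of the market rises by a factor of about 2.7 over five years.

```python
# Illustrative compound-growth check: does 31% annual Arm growth against
# a ~7.5% market plausibly yield ~10% share after five years?
# The starting share is an assumption, not a Hyperion figure.

arm_growth = 0.31      # projected annual growth of Arm-based HPC revenue
market_growth = 0.075  # projected annual growth of the overall HPC market
start_share = 0.037    # assumed current Arm share of HPC revenue (illustrative)

share = start_share
for _ in range(5):
    # Share changes by the ratio of the two growth factors each year
    share *= (1 + arm_growth) / (1 + market_growth)

print(f"Arm share after five years: {share:.1%}")
```

Working backwards, a roughly 3.5 to 4 percent share today is what the projected growth rates would require to hit the 10 percent figure by 2025.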
However, Joseph also cautioned that the revenue numbers can be misleading, because massive supercomputer projects can skew the figures, as the nearly $1 billion Fugaku system did in 2020.
The market can thus shift dramatically due to such large individual installations, and Hyperion Research said it anticipates two European exascale machines based on Arm processors in 2025.
Many forthcoming HPC systems are expected to feature a mix of processor types, including both Arm and X86, he added.