The idea of ARM processors being used in datacenter servers has been kicking around for most of the decade. The low-power architecture dominates the mobile world of smartphones and tablets as well as embedded IoT devices, and with datacenters consuming ever more power and generating ever more heat, the idea of using highly efficient ARM chips in IT infrastructure systems gained steam.
That was furthered by the rise of cloud computing environments and hyperscale datacenters, which can be packed with tens of thousands of small servers running massive numbers of workloads. The thought of using ARM-based server chips that are more energy-efficient than their Intel Xeon counterparts to run all these servers was enticing.
But despite all the talk, ARM’s path into the datacenter has been bumpy. Calxeda was early to the party, but ran out of money and had to shut its doors. Others, such as Samsung and – it looks like – Broadcom (following the massive $37 billion merger with Avago), have pulled back on plans to manufacture ARM server chips. Broadcom was expected to get its “Vulcan” chip into the market by 2015. Those that have pushed out ARM chips – including AMD, Applied Micro and Cavium – have not seen widespread adoption of their products.
As we’ve discussed, Qualcomm seems to be the most aggressive in its intent to bring out an ARM server chip that can compete with Xeon processors and chip away at Intel’s dominance in the space (Intel holds about 97 percent of the server chip market). When ARM officials first began talking about putting their architecture into datacenter servers, they were eyeing an Intel that was struggling to reduce the power consumption of its x86 chips. ARM seemed a natural alternative.
However, the playing field has changed in recent years. Intel has improved the energy efficiency of its Xeons, and other players – in particular IBM, with its OpenPower effort, and AMD, with its upcoming x86 Zen chips – are also working to become another option for businesses that are looking for a second source of silicon, not only to drive down prices and fuel innovation through competition, but also to protect themselves in the event of supply chain problems.
Still, ARM officials have seen some momentum behind their efforts. Fujitsu last year announced it was ditching the SPARC architecture in favor of 64-bit ARMv8-A SoCs for the next generation of its K supercomputer, which is the seventh-fastest system in the world, according to the Top500 list. The goal is to improve the performance-per-watt of the new system. Once operational, the Post-K supercomputer will be an exascale system 100 times faster than the current system. More recently, Qualcomm this month announced a joint venture with China’s Guizhou province named Huaxintong Semiconductor Technology, which is developing an ARM-based server chip for the Chinese market. In addition, the Mont-Blanc Project in Europe is working with Cavium and system-maker Bull, which is owned by Atos, to build a prototype exascale computer using Cavium’s ThunderX2 ARM-based SoCs.
The rise of mobile and cloud computing and the growth of infrastructure-as-a-service (IaaS) also hold out hope for energy-efficient architectures like ARM. It is at this intersection in the rapidly evolving IT landscape that two researchers from India recently tested the 64-bit ARM architecture against an x86 chip in running data analytics workloads. Jayanth Kalyanasundaram and Yogesh Simmhan from the Department of Computational and Data Sciences at the Indian Institute of Science in Bangalore ran tests pitting a server powered by AMD’s year-old ARM-based A1170 SoC against one based on the chipmaker’s x86-based Opteron 3380. While there have been numerous studies of 32-bit ARM processors for various workloads, there had been no research around ARM64 chips and how they handle cloud-based applications, particularly big data workloads, the researchers wrote in their work titled “ARM Wrestling with Big Data: A Study of ARM64 and x64 Servers for Data Intensive Workloads.”
“Since energy consumption by servers forms the major fraction of the operational cost for cloud data centers, ARM64 with its lower energy footprint and server-grade memory addressing has started to become a viable platform for servers hosted by Cloud providers,” Kalyanasundaram and Simmhan wrote. “This is particularly compelling given that scale-out (rather than scale-up) workloads are common to Cloud applications, and the growing trend of containerization as opposed to virtualization.”
The two researchers used a SoftIron Overdrive 3000 server powered by an eight-core, 2GHz A1170 chip with 16GB of RAM, a 1TB Seagate Barracuda HDD with a 64MB cache and Gigabit Ethernet. The system ran an openSUSE Linux distribution and a BTRFS file system. The x86 system was a single cluster node with the 2.6GHz, eight-core Opteron 3380 processor. The server had a similar configuration: 16GB of RAM, a 256GB SSD for the operating system partition, the same Seagate 1TB HDD and Gigabit Ethernet. It ran the CentOS 7 Linux distribution, with the EXT4 file system for the SSD and BTRFS for the HDD. Both systems used the OpenJDK v7 compiled for x64 and ran Hadoop v2.7.3 in pseudo-distributed mode.
The tests used Intel’s HiBench Big Data benchmark suite, running a variety of workloads that ranged from web search and Hive queries to machine learning and reducer parallelism tuning. The researchers also analyzed the energy efficiency of each system while running the various benchmarks.
The detailed findings can be found in the study, but the results are encouraging for cloud-based players considering ARM-based systems for their environments and for chip vendors that are developing ARM-based SoCs and hoping the cloud will give them another avenue into the datacenter server market. According to the researchers, there was comparable performance between the two servers when running integer-based workloads and jobs with smaller floating-point sizes. The ARM server was dinged when running larger floating-point applications due to its slower floating-point unit (FPU) coprocessor. However, “with tuning Hadoop to expose data parallelism, the ARM64 server can come close to the performance of the x64 server, which is limited by having a faster FPU shared by pairs of cores,” they wrote.
As for energy efficiency, the ARM server had a base power load three times smaller than that of the x86 system, with a similar reduction in load when running the big data workloads. The ARM server showed similar benefits in the energy-delay product (EDP) – a metric that combines compute performance and power efficiency – with a 50 to 71 percent advantage over the x64 system. The two researchers plan to expand the study to include a better understanding of disk IO performance and a deeper dive into the relative performances of the FPUs, as well as the impact of containerization and virtualization and how the systems run other big data workloads for stream processing and graph analytics.
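For readers unfamiliar with the metric, EDP is conventionally computed as the energy a job consumes multiplied by the time it takes, so a lower value is better. The short Python sketch below illustrates that arithmetic. The `Run` class, the `edp_advantage` helper and the wattage and runtime figures are all hypothetical placeholders for illustration; they are not taken from the study, which may define or measure the metric somewhat differently.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One benchmark run on one server (hypothetical structure)."""
    avg_power_watts: float   # mean power draw during the run
    runtime_seconds: float   # wall-clock time to finish the job

    @property
    def energy_joules(self) -> float:
        # Energy = average power x time
        return self.avg_power_watts * self.runtime_seconds

    @property
    def edp(self) -> float:
        # Energy-delay product = energy x time (joule-seconds); lower is better
        return self.energy_joules * self.runtime_seconds


def edp_advantage(candidate: Run, baseline: Run) -> float:
    """Fractional EDP reduction of candidate relative to baseline."""
    return 1.0 - candidate.edp / baseline.edp


# Placeholder numbers for illustration only, NOT measurements from the study.
arm = Run(avg_power_watts=30.0, runtime_seconds=120.0)
x64 = Run(avg_power_watts=90.0, runtime_seconds=100.0)
print(f"ARM EDP advantage: {edp_advantage(arm, x64):.0%}")
```

Because runtime enters the product twice (once through energy and once as the delay term), a server that draws far less power can still lose on EDP if it is much slower, which is why the metric is useful for comparing a low-power chip against a faster one.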
The ARM architecture will continue to find it a challenge to become the preferred alternative to Intel in the datacenter. The best opportunity it had was several years ago, and the competitive field has grown more crowded since then. But studies like the one done by Kalyanasundaram and Simmhan will give cloud providers and hyperscale companies reasons to consider the architecture.