Are ARM Virtualization Woes Overstated?

Nicole Hemsoth Prickett

8 years ago

As we have seen with gathering force, ARM is making a solid bid for datacenters of the future. However, a key requirement for many server farms looking to exploit the energy efficiency benefits of 64-bit ARM is the ability to maintain performance in a virtualized environment.

Neither X86 nor ARM was built with virtualization in mind, which meant an uphill battle for Intel to build hardware support for hypervisors into its chips. VMware led the charge here beginning in the late 1990s, and over time, Intel made it its business to support several different hypervisors. For ARM, the challenges of virtualization across several hypervisors are still relatively new (compared to Intel’s early start in this area), but there has been a great deal of work over the last few years to bring its virtualization performance in line with X86.

According to Christoffer Dall, an early pioneer in virtualization research with an emphasis on the ARM architecture, everything you’ve heard about inferior virtualized performance on ARM should be taken with a grain of salt, especially when it comes to KVM. While there is some validity to KVM’s reputation on ARM as lackluster, this is not a generalizable truth–it is more nuanced, at least as some benchmarks have shown.

ARM builds its virtualization support into hardware, as X86 does, but the two approaches are architected quite differently. “ARM virtualization support for running hypervisors is designed to favor one hypervisor over another and ARM clearly favors the Xen way of doing things. It’s almost like it was built for Xen specifically,” he says. Getting KVM to work on ARM was a major hurdle, Dall adds, and even with the additions for virtualization support in ARMv8, one still has to compare performance across hypervisors.

With that said, Dall says many people hold the vague opinion that ARM doesn’t perform as well with hypervisors as Intel. Recent research into these differences, however, makes clear that there are some obvious bottlenecks that can be, and have been, broken. The biggest is exiting a virtual machine (a common operation), which is where much of the performance overhead has been incurred. Depending on the architecture and hypervisor combination, this transition can be as much as 4X faster or slower, and at scale it can lead to massive traffic jams performance-wise when such transitions are a frequent part of a workload.

Dall is now at Columbia University, where he is continuing his research focus following time at VMware; he is also now the technical lead for Linaro, an ARM industry group. He is the original author and current maintainer of KVM on ARM as well. In the absence of any objective analysis of real performance of ARM versus X86 on various hypervisors, Dall and his Columbia team have published benchmark and other results to highlight how ARM is not as far behind on the virtualization support front as popular opinion might suggest.

The team ran micro-benchmarks and application benchmarks on both Xen and KVM on X86 and ARM using the University of Utah’s CloudLab (leveraging 64-bit ARM based HP Moonshot m400 nodes) and a wide variety of X86 nodes. The results are based on ARM CPUs and 2.1 GHz Xeon E5-2450 CPUs with similar RAM, disk, and network configurations. The benchmarks were designed to measure that all-important VM exit operation, or the time that must be spent outside the VM and not running the workload inside it. Using a custom Linux kernel driver, which ran in the VM under both hypervisors on both architectures, the seven benchmarks above produced the results shown.

Among the team’s peer-reviewed findings (noteworthy given Dall’s investment in KVM on ARM) is that the overall hypercall overhead, the bottleneck-inducing transition, cost 6,500 cycles for KVM on ARM but only 376 cycles for Xen on ARM. Before that raises too many eyebrows, remember that there are new improvements to the ARM architecture, including the Virtualization Host Extensions (VHE), which might allow KVM (and similarly built hypervisors) to lower that cost.
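To put those two cycle counts in perspective, here is a rough back-of-the-envelope sketch. Only the 6,500 and 376 cycle figures come from the findings above; the clock rate and the hypercall rates are hypothetical values chosen purely for illustration, and the function names are our own. It shows the ratio between the two costs and how much of a core's time such transitions would consume at different, assumed, call rates.

```python
# Hypercall costs reported above, in CPU cycles.
KVM_ARM_HYPERCALL_CYCLES = 6_500
XEN_ARM_HYPERCALL_CYCLES = 376


def cost_ratio(slow_cycles: int, fast_cycles: int) -> float:
    """Return how many times more expensive the slower transition is."""
    return slow_cycles / fast_cycles


def overhead_fraction(cycles_per_call: int,
                      calls_per_second: float,
                      clock_hz: float) -> float:
    """Return the fraction of one core's cycles spent in transitions."""
    return cycles_per_call * calls_per_second / clock_hz


if __name__ == "__main__":
    # ASSUMPTION: a 2.0 GHz clock, picked for round numbers only.
    assumed_clock_hz = 2.0e9

    ratio = cost_ratio(KVM_ARM_HYPERCALL_CYCLES, XEN_ARM_HYPERCALL_CYCLES)
    print(f"KVM vs. Xen hypercall cost on ARM: {ratio:.1f}x")

    # ASSUMPTION: hypothetical hypercall rates, not measured values.
    for rate in (1_000, 10_000, 100_000):
        kvm = overhead_fraction(KVM_ARM_HYPERCALL_CYCLES, rate, assumed_clock_hz)
        xen = overhead_fraction(XEN_ARM_HYPERCALL_CYCLES, rate, assumed_clock_hz)
        print(f"{rate:>7} calls/s: KVM {kvm:.2%} of a core, Xen {xen:.2%}")
```

Under these assumed numbers, the per-call gap is large, but the share of total CPU time it represents depends entirely on how often a workload actually makes such transitions, which is one reason a micro-benchmark result need not predict application-level behavior.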

“ARM can be four times faster than Intel for Xen but up to four times slower for KVM, which is extreme, or that is the intuitive premise everyone had been describing for ARM hypervisor performance,” Dall says. “We have concluded that in the real world, with actual application benchmarks, it doesn’t look like this at all. In these benchmarks, KVM on ARM turns out to be faster than Xen on ARM. Overall ARM and X86 are on par with each other in terms of virtualization overhead.”