Arm Smooths the Path for Porting HPC Apps

One of the arguments Intel officials and others have made against Arm’s push to get its silicon designs into the datacenter has been the burden it would mean for enterprises and organizations in the HPC field that would have to modify application codes to get their software to run on the Arm architecture.

For HPC organizations, that would mean moving the applications from the Intel-based and IBM systems that have dominated the space for years, a time-consuming and possibly costly process.

Arm officials over the years have acknowledged the challenge, but have noted their infrastructure’s embrace of open-source software and the work of such parties as Linaro, whose aims include pushing the development of tools for the Arm architecture. They also have pointed out the continued development of systems-on-a-chip (SoCs) based on the 64-bit Armv8-A architecture from the likes of Cavium and Applied Micro, and now mobile chip giant Qualcomm, which just launched its 48-core, 10-nanometer Centriq 2400 SoC family that is aimed at hyperscalers and cloud providers. Having Qualcomm in the fold gives Arm and its datacenter efforts a company with the scale and resources to challenge Intel and IBM.

We at The Next Platform have followed Arm’s courting of HPC and the community’s interest in bringing the power-efficient Arm architecture into the fold, from the SoC design and acceptance by hardware makers to the application development work done with ecosystem partners. HPC organizations face growing demand for capacity and efficiency as they handle workloads ranging from machine learning and artificial intelligence to compute-intensive scientific simulations, and as they continue the push to exascale computing. Officials with Arm and its partners see an opportunity there for the Arm architecture. Cray will build the Arm-based Isambard supercomputer for a new HPC service based in the United Kingdom that will serve as a platform for scientific research, and Fujitsu last year announced it will use Armv8-based processors in the next generation of its K supercomputer, replacing the SPARC64 chips that power the current system at the Riken Advanced Institute for Computational Science in Japan.

So the interest is there, and the porting of workloads continues to be a key focus. Arm made a significant step in addressing the issue in late 2016 when it bought Allinea Software, which built tools for application development, debugging, and analysis for HPC systems. The acquisition followed other moves by Arm in the HPC space, from the unveiling of the Armv8-A Scalable Vector Extension to the release of its Performance Libraries and support of OpenHPC.

The chip designer this week followed that up with the launch of the Arm Allinea Studio, a collection of Arm-specific compilers and libraries that expands the suite of tools that HPC organizations can use for building and porting their applications to the Arm architecture. Company officials said they also will support the continued development of other Allinea tools, including Forge, DDT and MAP, for hardware-specific optimization across HPC and server platforms. The Allinea Studio offering should help address key questions HPC organizations have had about making the move to Arm, according to Patrick Wohlschlegel, senior engineering manager at Arm.

“In recent months, the HPC market has been waiting to see how Arm will drive innovation for High Performance Computing,” Wohlschlegel said this week. “Arm and its partners have been working hard to enable a greater variety of competitive hardware solutions, providing the innovation and technology choice that’s so desperately needed to solve next generation scientific problems and workloads. With Arm-based infrastructure hardware emerging from key partners such as Cavium and Qualcomm, the final step is making sure that the journey to the Arm architecture has a viable and straightforward porting path for users’ applications.”

With the Arm Allinea Studio, the “cross-platform interoperability of the Allinea debugger and profiler (fully supported on most HPC systems and architectures) is essential for making a transition to Arm as smooth as possible. As an HPC developer, you can keep using the tools you have come to know, trust and rely on for years, regardless of the vendor or system you choose. It’s all part of a plan to make porting your applications to Arm totally effortless and (dare we say) … boring!”

Among the tools that will make this migration boring is the Arm Fortran Compiler, which the company released as a beta in June. The compiler is now fully supported and commercially available with support for Fortran 2003 and prior standards. The platform also includes the Arm C/C++ Compiler – which is based on LLVM, supports C++14 and is tuned for server and HPC workloads on Arm-based platforms – and Arm Performance Libraries with BLAS, LAPACK and FFT functionality for HPC software on Arm-based systems. In addition, Arm Forge – which had been Allinea Forge – now includes the Arm DDT debugger and Arm MAP for debugging, profiling and optimizing applications, while Arm Performance Reports (formerly Allinea Performance Reports) will analyze applications for inefficiencies and optimization opportunities.

Wohlschlegel wrote that Arm also created a quick guide for optimizing the performance of applications running for the first time on the Arm architecture. The guide covers everything from the first steps (find out if the community already has done the work, analyze the performance on test cases with Performance Reports and note the compiler flags used) to doing the performance porting work using the Arm Compiler for HPC and Forge (tweak the optimization options, debug the program and identify bottlenecks using the MAP profiler).
