Real-World HPC Gets the Benchmark It Deserves

While nothing can beat the notoriety of the long-standing LINPACK benchmark, the metric by which supercomputer performance is gauged, there is ample room for a more practical measure. It might not garner the same mainstream headlines as the Top 500 list of the world's largest systems, but a new benchmark may fill the gap between real-world and theoretical peak compute performance.

This new high performance computing (HPC) benchmark can come out of the gate with immediate legitimacy because it comes from the Standard Performance Evaluation Corporation (SPEC), an organization that has been delivering system benchmark suites since the late 1980s. And it is big news today because the time is right for a more functional, real-world measure, especially one that can adequately address the range of architectures and changes in HPC (from various accelerators to new steps toward mixed precision, for example).

SPEC has history with HPC-specific benchmarks via the SPEC HPC2002 and SPEC HPC96 suites, which, as you might guess from the names, have long since been retired. SPEC also has current HPC-specific benchmarking efforts around MPI and its SPEC ACCEL suite, the latter of which focuses on OpenCL, OpenACC, and OpenMP. This new effort, the SPEChpc 2021 Benchmark Suite, is the result of collective efforts at several national labs and universities in Europe as well as vendor collaborators, including AMD, Intel, Bull Atos, Lenovo, and Nvidia.

SPEC says that while previous SPEC/HPG benchmarks focused on a single parallel model (MPI, OpenMP, or OpenACC), SPEChpc combines these into a single comprehensive set of suites that can use MPI, MPI+OpenMP, or MPI+OpenACC. This lets users select the model most appropriate for their system and allows comparisons across a wider variety of systems. SPEC adds that it may retire some of the older HPC benchmarks in the future.

The real-world application focus matters in an HPC ecosystem dominated by LINPACK, which does not fully represent actual application patterns. While the Top 500 benchmark creators have built add-ons over the years, including HPCG and HPL-AI, to fill in these gaps, those are not as useful for organizations starting down the system-buying road as what SPEC is putting forth for 2021. In short, HPC needs modern benchmarks that represent real codes and can give centers early direction about what they should buy and, most important, what it might cost in terms of capex and opex.

The SPEChpc 2021 suite includes a broad swath of science and engineering codes that are representative of (and portable across) much of what we see in HPC:

  • A tested set of benchmarks with performance measurement and validation built into the test harness.
  • Benchmarks include full and mini applications covering a wide range of scientific domains and Fortran/C/C++ programming languages.
  • Comprehensive support for multiple programming models, including MPI, MPI+OpenACC, MPI+OpenMP, and MPI+OpenMP with target offload.
  • Support for most major compilers, MPI libraries, and different flavors of Linux operating systems.
  • Four suites, Tiny, Small, Medium, and Large, with increasing workload sizes, allow for appropriate evaluation of different-sized HPC systems, ranging from a single node to many thousands of nodes.

With the SPEChpc 2021 Benchmark Suites, developers and researchers can evaluate different programming models to assess which model would be best for their application or system configuration. Hardware and software vendors can use it to stress test their solutions. And compiler vendors can use it to improve general code performance and their support for directive-based programming models. The new suites can also be used by datacenter operators and other end users to make procurement decisions.

The goal is to measure parallel capabilities on one or multiple nodes, factoring in everything from the CPU and accelerators to memory performance and bandwidth, interconnect performance, and the compiler and MPI performance of a given implementation. The point is that the processor is just one element; it's about measuring a system in practice and in balance. It does not weigh some other factors, such as I/O performance, but there are other SPEC benchmarks for that, as well as for measuring the performance of things like Java libraries.

And here's the beautiful thing: SPEC already has results for Tiny through Large runs from both vendors and partner organizations, running on large supercomputers like Summit and Frontera. See below for the SPEChpc Large results on Summit, Frontera, and a comparatively smaller cluster, Taurus. There is more detail at the SPEC site linked above.

The nine benchmarks (organized into four suites by workload size, the Tiny to Large designations) capture a wide range of HPC applications, including Lattice Boltzmann, Monte Carlo, CFD, and other representative codes: LBM D2Q37; SOMA (with Monte Carlo acceleration); TeaLeaf and CloverLeaf (widely used heat diffusion and hydrodynamics mini-apps that are portable across most contemporary architectures and a wide range of scales); Minisweep; POT3D; SPH-EXA (astronomy/cosmology/hydrodynamics); miniWeather; and HPGMG-FV.

In creating SPEChpc 2021 the committee considered:

  • Representativeness: Is it a well-known application or application area?
  • Availability of workloads that represent real problems.
  • Performance profile: Is the candidate compute bound, spending most of its time in the benchmark source, and little time in IO and system services?
  • Portability to a variety of CPU architectures, including Arm, Power ISA, and x86.
  • Ability to be ported to all node-level parallel models.
  • Scaling within a node as well as across nodes.

SPEC uses a reference machine to normalize the performance metrics: TU Dresden's Taurus system, specifically its Haswell CPU islands. Each node contains two 12-core Haswell sockets (24 cores total) with 64GB of memory. The Tiny reference time uses 24 ranks on a single node; Small uses 10 nodes (240 ranks), Medium uses 85 nodes (2040 ranks), and Large uses 340 nodes (8160 ranks).
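Those reference geometries follow directly from the 24-ranks-per-node layout; a quick shell sketch (illustrative only, not part of the SPEC harness) reproduces the rank count for each suite:

```shell
# 24 MPI ranks per dual-socket Haswell node on the Taurus reference system
ranks_per_node=24
for entry in "Tiny 1" "Small 10" "Medium 85" "Large 340"; do
  set -- $entry
  echo "$1: $(( $2 * ranks_per_node )) ranks across $2 node(s)"
done
```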

“Building on our experience in developing the SPEC MPI 2007 benchmark, the SPEC OMP 2012 benchmark, and the SPEC ACCEL benchmark suites, SPEC designed a new set of benchmark suites that keeps pace with the rapidly evolving HPC market,” said Ron Lieberman, SPEC High Performance Group (HPG) Chair. “The high portability of the SPEChpc 2021 Benchmark Suites, along with a strict result review process and rich SPEC result repository, enables us to deliver vendor-neutral performance comparisons for evaluating and studying modern HPC platforms.”

The SPEChpc 2021 Benchmark Suites are available for immediate download under a two-tiered pricing structure: free for non-profit and educational organizations and $2,500 for sellers of computer-related products and services. SPEC HPG members receive benchmark licenses as a membership benefit.
