KAUST Goes For Nvidia CPU-GPU, AMD CPU Combo In Shaheen-III Supercomputer

The King Abdullah University of Science and Technology (KAUST) in Saudi Arabia has introduced the latest in its long-running line of supercomputers, Shaheen III. The new machine is built by Hewlett Packard Enterprise and is set to deliver a 20X boost in peak performance over the center’s previous leadership-class system, Shaheen II, which was deployed seven years ago.

KAUST has kept a consistent naming construct for its largest systems, beginning with the first Shaheen supercomputer that can be dated by the mere mention of its architecture — the IBM Blue Gene/P.

Though that system, announced in 2009, took a backseat publicity-wise to China’s entry into the top tiers of HPC with its Tianhe-1 machine, it was one of many IBM systems filling out the top of the rankings. That year, IBM held the largest performance share of the Top 500 list of the world’s most powerful supercomputers, a share that has been steadily declining, with HPE currently taking both the system share and performance share honors.

In 2015, KAUST, like many other supercomputing sites that had been long-standing IBM shops, made the transition to HPE or Cray machines (this was before HPE absorbed Cray, of course), which were often outfitted with GPU accelerators. The Saudi supercomputer center was on board with the Cray trend, installing 36 of Cray’s liquid-cooled XC40 cabinets, but decided to forgo accelerators. Even as a CPU-only machine (Intel “Haswell” generation), the super took the seventh spot on the Top 500 list and, with just over 5.5 petaflops peak theoretical performance, retained a spot in the Top 100 (#97) until June 2022.

KAUST has decided to remain loyal to Cray, now part of HPE, with its latest Shaheen III system, which will be fully operational in 2023. The major divergence is a robust adoption of acceleration, although not the PCIe-attached acceleration of 2015. The system will sport roughly 2,800 Nvidia “Grace Hopper” superchips, which combine Nvidia’s “Grace” Arm CPUs and “Hopper” GPUs across a converged fabric, allowing the center to more carefully explore where AI will fit into existing and future HPC workloads.

Also notable is that the system could reach 100 petaflops of peak performance, based on Linpack/Top 500 benchmark estimates.

That performance is spread across 18 liquid-cooled HPE Cray EX cabinets linked by the obligatory HPE/Cray “Slingshot” network.

AMD Epyc “Genoa” processors form the CPU core of the system, which has 4,608 CPU compute nodes, each with two AMD processors, amounting to 884,736 cores in the entire system. Seven HPE Cray EX4000 cabinets will house 704 GPU compute nodes, each equipped with four Nvidia Grace Hopper superchips. KAUST has also expanded its storage infrastructure to keep pace with the new compute capability, adding 50 PB to its HPE Cray ClusterStor E1000 system.
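The quoted counts can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not a published specification: it treats 4,608 as the system-wide CPU node count, and it assumes 96 cores per processor, a figure the article does not state but which is the value the quoted core total implies.

```python
# Figures quoted in the article.
CPU_NODES = 4_608            # treated here as the system-wide total
SOCKETS_PER_NODE = 2
QUOTED_TOTAL_CORES = 884_736
GPU_NODES = 704
SUPERCHIPS_PER_GPU_NODE = 4

# Assumption (not stated in the article): 96 cores per Epyc "Genoa" processor.
ASSUMED_CORES_PER_SOCKET = 96


def total_cpu_cores(nodes: int, sockets: int, cores_per_socket: int) -> int:
    """Total CPU cores across all CPU compute nodes."""
    return nodes * sockets * cores_per_socket


def total_superchips(gpu_nodes: int, per_node: int) -> int:
    """Total Grace Hopper superchips across all GPU compute nodes."""
    return gpu_nodes * per_node


# Cores per socket implied by the quoted totals alone.
implied_cores_per_socket = QUOTED_TOTAL_CORES // (CPU_NODES * SOCKETS_PER_NODE)

cores = total_cpu_cores(CPU_NODES, SOCKETS_PER_NODE, ASSUMED_CORES_PER_SOCKET)
superchips = total_superchips(GPU_NODES, SUPERCHIPS_PER_GPU_NODE)

print(f"implied cores per socket: {implied_cores_per_socket}")
print(f"total CPU cores: {cores:,}")
print(f"total superchips: {superchips:,}")
```

Under those assumptions the core total matches the quoted 884,736, and the GPU partition works out to 2,816 superchips, in line with the round figure of 2,800.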

As one might imagine given its location, KAUST is an important research center for oil and gas HPC work in addition to more standard scientific domains, including materials science and environmental modeling.

“A supercomputer like Shaheen III is a universal scientific instrument employed by scientists and engineers in every discipline for tasks such as simulation, analysis of experimental data, learning from observed data, and efficient data storage and retrieval,” said KAUST Extreme Computing Research Center (ECRC) Director Dr. David Keyes, professor of applied mathematics and computational science.

“It is the ultimate scientific ‘watering hole’ at which researchers of different disciplines exchange techniques and software tools. An advance in one field spurs advances in several,” Keyes adds.
