The Texas Advanced Computing Center at the University of Texas at Austin is the flagship datacenter for supercomputing for the US National Science Foundation, and so what TACC does – and doesn’t do – is a kind of bellwether for academic supercomputing. So it is noteworthy that TACC is installing its first hybrid CPU-GPU system intended to run production workloads.
We talked about the hybrid CPU-GPU machine, dubbed “Vista,” back in January, when we also looked at the new datacenter that TACC was having built to house the future “Horizon” system, which aimed to have at least 10X the application performance of the current “Frontera” all-CPU supercomputer. Frontera, which cost $60 million and was operational in September 2019, has 8,368 two-socket Xeon nodes with a total of 468,608 cores and a peak performance of 38.75 petaflops. Frontera is one of the largest all-CPU machines in the world, and it is a workhorse of unclassified science in the United States.
That Vista is moving to GPU-accelerated compute is not a surprise – such moves are inevitable given the massive performance benefits of shifting calculations from CPUs to GPUs. But Vista will also pack a fair amount of all-CPU compute power, which means that end users who have not ported their HPC codes to GPUs will still be able to run their work on Vista. And the same will no doubt hold true with Horizon when that much more capacious system is installed sometime in 2025 and is operational in 2026.
Interestingly, last week, TACC announced that Sabey Data Centers, a datacenter operator based in Seattle, knocked down the Sears Teleserve building in the Austin suburb of Round Rock to build a new datacenter, shown below, that will house the future Horizon machine, which received $457 million in funding from the NSF to create the Leadership Class Computing Facility at TACC. Back in January, the hosting site for Horizon was a massive datacenter known as The Rock, owned by datacenter operator Switch, which had also expanded into Austin. Somewhere along the way, Switch was switched out.
In any event, the Vista machine is not at the Sabey facility, but is instead parked right next to the existing Stampede 2 machine at TACC’s own facility on the UT campus. As you can see:
There are two pieces to the Vista system. Well, three if you count storage. Four if you count networking.
First, there is the core of the cluster, which is composed of 600 nodes using Nvidia’s hybrid superchips, NUMA compute complexes that have a single “Grace” CG100 Arm-based CPU lashed to a single “Hopper” GH100 GPU accelerator using 900 GB/sec NVLink-C2C ports. The Hopper portion of this machine represents most of the floating point oomph in the entire Vista system. All told, those Hopper GPUs deliver 20.4 petaflops of peak FP64 performance on their vector cores and 40.8 petaflops on their tensor cores. So this partition of Vista with 600 GPUs has, going by the tensor core figure, 5.3 percent more peak flops than the Frontera machine with 16,736 CPUs. Those CPUs are five years old, but that is still a device compression ratio of 27.9X in favor of the GPUs.
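For readers who want to check the comparison, here is a minimal sketch of the arithmetic in Python. It uses only figures quoted in this story; the Frontera CPU count is derived from the 8,368 two-socket nodes cited earlier, and we are assuming the 5.3 percent figure compares the tensor core peak against Frontera's overall peak, since that is the pairing that reproduces it. The variable names are ours, for illustration only.

```python
# Figures quoted in the article.
FRONTERA_NODES = 8_368          # two-socket Xeon nodes
SOCKETS_PER_NODE = 2
FRONTERA_PEAK_PF = 38.75        # peak petaflops
VISTA_GPUS = 600                # one Hopper GPU per Grace-Hopper node
VISTA_VECTOR_PF = 20.4          # peak FP64 on vector cores, all GPUs
VISTA_TENSOR_PF = 40.8          # peak FP64 on tensor cores, all GPUs

# Derived values.
frontera_cpus = FRONTERA_NODES * SOCKETS_PER_NODE

# Assumption: the percentage compares tensor core peak to Frontera's peak.
flops_advantage_pct = (VISTA_TENSOR_PF / FRONTERA_PEAK_PF - 1.0) * 100.0

# How many Frontera CPUs stand behind each Vista GPU.
device_ratio = frontera_cpus / VISTA_GPUS

# Implied per-GPU peaks in teraflops.
per_gpu_vector_tf = VISTA_VECTOR_PF * 1000.0 / VISTA_GPUS
per_gpu_tensor_tf = VISTA_TENSOR_PF * 1000.0 / VISTA_GPUS

if __name__ == "__main__":
    print(f"Frontera CPUs:          {frontera_cpus:,}")
    print(f"Vista flops advantage:  {flops_advantage_pct:.1f} percent")
    print(f"Device ratio:           {device_ratio:.1f}X")
    print(f"Per-GPU vector peak:    {per_gpu_vector_tf:.1f} teraflops")
    print(f"Per-GPU tensor peak:    {per_gpu_tensor_tf:.1f} teraflops")
```

Measured against the 20.4 petaflops vector figure instead, the GPU partition would come in at a bit more than half of Frontera's peak, so which number you pick matters a lot for this kind of comparison.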
On the Grace-Hopper node, the H100 GPU is configured with 96 GB of HBM3 memory and the CG100 CPU is configured with 120 GB of LPDDR5 memory – a quarter of the maximum. The maximum LPDDR5 memory on the Grace-Grace superchip is 480 GB per CPU; the maximum bandwidth is 500 GB/sec per CPU. Hopefully TACC is getting full bandwidth even with a quarter of the memory. That CG100 Arm processor in the Grace-Hopper superchip has 72 cores active and they run at 3.1 GHz.
The second part of Vista is a cluster of Grace-Grace superchips with 256 nodes in total. In this case, the Grace chips are linked together in a NUMA shared memory complex by NVLink ports and the CPUs run at a slightly higher 3.4 GHz. The TACC manual for Vista says this Grace-Grace node has 240 GB of LPDDR5 memory, half the maximum. Again, we hope at full bandwidth.
Now, let’s do some math about, er, math.
Each Grace CPU has 72 cores, with four 128-bit SVE2 vector engines compliments of the “Demeter” Neoverse V2 cores that Nvidia licensed from Arm Ltd to create the Grace chip. Based on the performance figures we got on the Isambard 3 supercomputer at the University of Bristol, which is also based on Grace-Grace and Grace-Hopper nodes, we think that the 256 Grace-Grace nodes in the Vista system will deliver around 1.8 petaflops of peak FP64 performance. And the 600 Grace CPUs running at a slightly higher clock speed in the Grace-Hopper partition will deliver another 2.3 petaflops of FP64 performance, for a total of 4.1 petaflops. Frontera has 9.5X the CPU compute performance in the aggregate, but this is still incremental performance and only a portion of jobs get a big slice of any supercomputer at TACC. The extra 80,064 Arm V2 cores in the Vista machine are a welcome addition to the CPU pool.
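The core count and a theoretical ceiling on CPU flops can be sketched in a few lines of Python. The node counts, core count, vector width, and clock speeds are taken from this story. The flops-per-cycle figure is our assumption: four 128-bit pipes, two FP64 lanes per pipe, and one fused multiply-add (two flops) per lane per cycle. The function and constant names are hypothetical.

```python
CORES_PER_GRACE = 72
GRACE_GRACE_NODES = 256         # two Grace CPUs per node
GRACE_HOPPER_NODES = 600        # one Grace CPU per node
GRACE_GRACE_GHZ = 3.4           # clock quoted for the Grace-Grace nodes
GRACE_HOPPER_GHZ = 3.1          # clock quoted for the Grace-Hopper nodes

# Assumption: 4 SVE2 pipes x (128 bits / 64 bits) lanes x 2 flops per FMA.
SVE2_PIPES = 4
FP64_LANES_PER_PIPE = 128 // 64
FLOPS_PER_FMA = 2
FLOPS_PER_CYCLE = SVE2_PIPES * FP64_LANES_PER_PIPE * FLOPS_PER_FMA


def peak_petaflops(cpus: int, ghz: float) -> float:
    """Theoretical peak FP64 petaflops for a pool of Grace CPUs."""
    gigaflops = cpus * CORES_PER_GRACE * FLOPS_PER_CYCLE * ghz
    return gigaflops / 1e6


grace_grace_cpus = GRACE_GRACE_NODES * 2
grace_hopper_cpus = GRACE_HOPPER_NODES
total_cores = (grace_grace_cpus + grace_hopper_cpus) * CORES_PER_GRACE

grace_grace_pf = peak_petaflops(grace_grace_cpus, GRACE_GRACE_GHZ)
grace_hopper_pf = peak_petaflops(grace_hopper_cpus, GRACE_HOPPER_GHZ)

# Ratio using the article's own 4.1 petaflops estimate for Vista's CPUs.
frontera_ratio = 38.75 / 4.1

if __name__ == "__main__":
    print(f"Total Arm cores:       {total_cores:,}")
    print(f"Grace-Grace ceiling:   {grace_grace_pf:.2f} petaflops")
    print(f"Grace-Hopper ceiling:  {grace_hopper_pf:.2f} petaflops")
    print(f"Frontera CPU ratio:    {frontera_ratio:.1f}X")
```

Under those assumptions the ceilings work out to roughly 2.0 petaflops and 2.1 petaflops for the two partitions. That does not reproduce the per-partition estimates above, which were scaled from the Isambard 3 figures rather than computed from clock speed, but the combined total lands in the same neighborhood as the 4.1 petaflops estimate.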
An earlier announcement by TACC said that the compute nodes were being manufactured by motherboard and sometimes server maker Gigabyte with system integration provided by Dell Technologies.
The third part of the Vista machine is a 30 PB NFS file system from Vast Data. TACC had been experimenting with a 13 PB Vast Data flash array on the Stampede system and went all in with Vast Data for Vista.
The Vista nodes with GPUs have 400 Gb/sec Quantum 2 InfiniBand switching from Nvidia linking them all together to share work. The CPU-only nodes have 200 Gb/sec Quantum 2 InfiniBand linking them, bandwidth not being as important for CPUs as it is for GPUs.
Based on the funding document for Frontera and various upgrades at TACC since 2018, it looks like Vista cost $12 million. We are confirming this now as well as when the system will be operational.
At 16,000 feet, the Alps’ Mont Blanc will remain I’m sure roughly twice as high as Texas’ 8,800 feet Guadalupe Peak, and 20 times Austin’s Bonnell Mountain. But, yes, who could blame TACC’s Stanzione for seeking to build new Horizons that can indeed elevate Austin TX from its Faulty Balconies (or is it Balcones Fault? And if not, whose fault is it?) and into the stratospheric heights of the Venado (13,000 ft, 130 PF/s peak), and Alps (16,000 ft, 353 PF/s peak) of this world!
The most important question to be answered at this time, by this newly installed Vista onto the vertiginous 388 PF/s peak Horizon of upcoming high performance computational oomph, is clearly not “what can those redlegged, bigheaded, high plains gracehoppers do for you?”, nor is it, interestingly enough, “what can you do for those redlegged, bigheaded, high plains gracehoppers?” (obviously), but rather: which local specialty of food is it that must be associated with this here newfangled machinery? It’s swiss cheese for the Alps, and blue corn tortillas for the Venado, but what of the Horizon? “Inquisition minds” … 8^b
My understanding is academic computing clusters consisting of a combination of GPU and CPU-only nodes have been around since the first Tesla GPUs came out. For example, Big Red 2 at Indiana University (ranked 47 in the Top500 from 2013) consisted of a combination of Cray XK7 GPU nodes and XE6 CPU nodes. At a smaller scale the clusters here (never made the Top500) have been hybrid GPU and CPU-only nodes for a couple generations.
My impression is there are almost no examples of parallel codes that efficiently use both GPU and CPU-only nodes in parallel. Could the main advantage be a single team doing the maintenance or a single filesystem shared between the two different kinds of nodes?
I reckon everything WAS bigger in Texas BEFORE, like in 2019, when Frontera’s 39 PetaFlops looked like this “view between two rows of Frontera servers in the TACC data center” ( https://tacc.utexas.edu/news/latest-news/2019/06/17/frontera-named-5th-fastest-supercomputer-world/ ) … today’s 40 PetaFlop Vista is just microscopic by comparison (what, 7 cabinets on the TNP photo!?). Before you know it we’ll be putting those in our backpockets (they’re still bigger in Texas)!