
The difference between “high performance computing” in the general way that many thousands of organizations run traditional simulation and modeling applications and the kind of exascale computing that is only now becoming a little more commonplace is like the difference between a single two-door coupe that goes 65 miles per hour (most of the time) and a fleet of bullet trains that can each hold over 1,300 people and move at more than 300 miles per hour, connecting a country or a continent.
Speed, whether we are talking about computing or transportation, is exhilarating in its own right, and that is why we enjoy both.
Benjamin Franklin famously warned us all to “remember that time is money,” and in the case of supercomputing, the more money you have, the more time you can simulate in any given stretch of real time. China has been quietly building exascale systems for years – and apparently has four of them already, with more on the way – and used what most of us would call fairly primitive technology to achieve remarkable simulations.
Albert Einstein showed that energy and mass are equivalent and that mass dilates time, so you had better bring even more money than you think you need if you want to build exascale systems, considering the enormous energy they require.
Exascale – doing more than 1,000 petaflops at 64-bit floating point precision in our definition – is not a barrier. Some people are so fond of reminding us of this, as if we didn’t know the difference between a mental or monetary barrier and a physical one, like the speed of sound. Breaking the sound barrier took advanced designs of aircraft and engines – and lots of money and time.
Exascale has been a goal for sure, and one that has focused very real human energy and capital to attain. And inasmuch as each 1,000X jump in performance – from megaflops to gigaflops to teraflops to petaflops and finally to exaflops and beyond – has required different architectures and ever-increasing budgets, it sure as hell feels like a barrier. Or two, or three if you want to be honest. And the jump to zettascale, which Intel chief executive officer Pat Gelsinger once famously told us the company would reach by 2027, is looking like a particularly daunting task. We can see 10 exaflops off in the distance, but at perhaps $1 billion for a machine, that seems pretty pricey. What we see more clearly is 2 exaflops for a lot less money than what China and the United States had to pay for their initial machines in the past few years, using Moore’s Law as well as signaling and packaging advances to drive down costs as much as to drive up performance.
And, based on the core dump briefing on the HPC market ahead of the International Supercomputing conference in Europe, it looks like the folks at Hyperion Research agree with us.
Here is the update that Earl Joseph, chief executive officer and analyst at Hyperion, gave ahead of ISC24 on the 45 existing and pending pre-exascale and exascale systems that the company is tracking around the globe:
Joseph was moving fast because the Hyperion team had a lot of ground to cover in their briefings, and really did not say much about this table, which we find fascinating and which we figured you would, too.
This table does not include the massive investments by the hyperscalers and cloud builders, who are amassing arsenals of GPUs for the training of generative large language models in the hopes of extracting money from us all. This is for exascale HPC proper, being installed in the national labs and being used for both AI and HPC work – but mostly traditional HPC simulation and modeling work.
Right off the bat, we knew about the two Chinese exascale systems from 2021 – the 2.05 exaflops peak “Tianhe-3” machine at the National Supercomputing Center in Guangzhou and the 1.5 exaflops peak “OceanLight” machine at the National Supercomputing Center in Wuxi – and we knew that there was a third one, possibly based on a variant of an AMD Epyc processor, being built by Sugon. But we did not know there was actually a fourth exascale machine already running in China, as the table above clearly suggests. We also did not know that a fifth exascale machine is due for acceptance sometime this year. That is a combined cost of $1.75 billion for five exascale machines that are probably well above 10 exaflops in combined performance. If Hyperion is right, then China will have five machines with roughly the equivalent power of the entire June 2024 Top500 supercomputer rankings, where the machines add up to 12.5 exaflops of peak performance at 64-bit precision – and possibly even more. And because the Chinese labs are not submitting formal results, the Top500 list looks weaker than it really is. (Those are all Rpeak throughput numbers, not Rmax on the High Performance LINPACK benchmark.)
Here is another fun thing that jumps out: The United States has spent $1.8 billion to get three machines – the 1.71 exaflops “Frontier” system at Oak Ridge National Laboratory, the 2 exaflops “Aurora” machine at Argonne National Laboratory, and the impending 2.3 exaflops or so “El Capitan” machine at Lawrence Livermore National Laboratory – with a combined 6 exaflops or so of compute. China could have gotten twice the exaflops bang for essentially the same buck, and it got to doing the science two years earlier than the United States did.
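To put that bang for the buck comparison in concrete terms, here is a quick back-of-the-envelope sketch using the rounded peak performance and budget figures cited above; the 10 exaflops aggregate for China is the estimate implied by the Hyperion table rather than a measured number, so treat the output as a rough ratio, not an accounting.

```python
# Rough cost per peak exaflops, using the rounded figures quoted in this article.
# The Chinese aggregate is an estimate from the Hyperion table, not a measured number.

us_systems = {
    "Frontier":   1.71,  # peak FP64 exaflops
    "Aurora":     2.00,
    "El Capitan": 2.30,  # approximate, pending acceptance
}
us_budget_billions = 1.8

china_peak_exaflops = 10.0    # combined peak for five machines, per the estimate above
china_budget_billions = 1.75

us_peak_exaflops = sum(us_systems.values())

for label, exaflops, budget in [
    ("United States", us_peak_exaflops, us_budget_billions),
    ("China", china_peak_exaflops, china_budget_billions),
]:
    print(f"{label}: {exaflops:.2f} EF for ${budget}B = "
          f"${budget / exaflops * 1000:.0f}M per peak exaflops")
```

That works out to roughly $300 million per peak exaflops for the American machines against roughly $175 million for the Chinese ones, which is how you get to something like twice the bang for the buck.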
Seems kinda silly when you put it that way, but that is the difference between living in a command economy that has, up until recently, been fairly flush with cash and living in a cowboy economy where budgets are tight for big ticket items. We are fine with keeping the cowboy attitude, but as we expressed with a certain amount of vehemence and frustration earlier this year in The Future We Simulate Is The One We Create, our conviction about HPC in the United States is not backed by a budget of commensurate scale. The hyperscalers and cloud builders located in the United States spent a combined $50 billion on generative AI systems last year, and will likely double that this year. Exascale HPC is a mere measuring cup in that bucket.
Looking ahead, Hyperion thinks that China will invest in another seven or eight exascale-class supercomputers from 2025 through 2028, for somewhere between $1.95 billion and $2.25 billion and for what will add up to an additional few tens of exaflops of aggregate compute. Over that same time, the United States is expected to put one or two exascale-class machines into the national labs each year, with costs going down as the years roll forward just as they are going to do in China, for a combined budget that could range from a low of $1.23 billion for four machines at one per year to a maximum of $2.45 billion for eight machines.
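As a sanity check on those ranges, here is a minimal sketch of the implied average price per machine under the scenarios just described; the machine counts and budget endpoints are the ones quoted above, and the midpoint for China is our own simplification.

```python
# Implied average price per exascale machine, 2025 through 2028, using the
# scenario endpoints quoted above. The China midpoint is a simplification.

scenarios = {
    "US low (4 machines, $1.23B)":          (4,   1.23),
    "US high (8 machines, $2.45B)":         (8,   2.45),
    "China midpoint (7.5 machines, $2.1B)": (7.5, 2.10),  # midpoint of 7 to 8 machines and $1.95B to $2.25B
}

for name, (count, total_billions) in scenarios.items():
    print(f"{name}: ${total_billions / count * 1000:.0f}M average per machine")
```

Either way, the average lands in the neighborhood of $280 million to $310 million per machine; the difference between the two countries is mostly in how many machines get bought at that price.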
Do you want to guess who is going to buy more flops at a lower price and make up the difference in higher electric bills to try to get science done earlier?
We said it before, and we will say it again: We don’t need cheaper computers as much as we need nuclear fusion power on a massive scale to create very cheap electricity to solve the world’s problems. That is the real problem to crack. With fusion, we can suck carbon dioxide out of the air – slowly but steadily – to rebalance the environment. We can make things more cheaply and power our lives.
From 2020, when the pre-exascale 513 petaflops “Fugaku” system was built for RIKEN Lab in Japan for $1.1 billion, through 2028, anywhere from 47 to 61 machines – only seven or eight of them pre-exascale systems – will drive somewhere between $13.4 billion and $16.8 billion in revenues for the companies that make them. Hyperion says there will be at least 45 machines, with at least $13 billion in value, accepted between 2020 and 2028.
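Here is the same kind of rough division applied to the 2020 through 2028 pipeline as a whole, using the machine counts and revenue bounds just quoted; the averages are our arithmetic, not Hyperion’s.

```python
# Implied average system value for the 2020 through 2028 pre-exascale and
# exascale pipeline, using the bounds quoted above.

bounds = {
    "Low bound (47 machines, $13.4B)":    (47, 13.4),
    "High bound (61 machines, $16.8B)":   (61, 16.8),
    "Hyperion floor (45 machines, $13B)": (45, 13.0),
}

for name, (machines, revenue_billions) in bounds.items():
    print(f"{name}: ${revenue_billions / machines * 1000:.0f}M average per system")
```

Whichever bound you pick, the average works out to somewhere between $275 million and $290 million per system, comfortably above the $150 million threshold that Hyperion uses for the leadership class it breaks out next.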
Given how the upper echelons of the exascale market are distinct from the other parts of the HPC systems market, Hyperion has begun carving the so-called “leadership computers” that cost $150 million or more out from the “supercomputers” that cost between $10 million and $150 million. Here is the latest reckoning of revenues for on-premises HPC/AI servers from Hyperion for the five classes of machines, from 2018 through 2023, inclusive:
And here is the breakdown of total HPC spending worldwide, by category:
Spending for cloud capacity went up in 2023 and spending for on-premises systems went down, but the market was essentially flat, with combined revenues across systems, software, and services hitting $37.3 billion in 2022 and holding on to $37.2 billion in 2023.
Last year, spending for on-premises HPC servers was $15 billion, down 2.7 percent, and Hyperion is projecting on-premises server spending to increase by 8.7 percent to $16.3 billion this year. Here is the forecast out through 2028 for servers, storage, middleware, applications, and services for on-premises HPC systems:
And finally, Joseph showed five different HPC forecasts since 2021, explaining how hard it has been to make forecasts thanks to the supply chain issues relating to the coronavirus pandemic and then to the generative AI euphoria. We respect the intellectual honesty of this chart:
Which just goes to show you that the best way to predict the future is to live it. Still, it is good to have HPC systems to help steer that process towards a desired end state.