There are no greater bragging rights in supercomputing than those that come with a top ten listing on the bi-annual list of the world’s most powerful systems – the Top500. And there are no countries more inclined to throw themselves (and billions) into that competition this decade than the U.S. and China.
Today, the latest results were announced (much more on those here). Notably absent, aside from the expected first exascale machine in the U.S., “Frontier” at Oak Ridge National Laboratory, are China’s results, which, if published, would have shown two separate exascale-class machines.
This would have been a major mainstream news story had China decided to publicize its results – and on several fronts.
The most obvious is being first to peak and sustained exascale with double-precision floating point on the LINPACK benchmark (the metric by which supercomputing performance is gauged). Second, this would have been demonstrated on two separate systems with two separate homegrown processor and accelerator architectures. Third, this would have meant several billions in investments in supercomputing technology across two sites (hence serious commitment from the Chinese government over the long haul).
All of this would have shown that despite its own billions in technology investments in the last decade, the U.S. could not arrive first with functional performance at exascale.
Yet China kept this quiet. Well, mostly.
Instead of seeking the press-friendly, mainstream attention HPC gets twice each year, the teams quietly discussed the systems in papers showing real-world application performance. China also made sure the word got out in other ways beyond the Top500.
In late October, The Next Platform confirmed and reported that two separate exascale supercomputers – the first with such capabilities in the world – hit above both peak and sustained exascale performance according to LINPACK. Since that time, many have wondered why China would choose not to publish these results given the intensive, public rivalry to secure top system status throughout the last decade.
When we first got word of benchmark results reaching exascale back in April (the benchmark results came in during March, just before trade restrictions cracked down on those exascale facilities and vendors, incidentally), the first inklings came from a contact at a facility in China – one well known to followers of the Top500. The conversation at that time was off record and indicated displeasure that so much engineering work would not be recognized globally, which means the decision to keep results quiet was made early, if not in advance. It took another several months to get enough comprehensive information for us to publish confirmation.
Ultimately, while China might have been able to knock the long-reigning #1 “Fugaku” powerhouse in Japan out of the running, even that might not have made the lasting impression China hoped for with these dual exascale systems.
With Every Reason to Claim Bragging Rights …
All of this reminds us of the many reasons China would have had to publicize the results beyond the obvious – claiming the title on not just one, but two, exascale machines. This would have made China the first in the world to an HPC performance milestone that has been the subject of billions of dollars of U.S. investments over the last several years.
A public announcement via the Top500 list in either its June edition or this week would have also drawn attention to the significant material investments China has made in homegrown semiconductor, networking, and software technologies. Much more detail can be found by diving into the Sunway and Phytium architectures and manufacturing backgrounds. And while there are no “new” architectures with either exascale system, they do represent a noteworthy scalability leap, in addition to noteworthy performance in demanding HPC areas that also show the systems’ capability to do mixed-precision (good for AI/ML) and tightly-coupled FP64-driven traditional supercomputing.
Having an HPC complement to its existing large-scale compute infrastructure among companies like Alibaba, Baidu, Tencent and others in China would be another source of bragging rights. These companies are all pushing to build their own native processors, accelerators, and software ecosystems. Having the supercomputing/research side of native technologies would be further signs of strength.
On that note, China would also be able to showcase systems that can handle both general-purpose HPC and emerging AI. When results were released for the quantum simulation work on the Sunway system, we believe China was not just showing real-world, tightly coupled HPC performance, but also that it could handle complex mixed-precision workloads, which are common in AI (FP16, INT8, etc.). In short, it would be touting both AI and simulation capabilities – a valuable aspect for all emerging large systems – and all without the conventional Nvidia or AMD GPUs that U.S. and European systems deploy for AI and low-precision capabilities.
And this may seem minor to those outside supercomputing – but think about it: In addition to showing technological prowess and scalability of multiple homegrown architectures, there is also the lost ability to show the hard work on the part of teams in China, often more than a thousand people across the whole effort of bringing a cutting-edge system to life (manufacturers, designers, architects, programmers, sysadmins, etc.). That these HPC professionals did not have a chance to celebrate such a milestone on the international stage is a shame. Heated disputes between nations or not, let’s not forget these are people – many of whom have spent careers working toward this coveted goal. This does matter, even if the bigger international picture obscures it.
Competitive Strategy, Perception, and Of Course, Politics
While we have not confirmed a direct, single reason, we have gathered a multitude of views over the last couple of weeks from national lab HPC leads in the U.S., Japan, and Europe, all of whom agreed the lack of publicization is unexpected and baffling but is, generally speaking, purely political. However, given the nuanced views politically and technically, we do have some ideas.
As mentioned above, there could simply be some strategic silence on China’s part for competitive purposes. The Chinese government, which backed these systems to the tune of billions of dollars (not just the design and build but ongoing facilities and power), likely had the final say in the strategic announcement (or lack thereof) of the machines.
What is most interesting is that instead of listing on the Top500, the teams confirmed the systems’ existence through Gordon Bell Prize paper submissions. For reference, this is the most coveted award in supercomputing beyond top system status via the Top500. With the submissions for the Sunway system in particular, the teams established that the machines exist and are in production while also showcasing performance and scalability – albeit with a cherry-picked set of applications.
That establishes that China was eager to show “real-world” production and use of these systems over claiming the highly publicized top place on the Top500 and crown for first to reach exascale. In short, they get the recognition for technical merit without putting system specs out there for LINPACK or the more real-world focused benchmarks in HPC like HPCG, Graph500, or Green500.
Since China has built systems simply to game the Top500 in the past – including a directly replicated AMD-ish looking system that was later removed from the list – one might say these exascale machines are a game. But not so, according to those sources we spoke with for the original story close to the benchmark results. In that case, this is legitimate, the machines are highly capable, and that means the trade war – likely a big part of this story – is also at the heart of this lack of publicizing important results.
The timing on the most recent U.S. restrictions to bar relationships with the labs and vendors behind both exascale systems came in April, a month after benchmarks were run on each system. It is unclear whether the decision to withhold reporting on the achievement was due to waiting for the June Top500 list or for other reasons, but those we spoke with suspect the real delay was to keep from being knocked off the number one spot too quickly by the U.S.
The “Frontier” machine in the U.S. was expected to appear on today’s Top500 rankings at the top of the list, well above either of China’s systems. If China had listed in June or for today’s list, then, assuming “Frontier” had taken the slot, followed by “Aurora” at Argonne (with projected 2+ exaflops peak), it would only have held top placement for a relatively short time. That’s important considering the lifespan of these large machines (five years on average) and the potential for new machines to further supplant China, pushing its systems further down the list.
The semiconductor shortage was not expected to impact big systems as much as it did and China likely did not see “Frontier” being off the November list for that reason.
One of the opinions we gathered about why China chose silence stands out as a bit “out there” on the surface but is worth repeating: if the U.S. and Europe are hell-bent on rolling out several exascale-class systems in the next three years, and China blew its budget on being first – and on two systems to boot – it might be in its best interest to take its ball and go home. In other words, if China “won’t play Top500” anymore, which has long been a yardstick for national supercomputing competition, is that list valuable any longer?
Put yet another way, by choosing to publish prize-geared papers using the machines as a “soft announce” or running LINPACK and letting those results “accidentally” slip without ever publishing, yes, China loses the big press day of the top system, but only this last time. The list as a metric is no longer international in the way it’s been for years. The tit-for-tat of top systems has bounced between the U.S. and China for years.
It’s hard to claim dominance when your only real contender won’t come to the plate.
While the Top500 has driven architectures throughout its decades, from around 2008 it drove competition between the U.S. and China in particular – and with a fierceness that has finally resulted in a flame-out, this time by choice.
What is clear is that China has set itself on its own nationalistic technological path. There are problems with that, not the least of which is a lack of fabs and semiconductor manufacturing prowess. All of that lies beyond its borders – for now (she said ominously). With multiple architectural options to go with, a strong hyperscale base within China to trade hardware and software tooling with, and all the political reason to stay this course for the long term, the news China didn’t make during this Top500 list is much bigger than any announcement it might have.
None of this bodes well for the future of the Top500 list, of course. Its creators have been open about its shortcomings and have built companion benchmarks like HPCG and HPL-AI, for instance, since the double-precision floating point metric is less important for bandwidth-limited real-world applications. Even still, the announcement of each list has meant the world pays attention to global supercomputing and that is a big deal – especially for the national labs and organizations that rely on funding for the next big machine. The international competition, especially between the U.S. and China, has also highlighted the growing ambitions of both with HPC as a touchstone topic.
We expect that the current TaihuLight and other Chinese systems on the list will appear until they are decommissioned. And perhaps we won’t see any other top ten-class machines from China for some time, perhaps years. Not because it doesn’t have them, but because it will choose other paths to publicizing.