
The International Supercomputing 2025 conference is going on this week in Hamburg, Germany, and is celebrating its 40th anniversary. It also marks the 65th compilation of the Top 500 supercomputer rankings, which were started way back in 1993 and which certify the peak theoretical performance and sustained performance of each ranked machine on the High Performance Linpack (HPL) parallel processing benchmark test.
Over the decades, when the Top 500 list came out each June with ISC in Germany and each November with the Supercomputing (SC) conference in the United States, we did exactly what you did: We looked for the new machines on the list and tried to see how each new machine and its architecture stacked up to its rivals and alternatives.
We have complained ad nauseam about how this twice-a-year list does not actually rank the top five hundred supercomputers in the world, and has not for a long time, but merely ranks the highest performing machines that submitted certified HPL performance metrics. Supercomputers used to be synonymous with the upper echelons of “high performance” or “technical” computing, and the list started out being a pretty faithful proxy for systems running HPC simulation and modeling workloads as their day jobs. But over time, telecom, service provider, and cloud vendors have – for nationalistic reasons – submitted HPL test results for systems that do not do HPC as their day jobs.
Moreover, there are plenty of military and research lab supercomputers that do not make the list, and the Chinese government has not submitted results for its capability-class supercomputers for several years. We have big HPC machines that are not on the list and smaller commercial clusters that are carved up and treated like they are HPC machines when they are not. This all obviously skews the list and distorts the HPC reality.
So, starting with the June 2024 Top500 rankings, we decided that the best way to use the list was to look at general trends and then carve out new machines that were added since the previous list. This is kind of an architectural compare and contrast as well as a kind of buyer’s guide. Which is what benchmarks are really all about anyway once you strip away the politics and nationalism and healthy competition between HPC centers and countries.
On the June 2025 Top 500 supercomputer list, which you can see here, the only new machine in the top ten systems is the “Jupiter” booster system at Jülich Supercomputing Centre in Germany, created under the auspices of the EuroHPC exascale effort and built by Eviden (formerly known as Atos/Bull). The booster part of the Jupiter system is a partition that has most of its 64-bit floating point performance located in GPU accelerators, in this case “Hopper” H200 GPU compute engines from Nvidia. It is also the largest and most expensive part of the Jupiter machine, which will have other kinds of nodes as well, including Arm server CPUs clustered together old-school style as well as neuromorphic, quantum, and visualization modules.
The Jupiter Booster has 4.8 million CPU cores and GPU streaming multiprocessors in total, and has a peak theoretical performance at FP64 resolution of 930 petaflops and delivered 793 petaflops on the HPL test, which is a computational efficiency of 85.3 percent. The Jupiter Booster burns 13.1 megawatts.
This is not too shabby considering that similar but much smaller Grace-Hopper machines installed at ExxonMobil and at NCSA at the University of Illinois had only a 53.2 percent computational efficiency. These machines used Hewlett Packard Enterprise’s “Rosetta” Slingshot Ethernet interconnect instead of Nvidia’s InfiniBand, so that might have something to do with it. The other big Grace CPU-Hopper GPU system was the “Isambard-AI Phase 2” machine at the University of Bristol in England, which delivered 216.5 petaflops on the HPL test with a 77.2 percent computational efficiency.
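To make the arithmetic explicit, computational efficiency is simply sustained HPL performance (Rmax) divided by peak theoretical performance (Rpeak). Here is a minimal sketch using the Jupiter Booster and Isambard-AI Phase 2 figures cited above; the implied Isambard-AI peak is derived from those figures rather than read off the list:

```python
# HPL computational efficiency is sustained performance (Rmax) divided by
# peak theoretical performance (Rpeak). Figures below are in petaflops and
# come from the June 2025 discussion above.

def efficiency(rmax_pf: float, rpeak_pf: float) -> float:
    """Return computational efficiency as a percentage."""
    return 100.0 * rmax_pf / rpeak_pf

def implied_rpeak(rmax_pf: float, efficiency_pct: float) -> float:
    """Back out the peak performance implied by an Rmax and an efficiency."""
    return rmax_pf / (efficiency_pct / 100.0)

print(f"Jupiter Booster: {efficiency(793.0, 930.0):.1f} percent")          # ~85.3 percent
print(f"Isambard-AI Phase 2 Rpeak: ~{implied_rpeak(216.5, 77.2):.0f} PF")  # ~280 PF, derived
```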
Add them all up, and the five Grace-Hopper supercomputers had 6.84 million cores and delivered a total of 1.48 exaflops at FP64 precision, and all told represented 46.8 percent of the peak performance across the 48 systems that were new on the June 2025 Top 500 rankings.

The next biggest block of new compute added for the June 2025 list came from the eighteen machines that paired Intel Xeon CPUs on their host nodes with Nvidia GPUs. These machines had a total of 1.19 exaflops of peak FP64 performance, comprising 37.5 percent of all new capacity added since the November 2024 rankings.
There were no new systems on the June 2025 list that were based on Intel Xeon hosts paired with AMD GPU accelerators (no surprises there), and similarly there were no new machines using just a pair of Grace CPUs from Nvidia inside each node (meaning, without GPUs). There were, as you can see at the bottom of the list, two machines going into Deutscher Wetterdienst, which is the German meteorological service, based on the Aurora vector compute engines created by NEC.
Finally, you will note six new machines that pair AMD Epyc CPU cores with “Antares” GPU blocks inside the MI300A hybrid compute engine; some of these use InfiniBand, some use Slingshot. These machines comprise 4.2 percent of the net-new Rpeak performance of the 48 machines added to the Top 500 rankings this time around. The pairing of AMD Epyc CPUs and discrete AMD Instinct GPUs happened in two of the systems installed between November 2024 and June 2025, and these machines accounted for 3.4 percent of the Rpeak capacity added in the June list. Systems based on AMD Epyc CPUs plus Nvidia GPUs added another 2.8 percent of installed peak capacity. And machines based just on AMD Epyc CPUs or just on Intel Xeon CPUs added another 1.8 percent and 2.7 percent, respectively, of the total 3.17 exaflops of FP64 peak capacity that was in new machines on the June ranking.
Add it all up, and there were 48 new machines with 3.17 exaflops of peak FP64 performance, which yielded 2.31 exaflops of sustained HPL performance. Last June, there was 1.23 exaflops of net new Rpeak performance, and in November 2024, as some big systems like “El Capitan” at Lawrence Livermore National Laboratory came online and were tested, there was 5.21 exaflops of net new FP64 performance added to the Top500.
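To put those shares into rough absolute terms, here is a sketch that applies the percentages quoted above to the 3.17 exaflops of new peak capacity; the petaflop figures are derived for illustration rather than read off the list itself:

```python
# Rough split of the ~3.17 exaflops (3,170 petaflops) of new Rpeak on the
# June 2025 list, using the percentage shares quoted above. The absolute
# petaflop figures are derived here for illustration only.
new_rpeak_pf = 3170.0

shares_pct = {
    "Grace CPU + Hopper GPU":    46.8,
    "Xeon CPU + Nvidia GPU":     37.5,
    "Epyc + MI300A":              4.2,
    "Epyc CPU + Instinct GPU":    3.4,
    "Epyc CPU + Nvidia GPU":      2.8,
    "Xeon CPU only":              2.7,
    "Epyc CPU only":              1.8,
}

for arch, pct in shares_pct.items():
    print(f"{arch:26s} {pct:5.1f} %   ~{new_rpeak_pf * pct / 100:7.1f} PF")

print(f"{'Everything else':26s} {100.0 - sum(shares_pct.values()):5.1f} %")
```

The Grace-Hopper line works out to roughly 1.48 exaflops, which squares with the aggregate cited earlier, and the small remainder presumably covers the handful of other new systems, including the two NEC Aurora machines at Deutscher Wetterdienst.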
While the CPU-only architectures are interesting, given the desire to run both AI workloads and HPC workloads on these supercomputers, it is the machines with accelerators that we are keen on following.
On the June 2025 Top500 ranking, there were 232 machines using accelerators, showing that this architectural approach is still growing, but only dominates among the largest machines:
Here is a treemap of the accelerated supercomputers on the entire Top500 ranking this time around:
The treemap shows the architecture of the main compute engines by type in different colors, and the size of each rectangle tells you the relative compute capacity of each machine on the list.
And finally, here is a drill down on the accelerated machines on the June list:
Here is the upshot. On the June 2025 list, accelerated machines comprised 46.4 percent of the five hundred most powerful machines that were certified running the HPL benchmark. But the accelerated machines comprised 59.7 percent of total cores on the list and a whopping 86.2 percent of all of the peak flops on the June 2025 list.
All told, the five hundred machines on the list had 20.61 exaflops of aggregate performance and had 137.6 million cores of concurrency.
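Expressed in absolute terms, those shares imply something like the following sketch, which treats the 20.61 exaflops cited above as the aggregate peak figure (an assumption on our part):

```python
# Rough absolute figures implied by the accelerated-machine shares quoted
# above. Treats the 20.61 exaflops as aggregate peak capacity and the 137.6
# million cores as the total core count; derived for illustration only.
total_systems = 500
total_peak_ef = 20.61     # aggregate peak, exaflops
total_cores_m = 137.6     # aggregate cores, millions

accelerated_systems = 232
print(f"Accelerated systems: {100 * accelerated_systems / total_systems:.1f} % of the list")  # 46.4 %

accel_core_share_pct  = 59.7
accel_flops_share_pct = 86.2
print(f"Accelerated cores:   ~{total_cores_m * accel_core_share_pct / 100:.1f} million")    # ~82 million
print(f"Accelerated peak:    ~{total_peak_ef * accel_flops_share_pct / 100:.2f} exaflops")  # ~17.8 exaflops
```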
One last thought. For the longest time, the performance of supercomputers grew at the Moore’s Law rate of doubling every 18 months or so. This has not been the case for many years, as you can see:
This chart ignores the fact that the hyperscalers and clouds have very powerful and capacious clusters that would smoke any HPC center in the world when it comes to performance. They each can leverage hundreds of thousands to millions of compute engines to run a single workload any day that they want to. Any one of them — Amazon Web Services, Microsoft Azure, or Google — could, in fact, own this entire list all by themselves if they wanted to make a point.
Great analysis! It’s interesting to consider that hyperscale datacenters, running in the 100s of megawatts and more (esp. GW), may indeed “smoke any HPC center”, or just about, and possibly “own this entire list” if they were to convert to “HPC as their day jobs”. To me this means that the technical capability is there today to run much closer to the (FP64) zettascale, and maybe that should be experimented with (scientifically) at some point … taking all these computational facilities as one humongous composable disaggregated heterogeneous system!
JUPITER (Booster module, #4) is a nice addition to this June’s Top500 and I wonder how preliminary/partial it is at this stage, especially as tomorrow’s ISC2025 talk by Kristel Michielsen ( https://isc.app.swapcard.com/event/isc-high-performance-2025/planning/UGxhbm5pbmdfMjY3OTY2Mg== ) seems to suggest that it is currently running on 24,000 GH200s, while one may have expected 32,000 ( https://www.nextplatform.com/2024/06/25/ruminations-about-europes-alice-recoque-exascale-supercomputer/ ). I’m also happy to see “Isambard-AI phase 2” entering at #11 which bodes well for that machine (Isambard-3? Isambard-4AI?).
One big update to me though remains the swashbuckling buccaneer’s HPCG #1 score! Finally, after 5 long years (10 Top500 lists), there’s a new sheriff in HPCG Town! Not by much, mind, but at a better power efficiency point relative to the previous CPU-only leader (#25 vs #104 in Green500).
Thanks for that! Nothing like a round of Top500 rodeo to put the brain clutch back into gear, I say. It’s sweet to see the computational Rock that is the Intel Granite Rapids 96-core Xeon 6972P enter the competition (riding Mr.DIMM?), albeit at a timid #173 (Nibi, Canada) and #457 (FusionServer, Norway) — that tag-team has such a wild HPCG potential ( https://www.phoronix.com/review/intel-xeon6-mrdimm-ddr5/3 )!
I reckon Nibi’s out of U. Waterloo’s Sharcnet ( https://docs.alliancecan.ca/wiki/Nibi ) and I have to think its reported 8 PF HPL is from more than just 6,144 cores (32 nodes) … looking around at Roxy/170 and Narwhal/172, and scaling up from FusionServer’s 2.6 PF at 41,472 cores, suggests they used the whole 700 nodes instead (134,400 cores), or so. That typo excepted, it’ll be great to see where these Rapid Granite beasts land on the first page of the HPCG Jamboree (once they run that test)!
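In code form, that back-of-envelope scaling looks something like this, assuming Nibi sustains roughly the same HPL throughput per core as the FusionServer machine (just a guess):

```python
# Back-of-envelope scaling: carry FusionServer's per-core HPL throughput over
# to Nibi's reported 8 PF and see how many cores/nodes that would imply.
# Assumes similar per-core efficiency on both machines -- just a guess.
fusionserver_pf    = 2.6
fusionserver_cores = 41_472
nibi_pf            = 8.0
cores_per_node     = 2 * 96          # dual-socket, 96-core Xeon 6972P nodes

pf_per_core   = fusionserver_pf / fusionserver_cores
implied_cores = nibi_pf / pf_per_core
implied_nodes = implied_cores / cores_per_node

print(f"Implied cores: ~{implied_cores:,.0f} (~{implied_nodes:.0f} nodes)")
# ~127,600 cores, roughly 660+ nodes -- far closer to the full 700-node
# (134,400-core) machine than to the 32 nodes reported on the list.
```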
There you are. I was beginning to wonder. We can’t lose you and HPC Guru on the same day.
“Accelerators Can’t Bend Performance Up To The Moore’s Law Line” because we’ve been using accelerators for more than a decade. Look at the #1 line, and those systems have all been accelerated since 2012. Meaning they use a large number of parallel, simple, low-clock execution units. Since that time, GPUs have been improving approximately in line with lithography improvements. Sure, the per-socket performance has gone up faster, as they pack bigger multichip modules into a socket; however, supercomputers are limited by either power or budget – and the per-watt and per-dollar performance has been going up with the chip fab process. Since the chip-fab improvements have fallen off the prior trend line, so too have supercomputer improvements.
The Top500 curves were rarely on the Moore’s Law line, but saw three major inflections that helped pull things up in a big way: the shift from SMPs to MPPs with commodity processors, multi-core CPUs, and commodity accelerators. Until we find another one of those inflections, it’s likely improvements will be slow.
To be fair, Linpack never was representative of all tasks, and many HPC workloads are more limited by memory bandwidth and/or network bandwidth. These workloads have been improving on different curves, much less generous than Moore’s law, for a very long time.
It should also be noted that since those early GPU systems there has been a huge improvement in the number of codes that can actually use GPUs effectively. Some of that comes from simplifying the programming tools used to run codes on GPUs, but a lot of it is just the accumulated time and effort of thousands of scientists refining their codes over many years. So maybe the maximal performance line isn’t getting better that quickly, but the real-world throughput of the machines has improved substantially.