Earlier this week, the RIKEN Center for Computational Science (R-CCS) announced that in August it will pull the plug on its flagship HPC system, the K computer. The 11-petaflop supercomputer is still in working order, but since the upcoming Post-K system will be installed in the same computer room, RIKEN needs to get busy preparing the facility for the new exascale machine.
When K was installed in 2011, it was the number one system on the TOP500 list. It’s currently ranked as the 18th most powerful supercomputer in the world. For its time, it was also one of the most energy-efficient supercomputers in the petascale club. It now sits at number 199 on the Green500 list, which is not too shabby for an eight-year-old machine.
The decision to shut down K in advance of the Post-K deployment came down to the logistics of transitioning from one system to the other. According to R-CCS director Satoshi Matsuoka, some of the facility’s infrastructure will be repurposed to save costs, but upgrades will be needed to double the power and cooling capacity for Post-K. The fact that the center only needs to double power and cooling for the new machine is pretty amazing, considering it’s expected to deliver 100 times the performance of K.
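To put that comparison in numbers, here is a rough back-of-the-envelope sketch. It assumes that "double power and cooling" translates into roughly twice K's power draw, which is a simplification, since facility capacity is not the same as what the machine actually consumes. The function name is purely illustrative.

```python
def efficiency_gain(performance_ratio: float, power_ratio: float) -> float:
    """Return the implied improvement in performance per watt.

    performance_ratio: new performance divided by old performance.
    power_ratio: new power draw divided by old power draw.
    """
    if power_ratio <= 0:
        raise ValueError("power_ratio must be positive")
    return performance_ratio / power_ratio


# Figures from the article: ~100x the performance of K at ~2x the power.
# Treating facility capacity as actual draw is an assumption.
post_k_gain = efficiency_gain(performance_ratio=100.0, power_ratio=2.0)

if __name__ == "__main__":
    print(f"Implied performance-per-watt gain over K: ~{post_k_gain:.0f}x")
```

Under those assumptions, Post-K would need to be on the order of 50 times more energy-efficient than K.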
The other reason for the early shutdown is that since the facility is already full, RIKEN will need the floor space occupied by K when the Post-K racks start rolling in, which according to Matsuoka will begin later this year. As a result, “running K while building Post-K became impossible,” he explained.
Like its petascale predecessor, Post-K is being developed by Fujitsu. The prototype debuted in June 2018, followed by the unveiling of the system’s A64FX chip in August. If all goes according to plan, the machine will go into full production sometime in 2021.
The transition from K to Post-K is more than just a passage from petascale to exascale. It also reflects the changing preferences for modern HPC machinery. The original K was powered by Fujitsu’s custom 8-core Sparc64 VIIIfx chip. In the A64FX, the K lineage has moved to Arm, in this case a special breed of Arm processor that incorporates Scalable Vector Extension (SVE) capabilities. The initial implementation offers 48 cores, but the one that shows up in Post-K could have considerably more.
In Fujitsu’s design, the A64FX’s SIMD hardware will be 512 bits wide. That’s big enough to crunch on eight FP64 values at a time, which will make it four times speedier at vector math than K’s Sparc64 VIIIfx, assuming equivalent clock rates. The A64FX will also offer FP32, FP16, INT16 and INT8 instructions to support the kinds of math operations typically used in machine learning.
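The lane arithmetic behind those figures can be sketched in a few lines. The 512-bit width comes from the article. The 128-bit width used for K's Sparc64 VIIIfx is an assumption, chosen because it is what the fourfold claim implies. The constant and function names are illustrative only.

```python
# Lanes per SIMD register = register width in bits / element width in bits.
A64FX_SIMD_BITS = 512  # stated in the article
K_SIMD_BITS = 128      # assumption: the width implied by the 4x claim

# Element widths for the data types the article lists for the A64FX.
ELEMENT_BITS = {"FP64": 64, "FP32": 32, "FP16": 16, "INT16": 16, "INT8": 8}


def lanes(register_bits: int, element_bits: int) -> int:
    """Number of elements of a given width that fit in one SIMD register."""
    return register_bits // element_bits


a64fx_lanes = {name: lanes(A64FX_SIMD_BITS, bits)
               for name, bits in ELEMENT_BITS.items()}

# Per-instruction FP64 vector throughput ratio at equal clock rates.
fp64_speedup = lanes(A64FX_SIMD_BITS, 64) / lanes(K_SIMD_BITS, 64)

if __name__ == "__main__":
    for name, count in a64fx_lanes.items():
        print(f"{name}: {count} values per 512-bit register")
    print(f"FP64 vector speedup over K at equal clocks: {fp64_speedup:.0f}x")
```

The same division shows why the narrower formats matter for machine learning: each halving of element width doubles the number of values processed per instruction.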
The move from custom to standard processor architectures reflects a long-term trend in the HPC space that began in earnest back in the early 90s and continues to play out today. As we recently reported, that could start to swing back the other way thanks to the deterioration of Moore’s Law and the rising importance of machine learning. But for now at least, standard architectures have the upper hand.
The more recent enthusiasm for chips to support lower precision math for machine learning is sweeping through all processor architectures, standard or not. Lower precision math is also being considered for traditional HPC as a way to increase throughput and energy efficiency. As a consequence, it’s pretty much assured that all processors in exascale supercomputers, not to mention generic datacenter machinery, will be supporting lower precision formats for the foreseeable future.
That said, Fujitsu’s new Arm chip, like its Sparc64 forebear, is still very much a purpose-built HPC processor. And as we noted previously, the A64FX does inherit the superscalar processing, out-of-order execution, and branch prediction capabilities of the Sparc64 architecture. Post-K will also continue down the path of Fujitsu’s custom Tofu interconnect, which was originally developed for the K computer. The Post-K version represents the third generation of the architecture, known as Tofu D, which we describe in detail here.
The shuttering of K is bound to cause some disruption for users. Matsuoka assures us the other HPC centers in Japan will pick up the slack in the interim until Post-K comes online, which, by the looks of things, shouldn’t be too long. He says select users should get early access to the system in the first half of 2020 before it becomes generally available in 2021.
It should be noted that K was #1 in HPCG, besting systems with ~10x the HPL FLOPS, until Summit and Sierra came out. It’s still #3, which is amazing considering how old it is and a testament to both its 93% computational efficiency (Rmax/Rpeak) and its Tofu interconnect. It has >3x the HPCG efficiency of Summit and Sierra (5.3% vs 1.5%).
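As a quick check on that last comparison, the ratio can be worked out directly from the two percentages quoted above. This sketch takes those percentages at face value and does not derive them from benchmark results. The function name is illustrative.

```python
def efficiency_ratio(a_percent: float, b_percent: float) -> float:
    """How many times larger efficiency a is than efficiency b."""
    if b_percent <= 0:
        raise ValueError("b_percent must be positive")
    return a_percent / b_percent


# HPCG efficiency figures as quoted in the comment above.
K_HPCG_EFFICIENCY = 5.3               # percent
SUMMIT_SIERRA_HPCG_EFFICIENCY = 1.5   # percent

ratio = efficiency_ratio(K_HPCG_EFFICIENCY, SUMMIT_SIERRA_HPCG_EFFICIENCY)

if __name__ == "__main__":
    print(f"K's HPCG efficiency is about {ratio:.1f}x that of Summit and Sierra")
```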