China is using a domestic processor as the backbone of a new system with double the performance of Tianhe-2, which topped the Top 500 from 2013 through late 2015 before being overshadowed by the Sunway system in the years that followed. We have no official public details about Tianhe-3, although the assumption has been that it is, or was, at or near peak exascale computing capability (recall that China no longer publicly reports LINPACK benchmark results).
Today at the Supercomputing Innovation Application Conference in Guangzhou, researchers announced a new generation of the “Tianhe” machine called “Tianhe Xingyi”.
Reports indicate it is based on an advanced domestic architecture and, as mentioned, doubles the performance of Tianhe-2. There is no mention of accelerators, only CPUs.
The best hint we have at the moment is that the new system is based on the MT3000 architecture. This is our own extrapolation, and we are awaiting confirmation. But…
If we were to make a guess, it is an extension of the DSP-driven Matrix 2000+ tech we described in April 2021, when we detailed what we could glean about the architecture of the Tianhe-3 machine. That system was based on China’s native manycore Armv8-based Phytium 2000+ (FTP) and the Matrix 2000+ (MTP) processor/node architecture.
It would make sense for the new variant to be called MT3000 and, lo and behold, the following procurement request from this year, seeking time on the system for wind field calculation, mentions it directly:
Because the project needs to test and verify the dynamic wind field calculation software environment on a domestic high-performance computing system equipped with an MT3000 processor, it is necessary to purchase 360,000 node hours of domestic high-performance computing system computer time services. The Tianjin supercomputing host system is Tianhe-1 system (TH-1A) and Tianhe new generation system (including E-level verification system). Among them, Tianhe’s new generation system has computing nodes based on MT3000 processors, which can meet project needs. Currently, only Tianjin Supercomputer can provide a domestic high-performance computing environment based on MT3000 processors. Therefore, this procurement is planned to be carried out from a single source.
Another hint we have about the possible evolution of the architecture can be found in this ACM publication from early November this year. The research is from the same group at NUDT behind the original Tianhe machine architecture and could be referring to the Matrix 2000 to 3000 leap, but we are still closely reading to see the connection.
Chinese media on site at the Guangzhou event report that the system will “support various application modes of high-performance computing, AI large model training and big data analysis, which will further enhance the multi-field service capabilities of the Guangzhou Supercomputing Center and effectively meet the supercomputing application needs of various industries.”
Although it may not be connected, at the same event it was announced:
In order to further promote the integration of computing power into the grid and aggregate computing resources represented by supercomputers, high-speed network resources and supercomputing application resources, at the meeting, the National Supercomputing Guangzhou Center teamed up with fourteen units, including Guangdong Unicom, China Mobile Internet, Pengcheng Laboratory, the Hong Kong University of Science and Technology Fok Ying Tung Research Institute, and the Macau China Innovation and Technology Development Promotion Association, to officially launch the construction of the supercomputing application Internet in Guangdong, Hong Kong and Macao and jointly build a national science and technology innovation platform to support the national computing power network construction strategy.
Either way, China has no choice but to develop its own architectures for this and future large-scale systems. The days of Intel-driven machines are over. Recall that the Tianhe-2 (MilkyWay-2) supercomputer, developed by China’s National University of Defense Technology, was built upon Intel Xeon E5-2692 12-core processors with Xeon Phi co-processors and hit a peak of 54.9 petaflops, first topping the Top 500 in 2013.
Its successor, which we would assume to be Tianhe-3 (details here), was confirmed to exist in some form, but since China no longer takes part in the public Top 500 supercomputing ranking or shares benchmarks publicly, we are left without official performance comparisons or details.
While Tianhe-2 used US parts (before that was impossible), it did have a native custom interconnect, called TH Express-2. Additionally, the system uses a Linux-based operating system, Kylin Linux, and incorporates a high degree of fault tolerance and scalability in its design, making it suitable for a broad range of complex computational tasks.
We will continue updating this story.