It has been a long time since the Japan Meteorological Agency has deployed the kind of supercomputing oomph for weather forecasting that the island nation would seem to need to improve its forecasts. But JMA, like its peers in the United States, Europe, and India, is investing heavily in new supercomputers to get caught up, and specifically, has just done a deal with Cray to get a pair of XC50 systems that will have 18.2 petaflops of aggregate performance.
This is a lot more compute capacity than JMA has had available to do generic weather forecasting as well as do predictions for typhoons, tsunamis, earthquakes, and volcanic eruptions – the weather forecasting alone is predicted to run 10X faster, according to Cray.
Historically, JMA has tended to deploy supercomputers made by Hitachi, one of the three vendors of HPC gear that are indigenous to Japan. (Fujitsu and NEC are obviously the other two.) While Hitachi is still technically the prime contractor on the latest JMA supercomputer contract, as it has been for as far back as we can find, the interesting bit is that the underlying machine is now being built by Cray and is not based on processor and interconnect technology from IBM, as the past several systems deployed by Hitachi on behalf of JMA have been.
Way back in the dawn of time in 1995, JMA had a four-processor Hitachi S-3800/480, which was capable of a peak performance of 32 gigaflops at double precision. Five years later, it moved to the SR8000-E1/80, known as the Super Technical Server, which had 80 RISC processors running at 300 MHz, each capable of delivering 9.6 gigaflops. This system was based on a custom processor and ran Hewlett Packard’s HP-UX Unix variant. The configuration tested on Linpack for the Top 500 ranking of that year delivered 768 gigaflops peak and 691.3 gigaflops sustained. While this machine eventually scaled up to 512 nodes, JMA did not push the scalability of this shared memory supercomputer anywhere near its limits. Faster processors running at 450 MHz and delivering 14.4 gigaflops each were also available but were not, as far as we know, deployed.
Five years later, JMA installed three Hitachi SR11000 systems – two K1/80 models with 80 processors and one J1/50 model with 50 processors – for a total capacity of 27.6 teraflops peak and 23.1 teraflops sustained, which is not a bad ratio at all. The pair of SR11000-K1/80 machines did the production work in sync, as is often done with weather forecasting systems in the event that there is a failure in one system. The SR11000 machines were based on various generations of IBM Power processors, with the J1 being based on the Power5 chip and the K1 being based on the Power5+ follow-on to it. The three machines had a grand total of 13.1 TB of main memory and 18.6 TB of disk capacity, which does not sound like a lot these days, obviously.
In the summer of 2012, the last time that JMA did a system upgrade, it moved to the Hitachi SR16000-M1, which is based on IBM’s Power7 processor and which specifically was enclosed in the Power 775 water-cooled server that was deployed in a number of HPC centers in the United States and Europe. This single machine was rated at 847 teraflops peak; it had 108 TB of main memory across its nodes and 348 TB of disk capacity. It is interesting to note, of course, that JMA did not choose a “Newell” Power AC922 node crammed with Power9 cores and Tesla Volta V100 GPU accelerators to replace this all-CPU system that predates it in the IBM line.
In fact, the replacement machines that JMA is putting in place are all-CPU systems as well. Each of the two Cray XC50 systems that JMA has bought through Hitachi has 2,816 nodes, with each node having 96 GB of main memory (two memory slots are open) and using a pair of 24 core “Skylake” Xeon SP-8160 Platinum processors, which run at 2.1 GHz. So each machine is rated at 9.1 petaflops peak, and we suspect we will see what the sustained rate on Linpack is for these systems on the June 2018 Top 500 rankings at ISC18 next month. That’s a total of 135,168 cores and 264 TB of main memory per system, a radical jump in performance for JMA. Within each of the XC50 systems, JMA will be using the “Aries” interconnect developed by Cray and will be deploying it in a dragonfly topology.
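For those who want to check the math, here is a rough sketch of how the per-system totals fall out of the node configuration. The node count, memory per node, core count, and clock speed come from the figures above; the 32 double precision flops per clock per core is our assumption (two AVX-512 fused multiply-add units per Skylake core), as is rating peak at the 2.1 GHz base clock and counting a terabyte as 1,024 GB.

```python
# Per-system totals for one XC50, derived from the node configuration.
NODES = 2816
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 24
MEMORY_GB_PER_NODE = 96
CLOCK_GHZ = 2.1

# Assumption: 2 AVX-512 FMA units x 8 doubles x 2 ops (multiply + add).
FLOPS_PER_CYCLE = 32

total_cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
memory_tb = NODES * MEMORY_GB_PER_NODE / 1024          # binary terabytes
peak_gflops = total_cores * CLOCK_GHZ * FLOPS_PER_CYCLE
peak_pflops = peak_gflops / 1e6

print(f"cores per system:  {total_cores:,}")
print(f"memory per system: {memory_tb:.0f} TB")
print(f"peak per system:   {peak_pflops:.2f} petaflops")
print(f"peak for the pair: {2 * peak_pflops:.1f} petaflops")
```

Under those assumptions each machine comes out at 135,168 cores, 264 TB of memory, and just under 9.1 petaflops, which lines up with the 18.2 petaflops aggregate that Cray is quoting for the pair.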
The systems will have Lustre parallel file systems attached to them, which weigh in at 5.3 PB each, and the total capacity of storage used by the pair of machines will come to 31 PB.
The pair of Cray machines will be housed in the JMA datacenter in Kiyose, in the northwest of the capital of Tokyo. We don’t often get a sense of what HPC centers pay for machinery, but the Japan Economic Newswire is reporting that the pair of systems cost ¥4 billion (about $36 million) and that it will take another ¥6 billion ($54 million) over the next five years to run the machines. That works out to $1,981 per teraflops for acquisition and $4,954 per teraflops for acquisition and operation. This is nowhere near the aggressive pricing that we see for the cutting-edge HPC iron created for the world’s largest supercomputer centers, which take on a lot of the risk in the development of new architectures, but the costs have no doubt come down on the same curve for JMA and other weather centers.
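The cost per teraflops is simple division, sketched below with the rounded dollar figures from this story. Because the inputs here are rounded (the exact yen-to-dollar rate is not stated, so we do not guess at it), the results land a few dollars under the per-teraflops numbers quoted above.

```python
# Cost per peak teraflops for the pair of XC50 systems.
ACQUISITION_USD = 36e6       # about ¥4 billion, rounded
OPERATION_USD = 54e6         # about ¥6 billion over five years, rounded
AGGREGATE_PETAFLOPS = 18.2   # peak, both machines combined


def usd_per_teraflops(cost_usd: float, petaflops: float) -> float:
    """Return dollars per peak teraflops for a given cost and capacity."""
    return cost_usd / (petaflops * 1000)


acquisition_only = usd_per_teraflops(ACQUISITION_USD, AGGREGATE_PETAFLOPS)
acquisition_plus_ops = usd_per_teraflops(
    ACQUISITION_USD + OPERATION_USD, AGGREGATE_PETAFLOPS
)

print(f"acquisition:             ${acquisition_only:,.0f} per teraflops")
print(f"acquisition + operation: ${acquisition_plus_ops:,.0f} per teraflops")
```

Note that this is against peak performance; divide by sustained Linpack instead, once that number is known, and the cost per teraflops goes up accordingly.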
For Cray, this is another big win for its weather forecasting business, but notably the National Oceanic and Atmospheric Administration in the United States just went with a Dell PowerEdge cluster using earlier generation (and less expensive) “Broadwell” Xeon processors and 100 Gb/sec EDR InfiniBand interconnect from Mellanox Technologies for its latest “Mars” and “Venus” system upgrades. In recent years, Cray has won big weather forecasting system deals in the United Kingdom, Korea, Germany, Australia, and India, among others.