Weather forecasting in Europe is going to get a big boost later this year when the European Centre for Medium-Range Weather Forecasts (ECMWF) installs its next-generation supercomputer. This week, the center announced it had signed a four-year deal with Atos worth over €80 million ($89 million) to deploy and support a new BullSequana XH2000 system. It will be the first machine to operate at ECMWF’s new datacenter in Bologna, Italy.
ECMWF was established more than 40 years ago, making it one of the oldest weather forecasting centers in the world. Its first operational forecast issued in August 1979 was derived from calculations performed on ECMWF’s original supercomputer, a Cray-1A. That system was powered by a single processor with a peak performance of around 160 megaflops.
That was just enough performance to compute an atmospheric model with a horizontal resolution of 210 kilometers and 15 vertical layers. It took about five hours of CPU time to generate a ten-day forecast. However, the predictions were not very accurate toward the end of that timeframe, so much so that only seven-day forecasts were issued to the ECMWF member states.
The upcoming BullSequana machine will be about 250 million times more powerful than the original Cray-1A and will support a resolution of 10 kilometers with 91 to 137 vertical layers for its 15-day ensemble forecasts. The ensemble forecast, also known as the probabilistic forecast, requires a lot of computational power since ECMWF runs its model 51 times. The idea is to give users a range of possible scenarios and the likelihood of them happening. The new system will also be able to issue the extended forecasts on a daily basis rather than the current schedule of twice weekly.
Between the Cray-1A and BullSequana XH2000, ECMWF has run through more than a dozen supercomputers of different pedigrees. For example, over the last twenty years, the organization has fielded machines from Fujitsu (VPP series), IBM (pSeries and Power Systems clusters), and Cray (T3D, Y-MP, and XC series). The current workhorse system is a Cray XC40, represented by a pair of identical machines.
The BullSequana system will represent the first Atos system for ECMWF in its 40-year run and the first non-Cray machine installed since 2011. As we have pointed out before, Cray (now under HPE) has captured an impressive chunk of the market for numerical weather prediction, with its supercomputers in residence at about 85 percent of the world’s weather centers.
According to ECMWF spokesperson Hilda Carr, bidders for the new machine were assessed against a range of different criteria, including committed performance, implementation plan, flexibility and risks, quality of technical solution, environmental impact, quality of service, and price. That said, there is definitely a trend to buy local these days, and that applies not only in Europe, but in China and the US as well. That is certainly good news for Atos, the only original equipment manufacturer for HPC based on the continent. It is headquartered in Bezons, France, near Paris, but has offices worldwide.
Atos is not new to the weather/climate space. It currently has two petascale Bullx machines installed at Meteo France, the French national weather service, as well as a similar system at Deutsches Klimarechenzentrum (DKRZ), the German Climate Computing Center. The upcoming ECMWF system will be more powerful than all three of these put together and will eclipse the current top-ranked weather modeling machine, an 8.1 petaflops XC40 running at the United Kingdom’s Met Office.
The new BullSequana system will also be the center’s first supercomputer powered by AMD CPUs, in this case, “Rome” Epyc 7002 processors. That shouldn’t come as a big surprise, however. Rome has been pillaging Intel’s server market for more than a year, even before AMD officially launched the second-generation Epyc processor last summer. We reckon Rome’s advantage in price/performance was a big factor for ECMWF, inasmuch as numerical forecasting models have a nearly insatiable appetite for flops for increasing resolution and decreasing turnaround time. And the cheaper those flops are, the better.
The new BullSequana machine will eventually replace ECMWF’s pair of XC40 machines, each of which delivers 4.2 petaflops of peak performance. Although the performance of the new system was not explicitly stated, the Atos press release noted the BullSequana system would “increase ECMWF’s computing power by a factor of around 5.” Since most of the center’s computing capability is currently tied to the two XC40 clusters, that would imply the BullSequana system will deliver something north of 40 petaflops, very likely split across a pair of identical systems as is the practice with weather systems.
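For readers who want to check our back-of-the-envelope math, here is a rough sketch in Python using only the figures quoted in this article. It assumes the "factor of around 5" applies to the combined peak performance of the two XC40s, which the press release does not confirm (it could just as well refer to sustained performance on ECMWF's own workloads). The function and constant names are ours, purely for illustration.

```python
# Figures quoted in the article.
XC40_PEAK_PFLOPS = 4.2       # peak performance of each XC40 machine
XC40_COUNT = 2               # ECMWF runs a pair of identical XC40s
SPEEDUP_FACTOR = 5           # "a factor of around 5" (Atos press release)
CRAY_1A_PEAK_MFLOPS = 160    # peak performance of the original Cray-1A


def estimate_new_peak_pflops(per_machine=XC40_PEAK_PFLOPS,
                             count=XC40_COUNT,
                             factor=SPEEDUP_FACTOR):
    """Estimated peak of the new system, in petaflops.

    ASSUMPTION: the quoted speedup is applied to combined XC40 peak.
    """
    return per_machine * count * factor


def ratio_to_cray_1a(new_peak_pflops, cray_mflops=CRAY_1A_PEAK_MFLOPS):
    """How many times faster the new system is than the Cray-1A (peak vs. peak)."""
    # Convert both figures to plain flops before dividing.
    return (new_peak_pflops * 1e15) / (cray_mflops * 1e6)


estimated_pflops = estimate_new_peak_pflops()
cray_ratio = ratio_to_cray_1a(estimated_pflops)

print(f"Estimated peak: {estimated_pflops:.1f} petaflops")
print(f"Versus Cray-1A: about {cray_ratio / 1e6:.0f} million times faster")
```

Under that assumption, the estimate comes to roughly 42 petaflops, which is in the same ballpark as the "about 250 million times" comparison with the 160 megaflops Cray-1A.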
Few other technical details of the system were provided, other than the system interconnect (Mellanox HDR InfiniBand running at 200 Gb/sec) and the storage component (a DataDirect Networks EXAScaler system, with 91 petabytes of capacity). Note that storage capacity will be boosted even more than compute capacity, with the current XC40 systems connected to just 10 petabytes of storage.
As we mentioned, this is the first system for ECMWF’s new datacenter in Bologna. The High Performance Computing Facility (HPCF), shown in the image above, was moved from Reading, England to the new site in anticipation of future growth in both supercomputing infrastructure and the data that is continually collected from meteorological observations around the world. The organization’s headquarters will remain in Reading.
The aim is to install the new system in Bologna sometime in the second half of 2020. ECMWF will continue to run the dual-XC40 clusters in Reading in parallel with the BullSequana supercomputer until it is satisfied the new machines operate as expected. If all goes according to plan, that crossover point should happen around the middle of 2021, and then the two XC40s will be decommissioned.