Large-Scale Weather Prediction at the Edge of Moore’s Law
May 25, 2016 Nicole Hemsoth
Having access to fairly reliable 10-day forecasts is a luxury, but it comes with high computational costs for centers in the business of providing predictability. This ability to accurately predict weather patterns, dangerous and seasonal alike, has tremendous economic value and, accordingly, significant investment goes into powering ever more extended and on-target forecasts.
What is interesting on the computational front is that the future of weather prediction accuracy, timeliness, efficiency, and scalability seems to be riding a curve not so dissimilar to that of Moore’s Law. Big leaps, followed by steady progress up the trend line, and a moderately predictable sense of how progress could move along are no longer such a sure bet. As the weather modeling agencies look to the future of both systems and software, a reality check seems to be settling in, especially for what might be possible in 2025—just under a decade from now.
Applied to weather prediction, however, this parallel is not as simple as transistor counts and economies of scale. But just as other application areas are floundering in the post-2020 prediction pool, in part because of declining computational efficiencies set against more complex modeling requirements and possibilities, weather has its own set of concerns.
As we have described here in the context of new machine and research programs at NOAA, the national weather service in the U.S. and elsewhere in the world, the challenges over the next ten years for calculating at higher resolution over smaller geographic distances are far less about the ability to procure more “pure compute.” They are far more centered on scaling existing approaches in the wake of even more data from satellites and other instruments and the need for finer resolution. For instance, one of the most highly respected global forecasting centers can run simulations at 9 km resolution, with an eye on 5 km over the next ten years. But because of the limited scalability of numerical prediction models and the relative inefficiency of available compute in the face of those demands, 1 km resolution is still out of range.
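To get a rough feel for why each step down in grid spacing is so expensive, the arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope illustration, not ECMWF's own cost model: the function names are hypothetical, the Earth's surface area is taken as roughly 510 million square kilometers, and the cubic exponent rests on the common rule of thumb that the time step has to shrink along with the grid spacing. Vertical levels, physics, and data assimilation are ignored entirely.

```python
EARTH_SURFACE_KM2 = 5.1e8  # approximate surface area of the Earth


def horizontal_columns(dx_km):
    """Approximate number of grid columns if each cell is dx_km by dx_km."""
    return EARTH_SURFACE_KM2 / (dx_km * dx_km)


def relative_cost(dx_from_km, dx_to_km, exponent=3):
    """Rule-of-thumb cost ratio when refining the grid spacing.

    exponent=2 counts only the extra horizontal grid columns.
    exponent=3 also assumes the time step shrinks in proportion to the
    grid spacing (an assumption for illustration, not a figure from
    the article).
    """
    return (dx_from_km / dx_to_km) ** exponent


for dx in (9, 5, 1):
    print(
        f"{dx} km: ~{horizontal_columns(dx):.2e} columns, "
        f"~{relative_cost(9, dx):.1f}x the cost of a 9 km run"
    )
```

Under these assumptions, going from 9 km to 5 km means about 3.2 times as many columns, while going from 9 km to 1 km means 81 times as many, before the shorter time step is even counted.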
Those outside of the United States might not have heard of the European Centre for Medium Range Weather Forecasts (ECMWF), but chances are they have had direct contact with its forecasting development work, even outside of Europe, in the form of globally adopted numerical weather prediction models from ECMWF that have been refined over the course of its 40-year history. As Dr. Erland Kallen, Director of Research at the center, described during the recent Exascale Application and Software Conference, however, the resolution available now as the model runs across their dual Cray XC30 supercomputers (two for backup purposes and for running experimental models in the background, more on that here) marks a significant improvement over what was possible ten years ago. But global collaborators on the scalability research side are seeing where the efficiencies start to peter out with more complex models, even for models that are relatively simple to parallelize.
To put some of the resolution problem into context, consider that temperature, pressure, and other atmospheric and ground conditions can vary a great deal within a 9 km range. There are parameterizations to account for this, but these must also take into account the vast amount of data that must be culled, cleaned, and integrated into several forecasts that are then “averaged” together. These are ensemble forecasts, to put all of this in layman’s terms, but they come with a high computational cost, and one that will rise rapidly in the near future.
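The “averaging” idea can be illustrated with a minimal Python sketch. It assumes each ensemble member is simply a list of values at the same locations, and it uses made-up temperatures; the function name and data are hypothetical, and operational ensemble products involve far more than a plain mean. The spread (standard deviation across members) is included because it gives a simple measure of how much the members disagree.

```python
from statistics import mean, stdev


def ensemble_summary(members):
    """Return the per-location ensemble mean and spread.

    members: list of forecasts, each a list of values at the same locations.
    """
    by_location = list(zip(*members))  # regroup values location by location
    means = [mean(values) for values in by_location]
    spreads = [stdev(values) for values in by_location]
    return means, spreads


# Hypothetical temperatures (deg C) at three locations from four members.
members = [
    [12.0, 15.5, 9.0],
    [13.0, 15.0, 8.0],
    [11.0, 16.0, 10.0],
    [12.0, 15.5, 9.0],
]
means, spreads = ensemble_summary(members)
print("mean:  ", means)
print("spread:", spreads)
```

A larger spread at a location suggests a less certain forecast there, which is one reason more members are attractive despite the added compute.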
Centers like this have a wide range of data sources, all told, adding up to 40 million observations processed each day. These serve as the baseline for daily models. Satellites alone contribute a great deal to this bottom line of data. Kallen says that in the 1990s there were only ten, and the data that came from them was very difficult to use. There are around 70 now from several countries, and ECMWF expects closer to 85 by 2018. This is not just a data volume issue, either. “There is also a major filtering problem; selecting the data, processing it and throwing out the parts of it we don’t need is also computationally intensive,” he notes. But the net result of this collection and cleaning is at the core of their ensemble assimilation and prediction approach every 24 hours. This has been difficult to parallelize, will be tough to scale, and will represent at least one of the cores of their next-generation challenges on the scalability, efficiency, and production fronts.
“We don’t think our member states, or anyone else for that matter, will be paying us for an electricity supply of more than about 20 megawatts and that’s our estimate for what we’ll need for the next 20 years. That power limit puts a cap on the resolution.”
On that note, take a look at the chart Kallen shows below. The observations will increase by a factor of 5, but the model complexity is a far greater jump “and we will not have the computers we need to do this,” he says.
The combination of model complexity, model resolution, and compute power, all wrapped in a tight energy consumption package on par with those stated by the various exascale initiatives (20 megawatts), creates an uncertain future for a continued rise in the accuracy and timeliness of medium-range (3-10 day) forecasts, not to mention other global climate studies. “Our ensemble runs 50 forecasts in parallel, it would be even better to do hundreds at a time, but that’s a lot of compute power, even if parallelizing that isn’t a huge challenge. Scaling that is the next big challenge across all parts of our forecasts, data, and models.”
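The reason ensemble members parallelize easily is that each one runs independently from slightly different starting conditions. The Python sketch below shows that structure only: `toy_forecast` is a hypothetical stand-in (a logistic map, not a weather model), and the thread pool is just a convenient way to express "run every member independently." A production system would spread members across many nodes, and the hard scalability problem Kallen describes lies inside each forecast, not in this outer loop.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def toy_forecast(initial_value, steps=20, r=3.9):
    """Stand-in 'model': iterate the logistic map from a starting value."""
    x = initial_value
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x


def run_ensemble(n_members, base=0.5, noise=1e-3, seed=0):
    """Run n_members toy forecasts from slightly perturbed starting values."""
    rng = random.Random(seed)
    starts = [base + rng.uniform(-noise, noise) for _ in range(n_members)]
    # Members do not depend on each other, so they can simply be mapped
    # over a pool of workers; results come back in member order.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(toy_forecast, starts))


results = run_ensemble(50)
print(f"{len(results)} members, min={min(results):.3f}, max={max(results):.3f}")
```

Raising `n_members` from 50 to several hundred changes nothing in the code, only the amount of compute required, which mirrors the point made in the quote.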
For some insight into the need for scalability coupled with efficiency, take a look at the following chart, which shows the efficiency gains needed by ECMWF over the next decade, going from 30 km resolution down to below 1 km (which is not attainable in any foreseeable future). For background, the ensemble runs at coarser resolution (16 km), the single-shot model runs at much higher resolution (9 km), and today the center is already seeing the peak of that scalability range coming in 2025.
To put this in perspective, that 2025 range running ensembles at 5 km resolution would translate to over a million cores, and a single system at high resolution would be well over a million. Even if the scalability of these numerical weather prediction models were not an issue, the power consumption at that level is a deal-killer.
“For high resolution, kilometer scale is unrealistic for the next ten years. A 5 km resolution by 2025 is not unrealistic but requires a lot of work scalability-wise. In our ten-year strategy, a 5 km ensemble is the key goal and keeping the compute at 20 megawatts of electricity in 10 years is another target.”
In an effort to target these scalability and hardware-oriented challenges, ECMWF is working with a number of regional and international research organizations, as well as hardware vendors including Cray, Intel, Nvidia, Seagate, and Bull, and software vendors including Allinea, to keep scaling. Still, the Moore’s Law-like curve of weather forecasting timeliness, accuracy, range, and efficiency is flattening, yet another indication that the work that goes into exascale ambitions will have an immediate impact on supercomputing-fed services the world depends on.