Historically, high performance computing has operated in a closed loop, enclosed in its own software cocoon. Libraries and standards like MPI, OpenACC, and BLAS are barely known outside the confines of the HPC world.
This is mainly because high performance computing was traditionally performed on machinery that had little in common with the rest of the industry and was applied to the kinds of science and engineering simulations that had unique computational demands with regard to both performance and computational scale.
Custom-built HPC hardware mostly fell by the wayside 30 years ago, when commodity clusters became all the rage. And now, with the advent of large-scale cloud computing and performance-demanding applications like deep learning, the exclusive nature of HPC software is fading away. Of course, there is bound to be some resistance to this from die-hard supercomputing enthusiasts, but a new generation is paving the way for a more inclusive approach.
One of them is Michela Taufer, a computer scientist who heads the Global Computing Laboratory (GCLab) at the University of Tennessee’s Min H. Kao Department of Electrical Engineering & Computer Science. Taufer, an ACM Distinguished Scientist, is also the general chair of this year’s supercomputing extravaganza, SC19. Her day job running GCLab involves managing the various projects under the lab’s purview, as well as the students who work there.
A significant research area at the lab is data analytics technology, much of which originated outside the high performance computing community, but has been subsequently modified for HPC environments. A telling example of this approach is Mimir, a MapReduce implementation that uses MPI. Unlike the traditional cloud computing version upon which it is based, Mimir is built for scalability and is meant to run on the largest supercomputers. It also exploits in-memory processing and data staging to maximize performance at the node level. It can be accessed via GCLab’s GitHub software portal.
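For readers unfamiliar with the MapReduce pattern, the following is a minimal sketch in plain Python, not Mimir's actual API. It simulates a handful of "ranks" in a single process: each rank maps its share of the input to key-value pairs, a shuffle step routes every key to the rank that owns it, and each rank then reduces its keys. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and the shuffle here is only a stand-in for the all-to-all exchange that an MPI-based implementation would typically perform between processes.

```python
from collections import defaultdict
from zlib import crc32


def map_phase(chunk):
    """Emit (word, 1) pairs for one simulated rank's share of the input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs_per_rank, num_ranks):
    """Route each key to its owning rank (stand-in for an all-to-all exchange)."""
    inboxes = [defaultdict(list) for _ in range(num_ranks)]
    for pairs in pairs_per_rank:
        for key, value in pairs:
            # A deterministic hash decides which rank owns a given key.
            owner = crc32(key.encode("utf-8")) % num_ranks
            inboxes[owner][key].append(value)
    return inboxes


def reduce_phase(grouped):
    """Sum the values collected for each key on one simulated rank."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    """Run map, shuffle, and reduce with one simulated rank per input chunk."""
    num_ranks = len(chunks)
    mapped = [map_phase(chunk) for chunk in chunks]
    inboxes = shuffle(mapped, num_ranks)
    result = {}
    for grouped in inboxes:
        # Each key lives on exactly one rank, so the merge never collides.
        result.update(reduce_phase(grouped))
    return result


counts = word_count(["the cat sat", "the dog sat", "the end"])
print(counts)
```

Because every key is owned by exactly one rank after the shuffle, the reduce step needs no further coordination, which is part of what makes the pattern attractive to scale across many nodes.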
We got the opportunity to speak with Taufer to get her sense of how the field is changing, especially how data analytics is transforming HPC workflows and how the software is being shaped by outside forces, most notably the deep learning community.
One of the big drivers for these changes, she says, is the growing importance of data to HPC workflows. In some ways, it’s a subtle change, inasmuch as data always drove simulations and the resulting visualizations of those simulations. But many of today’s workflows are a good deal more complex, which has tended to shift the emphasis from compute to data. Part of this is just a philosophical shift, predicated on the belief that there is knowledge embedded in data that can be extracted with the right tools. That’s where deep learning has become so valuable, she says.
A more obvious explanation of this newfound appreciation for data is that there are simply more devices, like remote sensors and other types of scientific instruments, gathering bits and bytes that are being fed into HPC systems. Taufer says that means we need to “shed our reliance on these big simulations,” and expand our workflows beyond the supercomputing center. Integrating these external data streams into our models is a big challenge, she admits, but one we need to pursue.
The other significant change to workflows is the incorporation of deep learning, which from Taufer’s perspective is just another data analytics technique. Deep learning can be applied to different parts of the HPC workflow, most commonly to help filter raw input data and to guide simulations more intelligently by reducing the parameter space.
Most simulations involve searches of one sort or another, explains Taufer. That might mean locating an anomalous signal from a sensor network, looking for the signature of an atomic particle decaying, or finding the seismic trace that indicates an oil deposit. Using deep learning to streamline these simulations means less time is spent doing brute-force computations, making the whole process a good deal more efficient.
According to Taufer, this technology has special value in the area of molecular dynamics, a computational method that can be applied to critical health applications like drug design, disease studies, and precision medicine. These flops-demanding simulations often require leading-edge supercomputers to create useful models, so reducing the computational burden with deep learning can open up these areas to researchers with more modest resources.
One of GCLab’s projects involves in-situ data analysis of molecular dynamics data in flight, using machine learning and other analytics approaches to speed the workflow. In other cases, such as DeepMind’s AlphaFold application for protein folding, machine learning has actually made the simulation superfluous.
Her general sense is that a small community like HPC benefits from a collaborative approach and one that is interdisciplinary in nature. Bringing together computer scientists, engineers, and domain experts is not only advantageous to most HPC projects these days, it’s often necessary. Fortunately, the students she mentors in her lab seem more open to this kind of approach and less inclined to work in the software silos that defined HPC of years past. “We feel that this is the right way to move forward,” says Taufer.