Deep learning and machine learning are major themes at this year’s annual Supercomputing Conference (SC16), both in terms of vendors showcasing systems that are a fit for both high performance computing and machine learning, and in the revelation of new efforts to combine traditional simulations with neural networks for greater efficiency and insight.
We have already described this momentum in the context of announcements from supercomputer makers like Cray, which just unveiled a Pascal GPU-based addition to its modeling and simulation-oriented XC supercomputer line, complete with deep learning frameworks integrated into the stack. The question was, how many HPC workloads are now exploring deep learning as a facet of their existing workflows? It turns out, quite a few, and the use cases are mounting quickly.
Nvidia GPUs have been central to deep learning hardware. The GPU maker took this one step further by announcing a broad program for cancer research based on a blend of traditional HPC simulation with deep learning and machine learning frameworks. The company has announced that it will work with several Department of Energy labs and the National Cancer Institute to develop platforms for accelerated cancer research. Given the company’s foothold in high performance computing and, more recently, in deep learning and AI, this was a natural fit.
The effort hinges on the Cancer Distributed Learning Environment (CANDLE) framework, which is based on software development efforts at several national labs. Nvidia’s VP of Accelerated Computing, Ian Buck, tells The Next Platform that the joint development effort on the CANDLE framework, which is optimized for both AI and HPC systems, is targeting a 10X annual increase in productivity for cancer researchers. The work is driven by the goals set forth in the “Cancer Moonshot” initiative, which Barack Obama announced in his 2016 State of the Union Address and which laid down heavy targets for preventing, diagnosing, and treating cancer over a five-year span. As one might imagine, the investments in research infrastructure, both hardware and software platforms, for studying the genetic and biological underpinnings of cancer, drug discovery, and other projects are wide-ranging. Many of these research areas have long relied on HPC systems for modeling complex molecular and other interactions. Buck says neural networks can push this one step further by allowing researchers to “roll forward” their simulation work to see more detailed, long-range snapshots of biological interactions, for example.
As Nvidia explains, the framework will be used to look for key genetic markers for common cancers in the National Cancer Institute’s Genomic Data Commons. The data can be used for molecular simulations of protein interactions to help researchers understand the baseline conditions for cancer. These are research efforts that already exist but require large, iterative simulations on supercomputers to understand small spaces within the time steps required. With deep learning, however, that simulation data can become the training ground for a system that learns the underlying conditions and “rolls forward” the simulations to show larger interactions and a bigger-picture view. “It is a complement to the simulation, but an important one,” Buck says. It cuts down on the time to result and adds to the efficiency of simulation by using the numerical simulation data to feed a learning machine that can provide wider, richer results in more detail.
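To make the “roll forward” idea concrete, here is a minimal sketch in Python. It is not CANDLE code and not Nvidia’s method. It assumes a toy damped oscillator in place of a molecular simulation, and a simple least-squares linear fit in place of a neural network. All function names and parameters (`simulate`, `fit_linear_map`, `roll_forward`, `STRIDE`) are hypothetical. The point is only the shape of the workflow: run the expensive fine-grained simulation, train a model on pairs of states taken many time steps apart, then use the model to jump ahead in large strides.

```python
def simulate(x, v, dt, steps, k=1.0, c=0.1):
    """Fine-grained numerical simulation of a damped oscillator (toy stand-in)."""
    states = [(x, v)]
    for _ in range(steps):
        v = v + dt * (-k * x - c * v)  # update velocity from spring force and damping
        x = x + dt * v                 # update position with the new velocity
        states.append((x, v))
    return states


def fit_linear_map(inputs, targets):
    """Least-squares fit of a 2x2 matrix A such that target ~= A @ input.

    This linear fit is an assumption standing in for the deep learning model.
    """
    # Gram matrix of the inputs and its inverse (normal equations).
    sxx = sum(s[0] * s[0] for s in inputs)
    sxv = sum(s[0] * s[1] for s in inputs)
    svv = sum(s[1] * s[1] for s in inputs)
    det = sxx * svv - sxv * sxv
    inv = ((svv / det, -sxv / det), (-sxv / det, sxx / det))
    rows = []
    for i in range(2):
        tx = sum(t[i] * s[0] for s, t in zip(inputs, targets))
        tv = sum(t[i] * s[1] for s, t in zip(inputs, targets))
        rows.append((tx * inv[0][0] + tv * inv[1][0],
                     tx * inv[0][1] + tv * inv[1][1]))
    return rows


def roll_forward(a, state, jumps):
    """Apply the learned map repeatedly; each application skips STRIDE fine steps."""
    x, v = state
    for _ in range(jumps):
        x, v = a[0][0] * x + a[0][1] * v, a[1][0] * x + a[1][1] * v
    return (x, v)


DT, STRIDE = 0.01, 100

# 1. Run the expensive simulation to produce training data.
trajectory = simulate(1.0, 0.0, DT, 2000)

# 2. Train on (state now, state STRIDE steps later) pairs.
learned = fit_linear_map(trajectory[:-STRIDE], trajectory[STRIDE:])

# 3. Roll forward in 5 large jumps, and compare with 500 fine steps.
predicted = roll_forward(learned, trajectory[-1], 5)
reference = simulate(*trajectory[-1], DT, 5 * STRIDE)[-1]
```

Because this toy system is linear, the fitted map reproduces the fine-grained result almost exactly. Real molecular systems are nonlinear, which is where a neural network would replace the linear fit, and where the learned model acts as a complement to the simulation rather than a substitute for it.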
On the surface, it might sound simple to combine deep learning and machine learning frameworks with traditional scientific simulation, but this is the great challenge ahead. As Buck explained to us, when using deep learning frameworks like TensorFlow, for instance, to mesh with simulation data, “the outputs are different. We have to change the data collection, how to represent the sequences themselves for DNA, for instance. We need to allow AI researchers to play with neural networks; it’s a matter of expression and specialization.”
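As one illustration of the representation question Buck raises, a common convention for feeding DNA to a neural network is one-hot encoding, in which each base becomes a four-element vector. The sketch below is an assumption for illustration only; the article does not say which encoding CANDLE uses, and the names `BASES` and `one_hot_encode` are hypothetical.

```python
BASES = "ACGT"  # assumed alphabet; column order is an arbitrary choice


def one_hot_encode(sequence):
    """Map a DNA string to a list of 4-element vectors, one per base.

    Symbols outside BASES (for example the ambiguity code 'N') become
    an all-zero vector rather than raising an error.
    """
    index = {base: i for i, base in enumerate(BASES)}
    encoded = []
    for base in sequence.upper():
        row = [0.0] * len(BASES)
        if base in index:
            row[index[base]] = 1.0
        encoded.append(row)
    return encoded


matrix = one_hot_encode("GATTACA")  # 7 rows, 4 columns
```

The resulting matrix has one row per position in the sequence and can be passed to a framework such as TensorFlow as a numeric array. Other encodings, such as k-mer counts or learned embeddings, make different trade-offs.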
Buck adds that other aspects of HPC simulation are difficult to mesh too, including visualization and scheduling, both things that HPC has figured out over the years. “The questions include how to schedule work across a fleet of nodes, a supercomputer, and how the AI and simulations produce their output. The simulation and learning, neither are low-level frameworks; this is an entire workflow for ingesting data, whether it’s genetic, molecular, or otherwise, and running it and scheduling it on a supercomputer. We can build a workflow based on these components; it’s similar to the one we’ve been using for self-driving cars and other things we are working on our own DGX-1 cluster.”
As a side note, efforts like this are not just happening in molecular dynamics. A similar approach of combining numerical models and simulations with deep learning is also happening in weather forecasting and research. To that end, not long ago, we took a look at the sudden, new wave of traditional scientific computing research areas that are moving from pure simulation to a more platform-based approach that emphasizes deep learning. In the case of the CANDLE framework, the traditional modeling and simulation phase combined with deep learning will also give way to an AI-based approach that will “automate information extraction and analysis of millions of clinical patient records to build a comprehensive cancer surveillance database of disease metastasis and recurrence.”
“Today cancer surveillance relies on manual analysis of clinical reports to extract important biomarkers of cancer progression and outcomes. By applying high performance computing and AI on scalable solutions like NVIDIA’s DGX-1, we can automate and more readily extract important clinical information, greatly improving our population cancer health understanding.” – Georgia Tourassi, Director of the Health Data Sciences Institute at Oak Ridge National Laboratory
Outside of the obvious human impact value of the research, these efforts are central to Nvidia’s future in GPU computing. As we noted yesterday in our analysis of the newest rankings of the Top 500, the number of GPU supercomputers isn’t rising dramatically. However, with the addition of deep learning to more HPC workflows, having GPU acceleration serve two purposes (application and deep learning acceleration) makes it far more attractive. In many ways therefore, it has been Nvidia’s year once again at the Supercomputing Conference. When the Titan system emerged as the top machine in the world a few years ago, it made the case for GPUs at scale. Since that time, there has been an architectural, or at least accelerator, lull as centers raced to figure out where another competing product, the Xeon Phi coprocessor, might fit in. At this point, Intel has some catching up to do on the AI front—at least for this set of supercomputing simulation customers here this week at SC16.
“Large-scale data analytics, and particularly deep learning, are central to LLNL’s growing missions in areas ranging from precision medicine to assuring nuclear nonproliferation. NVIDIA is at the forefront of accelerated machine learning, and the new CORAL/Sierra architectures are critical to developing the next generation of scalable deep learning algorithms. Combining NVLink-enabled Pascal GPU architectures will allow accelerated training of the largest neural networks.” – James M. Brase, Deputy Associate Director for Computation, Lawrence Livermore National Laboratory
In an interview yesterday with The Next Platform, the CEO of Nvidia, Jen-Hsun Huang, said that there is synergy between the worlds of supercomputing and deep learning and that this will continue. “Physics is physics, and this will continue to be the case,” he said when asked if supercomputing sites might reconsider their architecture if deep learning frameworks are integrated and favored over much more computationally intensive simulations. “This is a revolutionary addition,” he added, saying that the whole field of computational research at scale is in for an exciting time.