One of the reasons this year’s Supercomputing Conference (SC) is nearing attendance records has far less to do with traditional scientific HPC and much more to do with growing interest in deep learning and machine learning.
Since the supercomputing set has pioneered many of the hardware advances required for AI (and some software and programming techniques as well), it is no surprise new interest from outside HPC is filtering in.
On the subject of pioneering HPC efforts, one of the industry’s longest-standing companies, supercomputer maker Cray, is slowly but surely beginning to reap the benefits as AI shops discover they need that HPC experience.
Not long ago we talked to Cray’s CTO, interconnect pioneer Steve Scott, about how traditional supercomputing can bend to the needs of deep learning at scale in a way that is affordable and usable for those less familiar with HPC systems. Since GPUs are often at the center of these systems for training, Scott talked about integrating them into a dense, tightly coupled system like the company’s CS Storm line, a product set that is designed for HPC but does not have the Aries interconnect that powers its top-of-the-line XC supercomputers.
It was this very integration of GPUs and overall dense performance that pushed Samsung to Cray, a notable move at a time when most OEMs have wares on the market with the same basic components (Pascal GPUs, choice of CPU, etc.). In a statement, Samsung says that the CS Storm supercomputer will be used for “running artificial intelligence and deep learning applications at scale with very large, complex datasets” for the Samsung Strategy and Innovation Center (SSIC), with a focus on connected devices and vehicles.
Samsung invested in three CS Storm 500NX cabinets, which support up to 8 Nvidia P100 (Pascal) accelerators per node. The companies are not revealing how many GPUs per node they selected. However, if we do the math on the top-end configuration, with 8 Pascal GPUs per node and 14 nodes per rack, the peak performance of this deep learning cluster is close to 2 petaflops (even if that metric matters less for reduced or mixed precision workloads like these). Either way, this is powerful for a test cluster, which suggests Samsung has done its homework against rival systems for AI, possibly including the Pascal-based DGX appliance, among other OEM creations with similar feeds and speeds.
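For readers who want to check the rough math on peak performance, here is a minimal sketch of the calculation. The cabinet and node counts come from the figures above, treating each cabinet as one 14-node rack. The per-GPU numbers are an assumption on our part: they are Nvidia's published peak figures for the NVLink version of the P100, not something Cray or Samsung supplied, and the function names are ours for illustration.

```python
# Back-of-envelope peak performance for the maximum configuration.
CABINETS = 3            # from the announcement
NODES_PER_CABINET = 14  # assumes one cabinet equals one 14-node rack
GPUS_PER_NODE = 8       # maximum supported; the actual count was not disclosed

# Assumption: Nvidia's published peak teraflops for the NVLink (SXM2) P100.
P100_PEAK_TFLOPS = {"fp64": 5.3, "fp32": 10.6, "fp16": 21.2}


def total_gpus() -> int:
    """Number of GPUs if every node is fully populated."""
    return CABINETS * NODES_PER_CABINET * GPUS_PER_NODE


def cluster_peak_pflops(precision: str) -> float:
    """Aggregate GPU peak in petaflops at the given precision (CPUs ignored)."""
    return total_gpus() * P100_PEAK_TFLOPS[precision] / 1000.0


if __name__ == "__main__":
    print(f"GPUs: {total_gpus()}")
    for precision in P100_PEAK_TFLOPS:
        print(f"{precision}: {cluster_peak_pflops(precision):.2f} petaflops")
```

Under these assumptions a fully populated system has 336 GPUs and about 1.78 petaflops of double-precision peak, which lines up with the "close to 2 petaflops" estimate. The same arithmetic at half precision gives roughly 7.1 petaflops, which is the more relevant ceiling for training workloads.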
The software stack for deep learning has been enriched for the Storm product line as well. In addition to the standard HPC libraries and tools, Samsung’s new cluster comes with several popular deep learning frameworks integrated (Caffe2, CaffeMPI, MXNet, TensorFlow, and others) and standard GPU libraries, including cuDNN.
Scott previously talked with us about how a Storm cluster could compete with the capabilities of a deep learning specific appliance like Nvidia’s DGX-1 while allowing flexibility to run traditional HPC and analytics workloads on the same system. “If you look at the DGX-1 from Nvidia, you’ll see it looks a lot like our Storm blades, very much like the next generation Pascal version of the CS Storm and while we haven’t announced the follow-on to Storm, you can imagine how it might look similar,” Scott told us last year. The key difference he highlights is that while Nvidia emphasized DGX-1 in a single-node context, Cray can put Storm together, especially with Pascal and NVLink, as a fully engineered, configured, and tested system with many nodes, each sporting eight GPUs (and perhaps more eventually). “Many of our systems on the Top 500 that are highest placed are Storms; we can scale from a single box like DGX-1 to a system with racks and racks of those, scaling up to very large aggregate systems.”
The availability of Pascal took the deep learning game to another level on the Storm, Scott said. “In Pascal with multi-GPU nodes hooked with NVLink, we expect to see a lot of applications for those; some will be distributed across GPUs so that each kernel just runs within a single GPU and others that need to just communicate data between the GPUs in a node. We will have a mix, but deep learning will be one of those that can take advantage of this.”
“At Samsung, we believe the rapid growth of data has untold potential to improve the way we live,” said John Absmeier, vice president of Smart Machines, Samsung Strategy & Innovation Center and senior vice president, Autonomous/ADAS Strategic Business Unit, HARMAN. “But first, we need to understand the technology – leveraging artificial intelligence and deep learning – that provides insights into all that data. Cray’s system helps Samsung do that development, and they even use Samsung’s own solid state drives in their system, providing fast and secure memory access. With Cray’s technology, we look forward to the progress and products our work will unlock.”