Deep Learning Is Coming Of Age

In the early days of artificial intelligence, Hans Moravec asserted what became known as Moravec’s paradox: “It is comparatively easy to make computers exhibit adult-level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility.”

This assertion is now unraveling, primarily due to the ascent of deep learning. While AI was in its childhood in 2014, by 2020 it will reach a fundamentally different stage of maturity: it will be coming of age. This rapid and fundamental transformation across industries and fields is happening not as the product of a single breakthrough event, but through a collection of advances on a number of different fronts.

Topology Evolution, Growth In Dataset Size, And Complexity

As datasets have become more complex over the years, DNN topologies have evolved. While in 2012 it was challenging for a neural network to distinguish cats from dogs in images, today visual AI supports real-life applications such as the identification of potentially malignant cells in 3D medical imaging.

The initial progress with neural nets is well illustrated by the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), an annual competition in visual recognition built on the ImageNet dataset. Many different topologies have been developed to solve the ImageNet classification challenge. Solutions have improved at a very fast pace and have resulted in major changes to the structure of DL topologies, allowing for the processing of significantly more complex datasets (for example, high-resolution 3D imaging or context-rich real-time language translation).

Shifting Focus To Inference

As deep learning applications move from exploration into deployment, there will be a clear shift in the ratio between cycles of training and inference, from 1:1 in the early days of deep learning to a projected ratio of well over 1:5, favoring inference, by 2020. Further, real-world applications have ever-stricter latency requirements, demanding lightning-fast inference. This brings inference acceleration into the spotlight for many hardware and software solutions.
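To make the ratio concrete, here is a minimal sketch that converts a training-to-inference cycle ratio into the fraction of all cycles spent on inference. The function name `inference_share` is hypothetical and used only for illustration. Because the projection is "well over 1:5", the second value should be read as a lower bound.

```python
from fractions import Fraction


def inference_share(training_parts: int, inference_parts: int) -> Fraction:
    """Fraction of total compute cycles spent on inference for a
    training:inference ratio of training_parts:inference_parts."""
    return Fraction(inference_parts, training_parts + inference_parts)


# Early days of deep learning: a 1:1 ratio, so half of all cycles are inference.
early = inference_share(1, 1)

# Projected ratio of 1:5: five of every six cycles are inference.
projected = inference_share(1, 5)

print(f"early: {float(early):.0%}, projected: at least {float(projected):.0%}")
```

In other words, a move from 1:1 to 1:5 raises inference from half of all cycles to roughly five sixths of them.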

Deep Learning Frameworks Help Democratize Data Science

Early AI techniques traditionally required deep expertise in the programming of data-parallel machines and in specialized statistical techniques. The rapid maturation of deep learning has been supported by a parallel growth in the number of frameworks, such as Theano, Caffe, Torch, TensorFlow, PyTorch, and MXNet, that abstract away lower-level dependencies and facilitate deep learning implementation at scale. The availability of these frameworks has spurred an immense increase in the number of data scientists and AI practitioners, contributing to the democratization of deep learning itself and further accelerating the dissemination of DL-based applications.

Hardware Architectures For Heterogeneous Deployments

CPUs, libraries, and other software capabilities were initially not optimized for deep learning, but they have since been extensively tuned and now show significant performance improvements. Many topologies benefit from processing on CPUs, like the “Skylake” Xeon Scalable processors, because of the memory requirements or hybrid workloads involved. Additionally, since Intel Xeon Scalable processors are relied upon for so many other enterprise workloads, leveraging them for AI comes at minimal extra cost.

For customers whose deep learning demands grow more intensive and sustained, a new class of accelerators is emerging, emphasizing a very high concurrency of large numbers of compute elements (spatial architectures), fast data access, high speed memory close to the compute, high speed interconnects, and multi-node scaled solutions. The Intel Nervana Neural Network Processor (NNP) will deliver on all of these fronts. Similarly, Intel Movidius Vision Processing Units (VPUs) offer capabilities for media processing and deep learning for low-power edge devices.

Increased Breadth Of Deployment

By 2020, deep learning will move from early adoption to early maturity, with deployments moving from experimental into the main lines of business. We will also observe accelerated adoption within academic, scientific, and government environments. Such broad deployment requires solutions that cover a wide range of settings, from sub-watt end devices to megawatt racks and everything in between. Rolling out deployment in lines of business implies shifting the criteria towards scale, flexibility, power, and cost efficiency.

Increased Accuracy In Performance Assessments

Earlier hardware performance metrics are not representative measures of performance for AI. Peak theoretical tera operations per second (TOPS) measures the raw compute potential of a platform, a level that is rarely reached in AI applications. It does not capture meaningful real-world utilization constraints such as data residency and reuse, data flow and interconnect, and workloads that combine compute types. More relevant metrics are starting to emerge, such as effective TOPS (or throughput), power efficiency (inferences per second per watt), latency, and total cost of ownership (TCO).
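The gap between peak and effective figures can be illustrated with a short calculation. The sketch below is not a standard benchmark harness: the `BenchmarkRun` type, its field names, and the sample numbers are all hypothetical, chosen only to show how throughput, effective TOPS, utilization, and inferences per second per watt relate to one another.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkRun:
    """Measurements from one hypothetical inference benchmark run."""
    peak_tops: float          # quoted peak, tera operations per second
    ops_per_inference: float  # operations needed for a single inference
    inferences: int           # inferences completed during the run
    elapsed_s: float          # wall-clock duration of the run, in seconds
    avg_power_w: float        # average platform power draw, in watts


def throughput(run: BenchmarkRun) -> float:
    """Inferences completed per second."""
    return run.inferences / run.elapsed_s


def effective_tops(run: BenchmarkRun) -> float:
    """Tera operations per second actually delivered on this workload."""
    return throughput(run) * run.ops_per_inference / 1e12


def utilization(run: BenchmarkRun) -> float:
    """Share of the peak compute that the workload actually used."""
    return effective_tops(run) / run.peak_tops


def inferences_per_second_per_watt(run: BenchmarkRun) -> float:
    """Power efficiency: throughput divided by average power draw."""
    return throughput(run) / run.avg_power_w


# Made-up numbers for illustration only, not measurements of any real device.
run = BenchmarkRun(
    peak_tops=100.0,
    ops_per_inference=8e9,
    inferences=150_000,
    elapsed_s=60.0,
    avg_power_w=250.0,
)

print(f"throughput:        {throughput(run):.0f} inferences/s")
print(f"effective TOPS:    {effective_tops(run):.1f}")
print(f"utilization:       {utilization(run):.0%}")
print(f"efficiency:        {inferences_per_second_per_watt(run):.1f} inferences/s/W")
```

With these illustrative inputs, the platform delivers only a fifth of its quoted peak. A headline TOPS number hides that kind of gap, while the workload-level metrics expose it.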

What It Takes To Win At Scale

The upcoming larger and broader deployment of AI will demand a comprehensive portfolio of products and platforms. Constraints vary significantly across compute environments, and highly efficient solutions demand completely integrated systems with optimized interactions between CPU, accelerator (when needed), memory, storage, and fabric.

The key elements of a leadership solution include a widely deployed, highly effective CPU, such as the Intel Xeon Scalable processors, as a flexible and powerful platform foundation, combined with a portfolio of specialized accelerators, a tightly integrated system, and a strong software stack supporting all the popular deep learning frameworks. These are the considerations we have in mind as we build our product roadmap to best support AI on Intel architecture.

Gadi Singer is vice president of the AI Products Group and general manager of architecture at Intel. For more from Singer, please connect with him on LinkedIn. For more on Intel’s hardware and software solutions for AI, please visit ai.intel.com.