Following yesterday’s acquisition of deep learning chip startup Nervana Systems by Intel, we talked with the company’s CEO, Naveen Rao, about plans for both the forthcoming hardware and the internally developed Neon software stack now that the technology is under a much broader umbrella.
Media outlets yesterday reported the acquisition price as $350 million, but Rao tells The Next Platform that the figure was not reported correctly and is actually higher. He was not allowed to state the actual amount but said it was quite a bit higher than the figure given yesterday.
Nervana had been seeking a way to expand its business, including the possibility of raising another round of funding to complement the existing $28 million. Acquisition was already on the table, and the company had been in touch with Intel for two years. This is not surprising given the use of Nervana technology at Baidu, among other large deep learning user sites. Although Intel has already been pitching both its HPC-centric and low-power chips for deep learning, one can argue that this addition of custom hardware for specialized jobs creates a new focal point rather than a technology that competes with its own Xeon and Xeon Phi lines.
“The field has changed rapidly and what you’re seeing here is a bifurcation of the production lines. The HPC thing might not be the right tool for the space. Intel can take a gamble with taking its Xeon Phi line for HPC and applying it to this new market,” explains Rao. “Nvidia is small enough that they want to build one architecture for HPC and get as much mileage out of it as possible.” In this case he is referring to Nvidia’s Kepler generation processors and the forthcoming Pascal chips, which have also shown impressive performance (with low precision capability) on deep learning workloads.
Nervana will continue with production of the TSMC 28 nm process chips coming out in Q1 to outfit its own cloud (which will continue—in fact, Rao says they’ve just picked up a couple of new big customers there) but the prospect of what Intel could bring to the development of future chips is quite striking.
“In the future, we would love to get access to the better 14 nm process technologies. As a startup, we had to use the most basic, bare-bones thing possible. But even with inferior process technology it is possible to beat a more general purpose processor.” Aside from better process access, there are also new memory technologies coming down the pike that could be promising for architectures like the Nervana Engine, including 3D XPoint. “We can think a lot bigger now,” Rao laughs.
Of course, the Nervana Systems story isn’t about hardware alone. As we noted in our extended article that appeared just hours before the Intel acquisition, the Neon software framework for deep learning on both GPUs and, eventually, the Nervana Engine chips is a critical component. “Work on Neon will certainly continue,” Rao says. “Intel has a long history of building reference platforms to show customers the best way to do something or an easier way to build something.” He says that Neon was always set to become interoperable with TensorFlow and that Nervana, which is still operating as an independent company for all intents and purposes, is working on a graph-based computational backend for Neon. “We have some interesting constructs in our chips that are not available in other architectures, including how we handle distributed computing. For that, we have some extra hints in our version of the graph backend. You won’t get as high performance on a TensorFlow workload as you would with a Neon workload, and potentially TensorFlow will adopt some of that.”
Rao says Nervana will be working closely with Google going forward because Google is already an important customer for Intel. And one can guess that Intel is banking on custom hardware (just as it did with its bid for the FPGA market via the Altera acquisition) being the key to winning over hyperscale web companies.
While some hyperscale web companies will likely keep building their own custom ASICs, others that have already been working with Nervana, including Baidu, are seeing value over general purpose processors. Rao says that while working on GPUs for its own cloud cluster and developing the Neon software stack, the Nervana team found hidden instructions in the Nvidia architecture and exploited those in software, making the Nvidia chips run faster. This, as well as work in low precision training, got the company’s foot in the door at Baidu and will open new doors there and at other companies, including Google, Rao expects.
As a refresher, here is a deep dive on the hardware platform from Nervana as well as an overview of its business strategy, one that Rao says will remain consistent on both the development and cloud fronts, at least for the time being.