Nervana CEO on Intel Acquisition, Future Technology Outlook

Following yesterday’s acquisition of deep learning chip startup Nervana Systems by Intel, we talked with the company’s CEO, Naveen Rao, about the plans for both the forthcoming hardware and the internally developed Neon software stack now that the technology is under a much broader umbrella.

Media outlets yesterday reported the acquisition was $350 million, but Rao tells The Next Platform it was not reported correctly and is actually more than that. He was not allowed to state the actual amount but said it was quite a bit higher than the figure given yesterday.

Nervana had been seeking a way to expand its business, including the possibility of raising another round of funding to complement the existing $28 million. Acquisition was already on the table, and the company had been in touch with Intel for two years. This is not surprising given the use of Nervana technology at Baidu, among other large deep learning user sites. And although Intel has already been pitching both its HPC-centric and low-power chips for deep learning, one can argue that this addition of custom hardware for specialized jobs creates a new focal point rather than a technology that competes with its own Xeon and Xeon Phi lines.

“The deep learning market is different than the HPC market,” the Nervana CEO says. “The primitives are different and there’s an acknowledgment of that now. There used to be the sentiment that you could do this on general purpose-ish hardware, but we’re realizing that’s not the case now. GPUs were the proof point that if you made some tweaks to HPC hardware you could get a lot of performance and we did show some of that with low-precision training early on. A lot of people then, including Andrew Ng at Baidu, didn’t think it would work. Now Baidu is leveraging our low-precision GPU work.”
“The field has changed rapidly and what you’re seeing here is a bifurcation of the production lines. The HPC thing might not be the right tool for the space. Intel can take a gamble with taking its Xeon Phi line for HPC and applying it to this new market,” explains Rao. “Nvidia is small enough that they want to build one architecture for HPC and get as much mileage out of it as possible.” In this case he is referring to Nvidia’s Kepler generation processors and the forthcoming Pascal chips, which have also shown impressive performance (with low precision capability) on deep learning workloads.

Nervana will continue with production of the TSMC 28 nm process chips coming out in Q1 to outfit its own cloud (which will continue—in fact, Rao says they’ve just picked up a couple of new big customers there) but the prospect of what Intel could bring to the development of future chips is quite striking.

“In the future, we would love to get access to the better 14 nm process technologies. As a startup, we had to use the most basic, bare-bones thing possible. But even with inferior process technology it is possible to beat a more general purpose processor.” Aside from better process access, there are also new memory technologies coming down the pike that could be promising for architectures like the Nervana Engine, including 3D XPoint. “We can think a lot bigger now,” Rao laughs.

Of course, the Nervana Systems story isn’t about hardware alone. As we noted in our extended article that appeared just hours before the Intel acquisition, the Neon software framework for deep learning on both GPUs and, eventually, the Nervana Engine chips is a critical component. “Work on Neon will certainly continue,” Rao says. “Intel has a long history of building reference platforms to show customers the best way to do something or an easier way to build something.” He says that Neon was always set to become interoperable with TensorFlow and that Nervana, which is still operating as an independent company for all intents and purposes, is working on a graph-based computational backend for Neon. “We have some interesting constructs in our chips that are not available in other architectures, including how we handle distributed computing. For that, we have some extra hints in our version of the graph backend. You won’t get as high performance on a TensorFlow workload as you would with a Neon workload, and potentially TensorFlow will adopt some of that.”

Rao says they are going to be working closely with Google going forward because Google is already an important customer for Intel. One can also guess that Intel is banking on custom hardware (just as it did with its bid for the FPGA market via the Altera acquisition) being the key to winning hyperscale web companies.

While some hyperscale web companies will likely keep building their own custom ASICs, others that have been working with Nervana already, including Baidu, are seeing value against general purpose processors. Rao says that in their work on GPUs for their own cloud cluster and development of the Neon software stack, they found hidden instructions in the Nvidia architecture and exploited those in software, making the Nvidia chips run faster. This, as well as work in low-precision training, got their foot in the door at Baidu and will open new doors there and at other companies, including Google, Rao expects.

As a refresher, here is a deep dive on the hardware platform from Nervana as well as an overview of its business strategy, one that Rao says will remain consistent on both the development and cloud fronts, at least for the time being.

3 Comments

  1. Well nVidia will be feeling the heat and might be out of it pretty soon. I hope they do not focus solely on TensorFlow, though. I prefer non-company-dominated open platforms like Torch or Caffe over a Google-controlled software architecture.

    • At this point Nvidia is totally dominating the market. They have more powerful products than any Intel offering by a factor of 3-4x for low precision training (21 TFLOPS FP16).

      Meanwhile the inferencing hardware is able to achieve 44 TOPS, compare that to the Xeon Phi which peaks at 6 TOPS.

      And this new hardware from Nervana is 3 quarters away and is totally unproven.

      It must suck when the facts slap you in the face like this 🙂

      • Well nVidia has to bring that power consumption waaaaaaaaaay down before they ever become a serious contender compared to DSPs and FPGAs that go straight into the end application hardware.

        In the data center they might have a lead now, but that can change quickly. All hyperscalers are seriously on the move for alternatives, and again here it is their power consumption which is probably the main problem.
