Nvidia Tesla GM Moves To IBM To Steer HPC Efforts

IBM has been a pioneer in large scale, hybrid computing and has staked a substantial portion of the future of its Power platform, and the ones that partners are building in conjunction with it through the OpenPower Foundation, on the idea that various kinds of compute, storage, and interconnects will be used to fabricate systems that are precisely tailored to efficiently run specific workloads. In a sense, the era of general purpose computing is over.

To bolster the adoption of hybrid, high performance systems that have the Power processor at their core, IBM has tapped Sumit Gupta, the long-time general manager of Nvidia’s Tesla accelerated computing unit, to become its new vice president of OpenPower high performance computing. Gupta will report to Ken King, general manager of IBM’s OpenPower alliances, and will work alongside Dave Turek, a seasoned HPC expert who was formerly vice president of OpenPower technical computing and is now vice president of HPC engagement.

While at Nvidia, Gupta was instrumental in taking Nvidia’s GPU processors and turning them into general purpose compute engines, a business that accounted for $279 million in revenues for Nvidia in its fiscal 2015 and that will probably exceed $400 million this year. Both Nvidia and IBM are eager to sell Power-Tesla hybrid systems into a traditional HPC and cloud service provider market that Nvidia estimates has a total addressable market of $5 billion. At IBM, Gupta will be responsible for driving product requirements, doing market segmentation, performing competitive analysis, mapping out ISV strategy, and basically running the overall HPC business at IBM that is now fundamentally focused on the Power platform.

Gupta just started his first day at IBM on Monday, but took some time to sit down with The Next Platform to talk about his new role and how he sees the OpenPower initiative that he has been a key part of from the beginning transforming both IBM’s system business and the HPC industry at large.

Timothy Prickett Morgan: In your role running the Tesla and CUDA computing business at Nvidia, you obviously saw the initial uptake of GPU acceleration among the traditional HPC community. But as we have discussed many times now over the years, acceleration of all kinds is not limited to modeling and simulation workloads, but has expanded out to include all kinds of parallel applications, including media transcoding, deep learning for everything from text to voice to image recognition, risk modeling for financial transactions, database acceleration, and soon even boosting the performance of Java applications. At IBM, will you seek to leverage OpenPower technologies for all of these areas, or will you be focusing on traditional HPC as we know it?

Sumit Gupta: Power brings value to all kinds of data center workloads and workflows. We will of course focus on the technical computing market, but also work with developers and customers in the analytics market, enterprise computing and cloud computing markets. As you know, Google is already a member of OpenPower and we are working with several consumer web and mobile companies to enable their workloads on Power. As you are aware, Rackspace Hosting has announced their work combining OpenPower and Open Compute design concepts with OpenStack.

TPM: Yeah, we have discussed the Google system designs, and why Google might switch from X86 to Power for its compute engines, with Gordon MacKean, the senior director in charge of server and storage systems design at Google, and his boss, Urs Hölzle, who is senior vice president of the Technical Infrastructure team at the company. There are some interesting possibilities there.

With several large future supercomputers being based on the combination of IBM Power and Nvidia Tesla processors, GPU acceleration has obviously become important to IBM. What roles do you think field programmable gate arrays, digital signal processors, and other kinds of accelerators will play as the OpenPower partners bring hybrid systems to bear on tough compute jobs?

Sumit Gupta: IBM firmly believes in workload optimization using accelerators. The attributes of the workload and programmability determine what type of accelerator a client uses. Through the OpenPower Foundation, we are collaborating with a variety of accelerator partners. We continue to invest in software to develop new algorithms, better mathematical models, and developer tools for the OpenPower ecosystem. Ultimately, the marketplace will render its judgment on what types of technologies should be used for different types of workloads, and IBM will respond accordingly.

TPM: Having sat on the other side of the OpenPower Foundation, even before it was formally launched in August 2013, and having spearheaded the development of the CUDA programming environment for Tesla compute and, equally importantly, a portfolio of hundreds of applications that can be accelerated by them, do you think that the OpenPower collective can build a similar programming environment that brings together Power chips and other forms of compute and storage under a single programming paradigm to make it easier for application developers to get their code running in hybrid fashion on OpenPower machinery?

Sumit Gupta: Application developers prefer flexibility in choosing programming environments, languages, and models. The OpenPower ecosystem will continue to expand the options available to developers in a focused way for each market that we target. Developer adoption is vital to the success of any processor and it’s a strong focus for us at IBM through our own research and development investments.

TPM: The HPC community and its analogs in the enterprise and among hyperscale companies and cloud builders tend to focus on compute, but interconnects and storage are two areas that are becoming more important. What role do you see for expanding interconnect and storage options within the partners in the OpenPower ecosystem?

Sumit Gupta: IBM has material engagements with OpenPower members on interconnect and storage. We are leveraging our R&D and that of the OpenPower ecosystem to provide optimized solutions for the datacenter that incorporate high performance compute, with interconnect and storage. There is also a huge investment at IBM in software defined infrastructure that will complement hardware innovation in the OpenPower ecosystem. The OpenPower Foundation fosters innovation among our interconnect and storage partners.

TPM: What are the most important advantages that OpenPower has in competing against Intel’s Xeon and Xeon Phi platforms for HPC customers, and what advantages will that same ecosystem bring against a collective of vendors who will presumably be able to supply 64-bit, server-class ARM processors at some point in the next year?

Sumit Gupta: OpenPower is an open ecosystem of partners that are committed to providing solutions to improve data center performance and efficiency. IBM and several others are building high performance Power-based SoCs. We are partnering with accelerator companies like Nvidia and interconnect companies like Mellanox Technologies to build better, faster, more energy efficient solutions for the datacenter. We are confident that the Power architecture will provide compelling and competitive performance and energy efficiency as compared to any other architecture in the datacenter.
