Nvidia is not rich enough – or dumb enough – to build a cloud to rival the likes of Amazon Web Services, Microsoft Azure, or Google Cloud. But it is smart enough to use these vast compute and storage utilities to its own advantage and to make money selling services on top of the infrastructure they build that is in turn based on its own componentry.
That, in a nutshell, is what the DGX Cloud, announced today at the Spring 2023 GPU Technology Conference, does.
Nvidia co-founder and chief executive officer Jensen Huang, in talking last month about the GPU maker’s quarterly earnings, surfaced the plan for the Nvidia DGX Cloud, essentially a call for putting the company’s DGX AI supercomputer hardware and accompanying software – in particular its expansive AI Enterprise suite of software – onto the public cloud platforms for enterprises to use.
Huang at the time didn’t dive too deeply into the DGX Cloud idea, but argued that it was a way to make AI more accessible to enterprises that didn’t have the millions of dollars to spend to bring an AI supercomputer into their own environments to run such jobs as large language models. And there are many such organizations. With DGX Cloud, they could hop onto one of the major cloud service providers’ environments, access Nvidia’s hardware and software, and run their large and complex AI training workloads.
A month later, at this week’s GTC 2023, Huang and other Nvidia executives are officially unveiling DGX Cloud, with Oracle Cloud and Equinix datacenters – not to mention Nvidia’s own on-premises DGX SuperPOD datacenter platform – being the first to host the hardware-software package and Microsoft Azure and Google Cloud coming in the future.
The neat bit about DGX Cloud is not that there is a certified on-premises and cloud stack to run Nvidia’s AI hardware and software. It is that you pay Nvidia to do so in a kind of SaaS model – and Nvidia gets to sell either you or the clouds the parts to build the infrastructure.
In and of itself, it’s the latest attempt to democratize AI, to take it out of the realm of HPC and research institutions and put it in reach of mainstream enterprises, which are more than eager to reap the business benefits the emerging technology can deliver.
For Nvidia, DGX Cloud’s AI-as-a-service represents a strong shift to a cloud-first strategy and an understanding that – like other component makers – it is now as much a software company as a hardware maker and the public cloud is a natural path toward making that software accessible and, more importantly, monetizing it.
It’s an important next step for a company that more than a decade ago put AI at the center of its go-forward strategy, building a roadmap with AI at the core. Nvidia in 2016 launched DGX-1, its first deep learning supercomputer. The fourth generation of the systems rolled out last year. In 2020 came the first DGX SuperPODs and a year later Nvidia introduced AI Enterprise, a software suite of frameworks, tools, and a not-so-small dose of VMware’s vSphere.
AI Enterprise put a spotlight on the growing importance of software for Nvidia, a company that now has more employees working on software than hardware, reflecting a similar trend at other component makers.
With DGX Cloud, Nvidia now has another way to deliver all of that to enterprises that want to leverage generative AI tools like the wildly popular ChatGPT from OpenAI – via Microsoft – in their workflows but don’t have the resources to scale the infrastructure within their datacenters to support it. They can now access it via the cloud, with all of its scalability and pay-as-you-go benefits.
“For many years now, we’ve been working with enterprise companies creating their own models to train with their own data,” Manuvir Das, vice president of enterprise computing at Nvidia, told journalists in a pre-GTC meeting. “The last few months, the rise in popularity of services like ChatGPT that are based on very, very large GPT models, where one model is used by millions of people every day. When we work with enterprise companies, many of them are interested in creating models for their own purposes with their own data.”
The cloud – and Nvidia’s technology in the cloud – offers the way to do this, Das said.
“It’s taking that same model but now hosting it within the public cloud,” he said. “What we’ve done over the years with DGX is not just a state-of-the-art supercomputer, but we’ve built a software stack that sits on top of it that turns this into a turnkey training-as-a-service. You just provide your job, point to your dataset and then you hit ‘go’ and all of the orchestration and everything underneath is taken care of. In DGX Cloud now, the same model is available on infrastructure that is hosted at a variety of public clouds. It’s the same interface, the same model for running your training. It’s also available on premises, so it’s a true multicloud, hybrid cloud solution.”
Das added that while Nvidia offers a set of pre-trained models, “the idea of that is to give an enterprise company a head start so they don’t have to train those models from scratch. The purpose here is for them to customize the models. That’s the basis for our service.”
It’s also making the cloud providers work for Nvidia. They become the infrastructure that Nvidia will use to get its AI hardware and software to enterprises – and make money from it along the way. Only the cloud providers can offer access to the tens of thousands of GPUs needed to run the massive multi-node AI training operations. Nvidia doesn’t have the money to build a similar cloud-based infrastructure that delivers that kind of GPU power. Neither does AMD. The cloud is the path forward.
The Oracle Cloud Infrastructure (OCI) Supercluster offers bare-metal compute, a high-end RDMA network, and high-performance local and block storage that can scale to more than 32,000 GPUs, according to Nvidia. Oracle last year boasted about its partnership with Nvidia to add the GPU maker’s AI software applications and middleware to its cloud offerings.
According to Das, the pricing models are owned by Nvidia and customers pay Nvidia, though they access the technology through any cloud marketplace where it’s offered. Enterprises can access DGX Cloud instances starting at $36,999 per instance per month. Each instance includes eight Nvidia H100 or A100 80 GB GPUs for up to 640 GB of memory per GPU node.
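To put those numbers in perspective, here is a rough back-of-the-envelope sketch of what the list price works out to per GPU. The price, GPU count, and per-GPU memory come from the figures above; the 730-hour month is our own assumption for converting a monthly fee into an hourly one, and the function names are purely illustrative. Nvidia has not said it bills by the hour.

```python
# Figures quoted by Nvidia for a DGX Cloud instance.
PRICE_PER_INSTANCE_MONTH_USD = 36_999
GPUS_PER_INSTANCE = 8
MEMORY_PER_GPU_GB = 80

# Assumption (not from Nvidia): an average month has about 730 hours.
HOURS_PER_MONTH = 730


def instance_memory_gb(gpus=GPUS_PER_INSTANCE, memory_gb=MEMORY_PER_GPU_GB):
    """Total GPU memory in one instance (one node)."""
    return gpus * memory_gb


def price_per_gpu_month(price=PRICE_PER_INSTANCE_MONTH_USD, gpus=GPUS_PER_INSTANCE):
    """Monthly list price divided evenly across the GPUs in an instance."""
    return price / gpus


def price_per_gpu_hour(hours=HOURS_PER_MONTH):
    """Effective hourly rate per GPU, under the assumed hours per month."""
    return price_per_gpu_month() / hours


def monthly_cost(instances):
    """List-price monthly bill for a given number of instances."""
    return instances * PRICE_PER_INSTANCE_MONTH_USD


if __name__ == "__main__":
    print(f"GPU memory per node:  {instance_memory_gb()} GB")
    print(f"Price per GPU-month:  ${price_per_gpu_month():,.2f}")
    print(f"Price per GPU-hour:   ${price_per_gpu_hour():,.2f} (assumed 730 h/month)")
    print(f"Four instances/month: ${monthly_cost(4):,}")
```

The first function simply confirms the 640 GB figure (eight GPUs at 80 GB each). The per-hour number is only as good as the hours-per-month assumption, and because the instances are reserved by the month, it is an effective rate rather than a metered one.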
The compute resources are dedicated and not shared with other tenants in the cloud, with the network also isolated between tenants.
Nvidia is naming a number of organizations that already have started working with DGX Cloud, including Amgen, ServiceNow, AT&T, Getty Images, and Shutterstock.
As part of the cloud offering, Nvidia is introducing a platform called Nvidia AI Foundations, a set of cloud services to enable enterprises to create, customize, and deploy custom generative AI models. The services include NeMo for text-to-text operations like ChatGPT, and BioNeMo for large language models aimed at drug discovery. Both already are offered by Nvidia, which is now adding Picasso, a service for generating images, video, and 3D models that can be imported into Omniverse, Nvidia’s metaverse platform.
Nvidia isn’t just pushing DGX and AI Enterprise software into the cloud. In the second half of the year, the company will roll out the Omniverse Cloud in Azure, a platform-as-a-service that builds on the Omniverse infrastructure-as-a-service that Nvidia already offers. It will be a subscription service based on Nvidia’s OVX systems, which are designed to create immersive 3D virtual worlds. Nvidia also will integrate its Omniverse technology with Microsoft 365, tying it to such Microsoft tools as Teams, OneDrive, and SharePoint, with what Nvidia calls a “network of networks” connecting Omniverse to Azure.