Here we are, on the Friday before the flagship GPU Technology Conference hosted by Nvidia is set to kick off. And, like many of you, we are without a doubt excited in anticipation of what we conjecture will be a deluge of compute and networking hardware, and are puzzling over what this Omniverse really means.
But for some reason, our thoughts keep coming back to software, and how Nvidia is building out an increasingly complete and ornate set of systems software to run HPC, AI, data analytics, visualization, and now world-simulating applications.
Nvidia has been creating drivers for GPUs for decades, and started building up commercial-grade HPC and AI frameworks, and the libraries and compilers that sit underneath them, a decade and a half ago. And Nvidia’s big acquisition here – big is a measure of importance, not expenditure – was The Portland Group back in July 2013. And since that time, it has built up stacks of HPC and AI frameworks, packaged up neatly in Kubernetes containers, that are the foundation of simulation, modeling, and machine learning applications.
Nvidia agreed to acquire Mellanox Technologies in March 2019 as much for its networking software – literally the bits but also the expertise embodied in its employees – as for its switch ASICs and network interface cards, and almost immediately after that deal closed it acquired Cumulus Networks in May 2020 for much the same reason. Expertise is more important than any particular configuration of bits we call software. With Mellanox and Cumulus, Nvidia has a deep bench of expertise in interconnects and network operating systems, which complements its own very good SerDes design team that, among other things, created the NVLink interface and the NVSwitch memory area network – well, that’s what it is, after all – to lash together GPUs into a kind of shared memory SuperGPU.
More recently, Nvidia acquired Bright Computing for its eponymous cluster configuration and management tool, and even more recently Nvidia has pushed out into storage with the rapid-fire acquisitions of distributed block storage provider Excelero and distributed object storage maker SwiftStack. We still think that Nvidia needs high performance file systems if it wants to own a complete stack, and we also think the Excelero and SwiftStack deals were more about getting people than they were about getting any particular piece of software-defined storage. For all we know, Nvidia is going to come up with a file system, with object or block underpinnings, all of its own and tune it for the HPC, AI, and data analytics workloads it chases in the datacenter.
So what else is Nvidia missing in its software stack? Well, a low-level operating system – which of course has to be Linux – to interface between its systems software and the iron, and a high-level operating system – which probably means Kubernetes, but Docker and Hashi Stack are also options – to containerize and pod microservices and denser applications. If not Kubernetes, Docker is probably a bad idea considering how far it has fallen from grace as Kubernetes has ascended as the new platform abstraction of choice, but Hashi Stack is a reasonable alternative to Kubernetes for managing containerized applications at scale, even if it would be an expensive acquisition with HashiCorp having a market capitalization of $5.6 billion. (That’s down from $13.3 billion in the wake of HashiCorp’s initial public offering in December 2021, so it just got a lot cheaper.)
When rumors were swirling around after Pat Gelsinger jumped chief executive officer roles from VMware to Intel that Intel might buy VMware, we suggested that maybe Nvidia should buy VMware if the Arm deal fell through. Which it did, but Nvidia does not seem to be making a move on VMware, which is good because with a market capitalization of $47.7 billion as we go to press, VMware is way too expensive. And Intel has to save whatever tens of billions of dollars it can scrounge together to build the foundries it should have been building in the past decade. Nvidia could have made Arm bigger and stronger, we think, but will itself be more nimble against competitors in CPU and GPU compute as well as networking and now storage without having to digest and defend the Arm acquisition.
Flush with money – Nvidia had nearly $20 billion in cash in the bank as it exited its fiscal 2022 year in January – the company should add another block to its software stack by getting control of Linux and Kubernetes distributions.
IBM snapped up Red Hat back in October 2018 for $34 billion, so that company is off the negotiating table. Canonical, which is owned by billionaire Mark Shuttleworth, has put together a very respectable Ubuntu Server distribution of Linux with all kinds of goodies like bare metal and Kubernetes support, and would fit nicely with the Nvidia stack.
But there are two executives who sell stuff into the datacenter that also see green. One of them, of course, is Jensen Huang, co-founder and chief executive officer at Nvidia, and the other one is Melissa Di Donato, chief executive officer at SUSE, the longtime supplier of enterprise Linux and now a fairly complete Kubernetes platform.
(The green for Nvidia symbolizes growth and change, and the eye is looking to the future; for SUSE, the green comes from the chameleon, which adapts to its environment, and if you look closely, that chameleon has Golden Ratio proportions and is downright friendly even though its eye misses nothing.)
SUSE is perhaps a better fit than Canonical, and is actually acquirable after a long line of acquisitions: it was bought by Novell in 2003, Attachmate in 2010, Micro Focus in 2014, and EQT Partners, a Swedish private equity firm, which paid $2.5 billion for SUSE in 2018. In May 2021, EQT took SUSE public on the Frankfurt Stock Exchange in Germany, raising $1.35 billion and giving SUSE a market capitalization of $6.1 billion at the time.
We know about SUSE in terms of its financials, because it went public, and have compiled the salient financial details it gives out in this table:
These numbers allow us to look at SUSE and make comparisons over the trailing twelve months, and if you do that, SUSE had $596.8 million in sales, up 15.1 percent, and had earnings before interest, taxes, depreciation, and amortization (EBITDA) of $203.7 million, up 3.6 percent. The core Linux business accounted for $517.4 million in sales in the trailing twelve months, up 10 percent, and the emerging business, which is mostly Kubernetes with Rancher management extensions, NeuVector container security, and Harvester hyperconverged distributed storage for container platforms, accounted for the rest. (Harvester is based on the Longhorn distributed block storage system developed by Rancher Labs, which SUSE acquired in July 2020. We covered Rancher in detail a number of years ago.)
Nvidia would not buy SUSE because it needs the revenue stream. At least it doesn’t need it now. Rather, Nvidia would do such a deal, with a decent premium over the company’s current $5.1 billion market capitalization, because it wants a team of 2,400 Linux experts, including engineers and sales people, who could be the foundation of a full Linux-Kubernetes stack for all of the higher level software that Nvidia has already created and even more that it will create in the years to come.
We know about the tight partnership between Nvidia and VMware, and we are well aware that VMware server virtualization is the substrate on a majority of the systems in the enterprise world. And Nvidia is packaging up its software so it can run on the Tanzu variant of Kubernetes running atop the ESXi hypervisor and in conjunction with NSX virtual networking and vSAN virtual storage from VMware. We also know that Nvidia has been working with VMware on its “Project Monterey” port of the ESXi hypervisor to run on the BlueField family of DPUs from Nvidia. (Three times my brain wanted to type “moneytree” there. . . . Hmmm.)
So what? Nvidia will partner with competitors, as it has been doing for years. It makes servers and it helps others make servers. It can control and optimize its own Linux operating system and Kubernetes distribution while at the same time allowing its software stack to run on Red Hat Enterprise Linux and Ubuntu Server and their respective Kubernetes distributions, or tuning it for the Linux distros and Kubernetes container services available on the biggest clouds.
One doesn’t negate the other. But there are obvious advantages in terms of performance, ease of deployment, and pricing control that come from having a complete stack. Not everyone will want to pay the VMware premium – which is high, mind you – as Nutanix learned the hard way in the hyperconverged storage area, compelling it to create its own variant of KVM. VMware ESXi is still underpinning a lot of Nutanix storage, but for those who are price sensitive, the Acropolis stack from Nutanix helps cushion the blow of the software price tag for HCI. This is no different.