The related but distinct HPC and AI markets gave Nvidia a taste for building systems, and it looks like the company wants to control more of the hardware and systems software stack than it currently does given that it is willing to shell out $6.9 billion – just about all of the cash it has on hand – to acquire high-end networking equipment provider and long-time partner Mellanox Technologies.
Every public company is in play – that is the nature of being publicly traded – but since the fall of 2017, when it came to light through activist investor Starboard Value that Mellanox refused the advances of Marvell (which spent $6 billion to acquire Cavium instead), Starboard has been agitating for Mellanox to find a suitor so the activist investor can cash out and get rich. These are not necessarily sound reasons to have one company acquire another, but that is the nature of this part of capitalism.
We are hopeful that the combination of Nvidia and Mellanox will result in something in the datacenter that is greater than the sum of its parts. And in fact, you will recall that we actually advocated for getting the whole OpenPower band together – IBM, Nvidia, Mellanox, and Xilinx – to create a vertically integrated hardware component maker that could counterbalance what was the growing hegemony of Intel in the datacenter at that time. (Issues with its 10 nanometer chip manufacturing have humbled Intel a bit in the past year and a half, though.) This OpenPower collective might have been a lot less expensive to put together back in early 2017 compared to what Nvidia is paying just to buy Mellanox today.
While we think that Nvidia can figure out how to leverage Mellanox to become a bigger player in the datacenter, there is no shortage of examples of compute and networking mashups that did not make as much sense in practice as they did on paper when they were first done. We can think of several different ways this might play out, and we will talk about a few of them in a moment. One thing is for sure: Neither Nvidia nor Mellanox will say anything very specific about their plans until the deal is closed sometime before the end of the calendar year.
When AMD bought microserver startup SeaMicro back in February 2012 for $334 million, it was looking to add networking to its compute engines, but it also ended up competing with the server customers that bought its Opteron processors. The jury is still out on Intel’s acquisition of Fulcrum Microsystems for an undisclosed sum back in July 2011 – well, actually, Fulcrum Ethernet ASICs have not been seen since the deal went down – and it is similarly unclear how Intel will ultimately leverage the “Gemini” and “Aries” interconnect businesses that it bought from Cray for $140 million back in April 2012. Intel’s acquisition of InfiniBand vendor QLogic back in January 2012 for $125 million has fared better, giving the world’s dominant maker of server processors something to sell against the InfiniBand fabrics from Mellanox, and some of the secret sauce in the Aries interconnect was supposed to be grafted onto future Omni-Path interconnects from Intel. (Intel insists that Omni-Path is compatible with but distinct from InfiniBand.)
It made sense to a certain degree to combine Marvell and Mellanox, but that was really about eliminating some networking competition as well as augmenting compute capabilities between the two companies, who are both licensees of the Arm architecture from Arm Holdings, the chip unit of the SoftBank conglomerate in Japan. It never made much sense for Microsoft to be a vendor of Mellanox gear, which is sold to its rivals in the hyperscale and public cloud arena as well as to its millions of Windows Server customers. The case for Mellanox and Xilinx to team up was, we think, only interesting in a broader vertical integration of a datacenter hardware stack – that stack still needed CPU and GPU compute. And Intel acquiring Mellanox for $5.5 billion to $6 billion, as was rumored as 2019 got underway, only made sense if Intel was openly admitting that its Ethernet business (with no presence in switching but with lots of server adapters) needed to be revitalized from scratch and the Omni-Path onload model of networking was not as good as the InfiniBand offload model (something that Intel vehemently denies). Intel buying Mellanox might have been as much about keeping technology out of play among its competitors in the HPC and AI markets as having different technology to sell itself. All we know is that supercomputer maker Cray decided to go its own way with the “Slingshot” HPC-style Ethernet interconnect it is cooking up, breaking free from dependence on either Mellanox or Intel. Don’t get the wrong idea: Cray will happily resell Intel or Mellanox interconnects if that is what customers want.
We don’t know how close Intel was to actually doing a deal with Mellanox – predictably, the two companies did not comment on any possible deal – but Nvidia clearly wanted Mellanox more than Intel did, based on the extra $900 million to $1.4 billion it paid over the rumored bags of cash that Intel had set aside for the supposed acquisition of Mellanox.
What we can say is that Mellanox has come a long way in 2018 to grow revenues and to squeeze some profits out of that growth, which we discussed back in January. We will also miss the very detailed financial presentations that Mellanox gave to describe its business, which enabled us to see the trends in InfiniBand and Ethernet networking in the large datacenters that we care about here at The Next Platform. But with Mellanox being a $1 billion company and Nvidia being an order of magnitude larger, we do not expect this fine-grained detail on what Mellanox is selling to be presented once Nvidia takes over Mellanox after shareholders and regulators give it the nod.
A side note: Intel would have had a much harder time getting regulatory approval, we think, because it would have had total control of InfiniBand networking had it bought Mellanox. But even then, InfiniBand only drove half of the revenues at Mellanox, so it is a relatively small business compared to the total addressable market for datacenter networking. And we never liked the idea of having the two InfiniBands under the same roof. We like competition, and so do customers, since it drives innovation as well as price/performance. To be blunt, the relative monopolies that Intel has enjoyed in CPU compute and Nvidia has enjoyed in GPU compute made them both rich, but prices have not come down as fast as performance has gone up. And now both Intel and Nvidia are facing credible competition – both from AMD, as it turns out. Nvidia, thanks to the CUDA software stack (which AMD cannot match), has a much more unassailable position than does Intel with Xeon CPUs (where an X86 application just runs on either a Xeon or an Epyc).
To do the deal, Nvidia is tapping into its $7.4 billion in cash, rather than take on debt, and will have approximately $500 million left over when the deal is done. But the other thing to consider is that the combined companies should generate around $13 billion in revenues and maybe $3.5 billion in net income in fiscal 2020, which ends next January. (That’s our estimate.) Basically, Nvidia is throwing off enough cash that it will be able to build the pile up pretty quickly. So Mellanox is commanding a pretty hefty premium: It had a market capitalization of around $2.2 billion, or around 2.5X revenues, before Starboard started pushing for the company to find a suitor, and the deal is around 3.1X that market capitalization and about 5.5X expected revenues for 2019. But here’s the thing: It would cost billions of dollars to make a networking business or to buy a different one, and Nvidia co-founder and chief executive officer Jensen Huang understands that networking, more than compute, is the future of the datacenter. Or, to be more precise, it is the hard part, and it is better to spend the money on Mellanox now than to wish some years hence that it had.
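For readers who want to check the deal arithmetic, here is a minimal Python sketch that reruns the figures quoted in this article. The inputs are the rounded numbers cited above; the variable names are ours, and the two implied revenue figures are back-calculated from the quoted multiples rather than taken from any company filing, so treat them as rough.

```python
# All inputs are rounded figures quoted in the article, in billions of US dollars.
DEAL_PRICE = 6.9                  # what Nvidia is paying for Mellanox
NVIDIA_CASH = 7.4                 # Nvidia cash on hand
PRE_STARBOARD_MARKET_CAP = 2.2    # Mellanox market cap before Starboard's push
RUMORED_INTEL_BID = (5.5, 6.0)    # low and high end of the rumored Intel offer

# Cash left over after an all-cash deal.
cash_left = round(NVIDIA_CASH - DEAL_PRICE, 1)

# Deal price as a multiple of the pre-Starboard market capitalization.
premium_multiple = round(DEAL_PRICE / PRE_STARBOARD_MARKET_CAP, 1)

# How much more Nvidia paid than the rumored Intel range (smallest gap first).
extra_over_intel = tuple(
    round(DEAL_PRICE - bid, 1) for bid in sorted(RUMORED_INTEL_BID, reverse=True)
)

# Revenues implied by the quoted multiples (back-calculated, so approximate).
implied_prior_revenue = round(PRE_STARBOARD_MARKET_CAP / 2.5, 2)
implied_2019_revenue = round(DEAL_PRICE / 5.5, 2)

print(f"Cash left after the deal:      ${cash_left}B")
print(f"Multiple of old market cap:    {premium_multiple}X")
print(f"Extra over rumored Intel bid:  ${extra_over_intel[0]}B to ${extra_over_intel[1]}B")
print(f"Implied earlier revenue base:  ${implied_prior_revenue}B")
print(f"Implied 2019 revenue estimate: ${implied_2019_revenue}B")
```

The results line up with the roughly $500 million of leftover cash, the 3.1X multiple, and the $900 million to $1.4 billion gap over the rumored Intel offer cited in this article.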
In a conference call with journalists deep in the HPC and AI sectors, Huang explained at length the reason why Nvidia is doing the Mellanox acquisition. Because Huang is one of the more eloquent people in the IT sector, there is no reason not to quote directly and completely:
“The strategy is doubling down on datacenters, and we are combining and uniting two leaders in high performance computing technologies. We are focused on accelerated computing for high performance computing, and Mellanox is focused on networking and storage for high performance computing, and we have combined the two companies under one roof. Our vision is that datacenters are the most important computers in the world today, and that in the future, as workloads continue to change – which is really triggered by artificial intelligence and data analytics – that future datacenters of all kinds will be built like high performance computers. Hyperscale datacenters were really created to provision services and lightweight computing to billions of people. But over the past several years, the emergence of artificial intelligence and machine learning and data analytics has put so much load on the datacenters, and the reason is that the data size and the compute size is so great that it doesn’t fit on one computer. So it has to be distributed on multiple computers and the high performance connectivity to allow these computers to work together is becoming more and more important. This is why Mellanox has grown so well, and why people are talking about SmartNICs and intelligent fabrics and software defined networks. All of those conversations lead to the same place, and that is a future where the datacenter is a giant compute engine that will be coherent – and it will allow for many people to still share it – but allow for few people to run very large applications on them as well. We believe that in the future of datacenters, the compute will not start and end at the server, but extend out into the network and the network itself will become part of the computing fabric. In the long term, I think we have the ability to create datacenter-scale computing architectures.”
The immediate question is what will Nvidia do with Mellanox once it owns it? Well, the most obvious answer is not anything that screws up the business of selling switch and adapter ASICs, as well as finished switches and adapters, to hyperscalers and cloud builders that buy directly or large enterprises and HPC centers that work through OEM partners. Nvidia sells server components as well as complete systems today, and Mellanox does the same for switching products, so their philosophies of co-opetition with their channels are consistent.
Above and beyond that, we expect a few things.
We have NVM-Express over Fabrics transforming the nature of storage in the datacenter, and we think it is not a stretch at all for Nvidia to take the NVLink memory atomics protocol that is used to lash the memories of multiple GPU accelerators together inside of a server and have it span across multiple servers over an RDMA-capable InfiniBand or Ethernet switched fabric. This is the next logical thing, and more importantly, could open up NVLink attachment for GPUs to CPUs other than IBM’s Power8+ and Power9 processors.
We also think that the world wants an Arm server processor that has tight couplings with interconnects and GPU and FPGA accelerators, and it would be interesting to see Nvidia actually grab the Neoverse roadmap from Arm and start driving along it, providing the CPU compute alongside the GPU compute. This makes as much sense, we think, as Nvidia investing in Tegra Arm processors for embedded and client uses. Huang did not seem interested in this idea, even if the server CPU industry could use some more competition and a truly big player in Arm server chips:
“The great discovery that we made 26 years ago was that parallel computing was a show stopper, but that accelerated computing was the path forward,” explained Huang in reference to that CPU question. “And the reason for that is that there is always sequential code and parallel code, and you cannot parallelize everything – not everything in life, not in nature, and there are some things that just depend on the previous state and there is no other way around it. These two types of processing are going to be here to stay. With accelerated computing, we don’t suffer from Amdahl’s Law – we obey it, and the thing that you don’t accelerate becomes the critical path. We believe in fast CPUs, and that is why we work with all of the world’s fastest CPU makers – IBM, Intel, AMD, Arm. Most people think we are antagonistic, but it is just not true. We try to support everybody’s CPU the best we can, and the reason we have not built a CPU is because I want to focus our R&D and engineering on where we can get X factors of improvement. I think that is a sensible thing. If we would pour a lot of money into CPUs, the X factor that you get after five years is 15 percent. And these companies battle over that 15 percent, and since they are doing such a good job, we can focus on the places where people are not investing that get big X factors.”
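Huang’s point about Amdahl’s Law can be made concrete with a few lines of Python. This is a sketch of the textbook formula, not anything Nvidia has published; the function name and the example fractions are our own illustrative assumptions.

```python
def amdahl_speedup(accelerated_fraction: float, accel_factor: float) -> float:
    """Overall speedup when only part of a workload is accelerated.

    accelerated_fraction: share of the original runtime that can be offloaded (0 to 1).
    accel_factor: how many times faster the offloaded part runs on the accelerator.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if accel_factor <= 0:
        raise ValueError("accel_factor must be positive")
    sequential_fraction = 1.0 - accelerated_fraction
    return 1.0 / (sequential_fraction + accelerated_fraction / accel_factor)


# Hypothetical workload: 90 percent of the runtime is parallel code.
for factor in (10, 100, 1000):
    print(f"{factor:>5}X accelerator -> {amdahl_speedup(0.9, factor):.2f}X overall")

# The ceiling is set by the sequential code that is left on the CPU.
print(f"Upper bound with 10 percent sequential code: {1.0 / (1.0 - 0.9):.0f}X")
```

No matter how fast the accelerator gets, the overall speedup in this hypothetical case never passes 10X, which is why the code that is not accelerated becomes the critical path and why fast CPUs still matter.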
So that’s that. Until it isn’t. We shall see.