It does not happen very often in the history of business that an orthogonal product is invented that almost immediately doubles the revenue pool of a market and has the prospect of tripling it over the next handful of years. But that is precisely what GenAI has done for the information technology sector.
It is an amazing thing to behold, a second wave of computing overlaying and possibly replacing a lot of the functionality of the original wave.
The question is, will this GenAI boom just allow the big to get bigger, or will there be a proliferation of suppliers in the AI stacks of the world, a true Cambrian explosion? We are optimistic at The Next Platform, and favor competition and diversification wherever and whenever it is possible. But economic pressure always forces a consolidation and a die-off, and only a few – we will not call them the strongest – survive. Sometimes it is the small or the clever that survives.
If you look at the ever-increasing amounts of money that are pooling in the AI market, both venture funding for startups and private equity for projects, there has never been a better time to pitch a new technology that has some hope of improving the compute, storage, or networking hardware that is the foundation of large language models. Every day, there seems to be another company that is raking in the dough from investors to provide its variation on the AI theme.
While only a few AI platforms can survive to claim a reasonable share of the vast sums that are being thrown at AI infrastructure projects over the long haul, the good news is that the IT sector has learned a few lessons. Despite the desire of companies to have vertically integrated stacks, from chips all the way up through racks to software, there are very few such stacks, and others can still emerge that will allow different compute, networking, and storage to plug into them.
As was the case with traditional supercomputing for running HPC simulations and models, there is a nationalistic aspect to all of this, too, and hence the idea of data sovereignty and computational sovereignty – having a nation control, through its labs or its indigenous chip and system makers, the hardware supporting its codes – is top of mind outside of the United States, which utterly dominates AI processing in terms of architecture and capacity. But that won’t last forever. The rest of the world will catch up and spend its share, and countries and their companies will often pay a premium to control their own fates and not have projects delayed or denied because a foreign country has decided to deny them access to technology for national security reasons.
It is pretty clear that AI is a national security issue for the major nations of the world. The issue is that designing AI hardware from top to bottom is an expensive proposition, and there is really only one foundry that can make and package the compute engine and networking chips, only three suppliers of HBM and high-end DRAM, and the capacity for main memory and flash memory and advanced packaging is pretty tight.
Given this, it will be very hard for the upstarts to wedge their way into the market, even if they are backed by SoftBank/Arm or a sovereign wealth fund in the Middle East or the South Korean or Japanese government. China can do what it has been doing: Make it up in volume with the chips that SMIC can make while being scrappy about getting its hands on memory. Eventually, China will come up with a new memory scheme that it can manufacture – stacked LPDDR memory or Z-Angle memory or a large number of banks of HBM3E memory attached to relatively modest AI compute engines – and just scale it all out.
But know this: Platforms always get undercut by newer, cheaper platforms, and in recent decades, distributed computing has been the means of doing this. We have no reason to believe it will not happen again – particularly with the enormous revenues and even more amazing profits that Nvidia is generating. IBM was raking in money hand over fist with the System/360 and then the System/370 mainframes six and then five decades ago, too, and vanquished just about all of its competition in mainframes except a very pesky Amdahl (eventually teamed with Fujitsu) and a persistent Hitachi – both running IBM’s own mainframe software because the antitrust authorities of the world said IBM had to.
This pattern may repeat itself, with governments of the world deciding that the CUDA-X stack has to be able to run on other AI accelerators, or that it is absolutely permissible to create a bug-for-bug compatible clone of an Nvidia GPU. Which would be about the time Nvidia decides to roll out a new Groq-ish, non-GPU architecture for inference some years hence, should this come to pass.
It is not hard to see what is already happening. The hyperscalers and the cloud builders can afford to make their own CPUs and their own AI XPUs, and they can pay the iron price – hell, the gold price – for HBM memory or CoWoS-L packaging or whatever they need to make an accelerator that is definitely less expensive than paying for a “Blackwell” or “Rubin” GPU that can do so much more than an AI XPU. (Run 64-bit simulations, crunch data analytics, do visualization and graphics, etc. . . . ) They started with Arm-based server CPUs several years ago and are moving into AI XPUs. (Meta Platforms seems to have an interest in RISC-V CPUs and GPUs based on its acquisition of Rivos last fall, but it is not clear how serious it is. This may be more about acquiring a team than endorsing an architecture.)
The hyperscalers and cloud builders can do this not only because they make money on advertising and media, but because they also need the kind of volumes that make it economical to design and manufacture a compute engine independent of an Intel, AMD, or Nvidia. These companies are, in effect, platform providers akin to the proprietary minicomputer and Unix system makers of days gone by, and in many cases, they have integrated hardware and systems software stacks just like IBM, Sun Microsystems, Digital Equipment, Data General, Hewlett Packard, Siemens, Fujitsu, Hitachi, NEC, and a slew of others used to build. The clouds are OEMs for rental access for IaaS compute and the hyperscalers are OEMs for rental access for SaaS compute and sometimes PaaS compute, in a way.
It is this vertical integration that matters, and any AI compute engine designer that hopes to have a business has to somehow get its devices into an integrated platform like this at a scale that makes it economically feasible. The alternative – being one of many suppliers in an Open Compute rackscale system or its analogs from AMD, for example – is another way in.
But ultimately, this is about volume, and that is why you see Anthropic and Meta Platforms using datacenters full of TPUs or Anthropic and OpenAI using AWS Trainiums. And it is why OpenAI has tested Google TPUs and is still developing its “Titan” inference chip in conjunction with Broadcom, to keep the heat on Nvidia and AMD and their respective GPUs and on Google and AWS as well. It is also why the Tesla-SpaceX-xAI triumvirate is blowing the dust off its “Dojo” effort and promising to do its own AI accelerator. Just saying that is a negotiating tactic to get cheaper GPUs.
Everybody is competing and cooperating with everyone else, because the scale of compute is the advantage right now. And companies will get scale any way they can. And despite what anyone says, the limiting factor is not getting compute engine wafers back from the fab, it is getting HBM memory and the advanced packaging to attach it to compute engines, and getting the power to turn them on.
What we know for sure is this: The economic forces that always define markets will not allow there to be only one vendor of AI stacks. But those same economic forces will not allow there to be dozens of them, either. It is the way of all things.
Having said all that, we are keeping an eye out for differentiated innovation in both AI models and in XPU architectures to see if some upstart can shake things up. We strongly suspect that many of them can, and it is these innovations that drive the economic forces that in turn shape markets. There is more tectonics ahead.