How Long Before AI Servers Take Over The Market?

Timothy Prickett Morgan

7 months ago

When hyperscalers and cloud builders think about their infrastructure, they talk about megawatts and they think about the mix of serving and storage and the total capacity that is delivered in a megawatt of power. And of course they also think in terms of budgets because money is, in fact, what makes the world go around.

We like feeds and speeds and slots and watts as much as anyone, but we like money. Because money is how you keep score. So after two years of not seeing the server trackers from either IDC or Gartner, when we happened to be searching for server forecasts on Google this week and we saw this page with lots of juicy data spanning from 2022 through 2027 from IDC, we got out the trusty Excel spreadsheet and went to work.

There was some commentary on the server forecast, which is generally not made public. A summary was released in late September 2022, however, which we also happened to see a few months later and wrote about, oddly enough just like now.

Here is the IDC chart, which shows worldwide server revenue data from 2022 and the forecast for 2023 through 2027:

We don’t like this chart because of the way IDC draws the second Y axis for the growth rate: visually, growth looks like it is going to zero, when it is actually a 4.3 percent growth rate. We redid the chart and added in the 2021 data, which we had from last year:

Here is the actual data in table form:

This particular dataset doesn’t just show server revenues and forecasts five years out, but also breaks X86 and non-X86 servers apart. For the past decade that would have been pretty boring for a lot of people, given that two-thirds or so of the non-X86 iron consisted of Power Systems and System z mainframe sales from IBM and the rest was a mix of other proprietary machines and Arm servers. But with the rise of Arm servers at the hyperscalers and cloud builders, that non-X86 part of the business is getting interesting, and will become more so as RISC-V machinery becomes more normal in the decade ahead. So it is not as retro a way of thinking about it as you might think.

We actually had IBM System z and Power revenue figures for 2020 from IDC as part of its server tracker, which came to $4.98 billion, and that means Arm/Other comprised the remaining $3.87 billion in the non-X86 category. Suppose you make some assumptions about the IBM products going forward: an upgrade cycle in 2021 and 2022 for Power10 and z16 machines, another upgrade cycle for Power11 and z17 in 2025, and a general decline in revenues as the amount of compute in a Power or z processor keeps growing faster than online transaction processing and other compute demands. Then you might get IBM server hardware sales that decline gradually from just shy of $5 billion in 2020 to maybe $3.5 billion in 2026 and maybe $3.3 billion in 2027. If you do that, and use the baseline IDC data, then the Arm/Other part of the non-X86 business will grow at a very healthy clip, again depending on the binge-digest cycles of the hyperscalers and cloud builders, who nonetheless have a baseline level of consumption that is unavoidable. Arm and RISC-V servers – and we think mostly Arm servers even way out there – will be in the range of $20 billion a year. That is about a 10 percent revenue share for Arm servers, which is not the same thing as the 20 percent or so Arm server shipment share we were talking about back in January when looking at some Gartner and Wells Fargo data that went out to 2026.
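The residual arithmetic behind that split can be sketched in a few lines of Python. Only the 2020 figures ($4.98 billion for IBM, $3.87 billion for Arm/Other) and the roughly $20 billion and 10 percent figures come from the text above. The function names are ours, and the implied total market is simply what those two round numbers work out to, not an IDC figure.

```python
def residual(non_x86_total, ibm_revenue):
    """Arm/Other revenue is whatever is left of non-X86 after IBM is removed."""
    return non_x86_total - ibm_revenue


def implied_total(segment_revenue, segment_share):
    """Total market implied by a segment's revenue and its revenue share."""
    return segment_revenue / segment_share


# 2020 figures quoted in the text, in billions of dollars.
ibm_2020 = 4.98
arm_other_2020 = 3.87
non_x86_2020 = ibm_2020 + arm_other_2020  # the two pieces sum to non-X86

print(f"Non-X86 2020:   ${non_x86_2020:.2f}B")
print(f"Arm/Other 2020: ${residual(non_x86_2020, ibm_2020):.2f}B")

# Roughly $20B at roughly a 10 percent share (both rounded in the text).
print(f"Implied total market: ${implied_total(20.0, 0.10):.0f}B")
```

Because both inputs to the last line are rounded, the implied total of about $200 billion should be read as an order of magnitude, not a forecast.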

With so many hyperscalers and cloud builders working on custom Arm server CPUs and custom AI coprocessors, the options are wide open and the pressure is there to not just use X86 server CPUs and Nvidia GPUs for AI and other computationally intensive workloads.

Speaking of which, what we really want to know is how sales of AI servers – mostly for training but also for inference – are distinct from the rest of the server acquisitions. And we also want to have some sense of what effect inflation, which was let loose by server makers last year and into this year, is having on revenues. GPU inflation is a big part of that as too much demand is chasing too little supply.

IDC had this to say in its report: “Direct impact of inflation on servers was felt more strongly each subsequent quarter during 2022, with year over year ASP growth rates escalating to 29 percent year over year in the second quarter of 2023, while unit growth, which had been in the teens through most of 2022, dropped to a meager 1.4 percent in 2022 Q4, declined year over year in 2023 Q1 by 10 percent, and now by 19.9 percent in 2023 Q2.”
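Because revenue is units times average selling price, those two growth rates can be combined to see what they imply for revenue. Here is a minimal Python sketch using only the 2023 Q2 figures in the IDC quote (ASPs up 29 percent, units down 19.9 percent). The function name is ours, and the sketch assumes both rates describe the same quarter and the same set of machines.

```python
def revenue_growth(asp_growth, unit_growth):
    """Year-over-year revenue growth implied by ASP growth and unit growth.

    Revenue = units * ASP, so the two growth factors multiply.
    """
    return (1.0 + asp_growth) * (1.0 + unit_growth) - 1.0


# 2023 Q2 figures from the IDC quote above.
q2_2023 = revenue_growth(asp_growth=0.29, unit_growth=-0.199)
print(f"Implied 2023 Q2 revenue growth: {q2_2023:.1%}")
```

Under those assumptions, a nearly 20 percent drop in shipments is almost exactly offset by the 29 percent rise in prices, leaving revenue up by only about 3.3 percent.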

Those are very big shipment declines against pretty high ASP growth, driven by very expensive AI training and inference nodes with four or eight GPUs apiece that cost hundreds of thousands of dollars each. The AI and non-AI servers really need to be separated from each other because these are very different parts of the market. So we took a stab at it based on the top-line server revenues from IDC from 2020 through 2027, like this:

We realize there is some guesswork in here, but we think this is the shape of things to come and things that are happening as well as things that have happened in recent years.

The upshot is that unless something happens to slow down the growth in AI models and unless AI training and inference compute gets a lot cheaper, we think there is a non-zero chance that AI compute will comprise around half of server revenues by 2026 or 2027.

That model assumes a modest, GDP-like growth rate in non-AI server revenues each year after a pretty steep decline of 11.2 percent that started in 2022 and that will improve a bit to only a 5 percent decline in 2023. It also assumes a pretty staggering leap of nearly 5X in revenues for AI servers between 2022 and 2023 and then pretty healthy, steady state growth in the range of 20 percent in 2024 and slowing down to 15 percent in 2027. We didn’t force that growth for AI servers, but rather assumed modest growth/consumption cycles like we have seen in the past and then everything else left over was for AI servers.
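The residual approach described above can be sketched in Python as follows. The non-AI growth rates for 2022 and 2023 come from the text. The total revenue figures, the 2021 non-AI baseline, and the 3 percent stand-in for "GDP-like" growth are placeholders made up so the example runs; they are not IDC data and not the numbers in our model.

```python
def split_ai_residual(total_by_year, non_ai_base, non_ai_growth):
    """Grow non-AI revenue by assumed rates; AI revenue is whatever is left."""
    rows = []
    non_ai = non_ai_base
    for year in sorted(total_by_year):
        non_ai *= 1.0 + non_ai_growth[year]
        ai = total_by_year[year] - non_ai
        rows.append((year, non_ai, ai, ai / total_by_year[year]))
    return rows


# PLACEHOLDER totals in billions of dollars: illustrative only, not IDC data.
totals = {2022: 120.0, 2023: 130.0, 2024: 140.0,
          2025: 155.0, 2026: 170.0, 2027: 185.0}

# 2022 and 2023 rates are from the text; 3 percent is an assumed
# stand-in for "GDP-like" growth in the later years.
growth = {2022: -0.112, 2023: -0.05, 2024: 0.03,
          2025: 0.03, 2026: 0.03, 2027: 0.03}

# PLACEHOLDER non-AI revenue for 2021, the year before the decline starts.
model = split_ai_residual(totals, non_ai_base=125.0, non_ai_growth=growth)

for year, non_ai, ai, share in model:
    print(f"{year}: non-AI ${non_ai:6.1f}B  AI ${ai:6.1f}B  AI share {share:5.1%}")
```

Because AI revenue is computed as a residual, any error in the non-AI assumptions flows straight into the AI line, which is one reason to read the split as the shape of a trend and not as a precise forecast.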

So this is the explosion happening, right now. As Nvidia GPU supply goes up and prices come down, and as other brands of GPUs and other kinds of accelerators enter the market and get traction at volume, everything will level out and normalize a bit. Perhaps. And at a whole new level.

There is a question as to how much supply of AI serving the world will need, and we admit predicting that out four or five years from now is very difficult indeed. If AI accelerators remain in short supply and prices stay high, then revenue will stay high. If volumes double or triple, prices will be cut by half or two thirds and revenues will be consistent nonetheless. Perhaps. We are tossing this idea out there for comment.

Obviously, no one wants to make it up in volume when it comes to gross margins. But intense competition – such as that which Nvidia has brought upon itself by its own success – has a nasty habit of forcing companies to do just that.

It’s funny. It took a decade and a half – from 1985 to 2000 – for RISC/Unix machines and the advent of Internet technologies as well as aggressive replacements of mainframes and proprietary minicomputers to reach 45 percent share of server revenues. And it may take the same decade and a half – from 2010 through 2025 or 2011 through 2026, however you want to call it – for AI servers to comprise around 45 percent of worldwide server revenues and for the AI workload to replace or augment just about every kind of application you can think of.