If Amazon is going to make you pay for the custom AI advantage that it wants to build over rivals Google and Microsoft, then it needs to have the best models possible running on its homegrown accelerators, just as Google already does with its Gemini LLMs on its own TPUs and as Microsoft eventually will with OpenAI’s GPT models on its Maia accelerators.
And that is why last week’s additional $4 billion investment in Anthropic by Amazon was absolutely predictable. Amazon is, of course, the parent company of the Amazon Web Services cloud and the biggest beneficiary of the massive spending you are all doing on that cloud as you pay a premium for a premium IT product. For this reason and others, we can expect to see bags of money round-tripping between Amazon and Anthropic, and between Microsoft and OpenAI, for years to come – at least until Anthropic and OpenAI get rich enough to create their own AI accelerators and build their own infrastructure.
There are far-reaching and holistic strategies at work here between the model builders and the cloud builders. We explained part of this back in August in The Sugar Daddy Boomerang Effect: How AI Investments Puff Up The Clouds, when we went through the finances for Microsoft Azure and Amazon Web Services and talked about their parent companies’ $13 billion and $4 billion in investments, respectively, in OpenAI and in Anthropic. And we wondered out loud just how much of the increase in cloud computing spending in 2024 was due to the investments made in AI startups by Amazon and Microsoft. We think it is a big number, and we also think that given this, the number should be disclosed in their financials. Google parent Alphabet has invested $2.55 billion in Anthropic so far and is hedging its bets on large language models.
Wall Street is simply happy that someone is still investing in AI and that this boom town just keeps on a-growing. Eventually, all of this AI investment will have to provide a return, and thus far people are generally hopeful that it will, even as they are a little worried about the impact that AI will have on the knowledge economy.
We did a deep dive on Anthropic back in September 2023, and we are not going to go over the history all over again. At the time, AWS put $1.25 billion into Anthropic, and the two agreed to start porting the Claude family of LLMs to the cloud builder’s homegrown Trainium AI training chips and Inferentia AI inference chips. We did a deep dive into the Trainium and Inferentia chips back in December 2023, and explained how AWS could undercut Nvidia GPUs for AI training and inference with its strategy. In March of this year, Amazon kicked in another $2.75 billion, and last week it ponied up a fresh $4 billion. At the time of that March round, using the price of renting Nvidia “Hopper” H100 accelerators on the cloud, we calculated that the $4 billion invested to that point only covered the cost of training around three dozen LLMs with on the order of 2 trillion parameters each in a 90-day timeframe.
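For what it is worth, here is a minimal sketch of that back-of-the-envelope math, assuming an on-demand H100 rental rate of around $4 per GPU-hour – that rate is our assumption for illustration, not a quoted AWS price:

```python
# Back-of-the-envelope check on the "three dozen training runs" figure.
# The GPU-hour rate is an assumption for illustration, not a quoted price.

H100_DOLLARS_PER_GPU_HOUR = 4.00     # assumed on-demand cloud rate
RUN_DAYS = 90                        # training window cited above
BUDGET = 4_000_000_000               # the $4 billion invested at that point
NUM_RUNS = 36                        # "around three dozen" training runs

hours_per_run = RUN_DAYS * 24                  # 2,160 hours
budget_per_run = BUDGET / NUM_RUNS             # ~$111 million per run
gpus_per_run = budget_per_run / (H100_DOLLARS_PER_GPU_HOUR * hours_per_run)

print(f"Budget per training run: ${budget_per_run / 1e6:,.0f} million")
print(f"H100s rentable per run:  {gpus_per_run:,.0f}")
```

At that assumed rate, each run pencils out to roughly 12,900 H100s held for the full 90 days, which is a plausible cluster size for a model in the 2 trillion parameter class.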
With credible homegrown AI chips and enough volume to get the unit cost down, AWS can provide better bang for the buck on AI clusters than it does using very pricey Nvidia GPUs. And with the Claude models tuned up for Trainium and Inferentia, Anthropic can be the biggest customer for those chips as it scales out its models to make them better and more accurate. AWS can keep iterating on hardware to match the needs of Anthropic software, creating a virtuous loop. That loop can then be extended out to the Bedrock AI platform service operated by AWS, which came out of beta a year ago, supports Claude as well as a slew of other LLMs, and already has tens of thousands of customers paying for Claude on the cloud.
At some point, the revenue stream from Claude models running on Bedrock will be big enough to throw off enough profit to actually cover the AI training and inference costs of Amazon, the retailer and entertainment company. The analogous crossover point for generic datacenter infrastructure probably occurred several years ago, although it is hard to calculate precisely.
That is the genius of running a very large cloud while at the same time being in another business – in the case of Microsoft, it is distributing software and tracking its use, and with Amazon, it is selling stuff online and getting it to us either over the roads or over the Internet. Interestingly, the IT infrastructure needed for Google’s search, video streaming, and advertising businesses is still so large, and Google Cloud is still not profitable enough, for this effect to take hold at the Chocolate Factory. But, we think, that day will inevitably come.
And, as we said, we think investments by the big clouds into the big LLM providers will continue apace even as the latter keep trying to raise money independently and keep their cloud sugar daddies as minority shareholders.
This is the trick, and it is probably also one of the reasons why Elon Musk decided not to build the 100,000 GPU “Colossus” machine at xAI in conjunction with Oracle Cloud, and instead took over a former Electrolux vacuum cleaner factory outside of Memphis, Tennessee and had Supermicro and Dell build the iron for Colossus. Musk knows full well, thanks to Tesla and SpaceX, that cloud AI is a lot more expensive than on-premises AI. And in the long run, we would not be surprised to see the “Dojo” AI engines and related systems being created by Tesla used across the four companies controlled by Musk. (That includes X, formerly known as Twitter, in addition to xAI, Tesla, and SpaceX.) Why not? Musk clearly wants to control the fates of these companies, and is rich enough to invest in building his own platform for them.
It would be funnier still to see Dojo spun out of Tesla and selling technology to the other Musk companies. Why not?
For OpenAI and Anthropic, independence depends on being able to raise successively larger “up round” financings, which raise their valuations and, because the new shares go to other investors, dilute the percentage stakes held by their cloud sugar daddies.
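Here is a minimal sketch of that dilution mechanic, with entirely hypothetical numbers:

```python
# How an up round dilutes an existing investor's percentage stake while
# (usually) raising the dollar value of that stake. All numbers are
# hypothetical, chosen only to illustrate the mechanics.

pre_money = 36_000_000_000    # hypothetical pre-money valuation
new_money = 4_000_000_000     # hypothetical fresh capital raised
old_stake = 0.40              # hypothetical existing stake before the round

post_money = pre_money + new_money
new_investor_share = new_money / post_money           # 10% of the company
diluted_stake = old_stake * (1 - new_investor_share)  # 36% after the round

print(f"Stake falls from {old_stake:.0%} to {diluted_stake:.0%}, but is now "
      f"worth ${diluted_stake * post_money / 1e9:.1f} billion of a "
      f"${post_money / 1e9:.0f} billion company.")
```

The sugar daddy’s slice of the pie gets thinner, but the pie gets bigger faster, which is why the clouds keep writing checks.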
Poking around the Internet, we see Anthropic has a valuation of only $18 billion, which seems pretty small for AWS to still have a minority stake in the company. Amazon’s cumulative $8 billion reckoned against that valuation works out to a 44.4 percent stake by our math. If you reckon the AWS investment against the $13.7 billion in total funding Anthropic has raised, then that is a 58.4 percent stake, which would not make AWS a minority shareholder. (And hence this can’t be the way the companies are calculating their stakes, we believe.)
OpenAI just raised $6.6 billion last month and has a valuation of $157 billion; Microsoft, Nvidia, SoftBank, and a bunch of VCs have kicked in a total of $21.9 billion in funding for OpenAI, and it is on track for maybe $3.7 billion in revenues this year but is also expected to lose $5 billion. If the Microsoft investments were booked as funding – and we are not suggesting that they are – it would have at least a 59.3 percent stake in OpenAI. This clearly has not happened, because if it had, Microsoft would own OpenAI and behave accordingly. But somewhere north of $13 billion against the $157 billion valuation is no less than an 8.3 percent stake.
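If you want to check our arithmetic, here is the stake math both ways, using the figures cited above; as we said, neither method reflects how the actual cap tables are constructed:

```python
# Implied stakes reckoned two ways: investment against latest valuation,
# and investment against total funding raised. Figures (in billions) are
# the ones cited above; neither method matches the real cap tables.

def stake(invested_b, base_b):
    """Naive implied stake: dollars in divided by the chosen base."""
    return invested_b / base_b

# Anthropic: Amazon's cumulative $8B ($1.25B + $2.75B + $4B)
print(f"Amazon vs. $18B valuation:     {stake(8.0, 18.0):.1%}")    # 44.4%
print(f"Amazon vs. $13.7B funding:     {stake(8.0, 13.7):.1%}")    # 58.4%

# OpenAI: Microsoft's roughly $13B
print(f"Microsoft vs. $21.9B funding:  {stake(13.0, 21.9):.1%}")   # 59.4%
print(f"Microsoft vs. $157B valuation: {stake(13.0, 157.0):.1%}")  # 8.3%
```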
With the fundraising going on right now at xAI, the rumor mill says that Musk’s AI company is worth somewhere between $40 billion and $45 billion. xAI has raised $11.4 billion in four rounds of funding, including a $5 billion round last week. That funding round covers the cost of the Colossus machine and its datacenter, we think, based on back-of-the-envelope math. Just buying systems with 100,000 H100 GPUs would cost on the order of $4.7 billion, including networking.
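That $4.7 billion figure is easy enough to sanity check with an assumed all-in price per GPU – our assumption, covering the server, memory, and a share of the networking, not a quoted vendor price:

```python
# Sanity check on the Colossus hardware bill. The all-in per-GPU price
# is an assumption for illustration, not a quoted vendor figure.

GPU_COUNT = 100_000
ALL_IN_DOLLARS_PER_GPU = 47_000   # assumed: server, memory, networking share
LATEST_ROUND = 5_000_000_000      # the $5 billion round cited above

system_cost = GPU_COUNT * ALL_IN_DOLLARS_PER_GPU
print(f"Estimated Colossus iron: ${system_cost / 1e9:.1f} billion")
print(f"Round left over:         ${(LATEST_ROUND - system_cost) / 1e9:.1f} billion")
```

At that assumed price, the iron alone accounts for $4.7 billion, with the few hundred million dollars remaining in the round going toward the facility itself.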
Maybe Musk will build a cloud? It seems inevitable, once you see how the math works. Then all of you could use the xCloud (as we might call it) and subsidize the data processing needs of Tesla, SpaceX, xAI, and X.
At some point, we also think it is inevitable that Nvidia will build a cloud. The profits are just too good to pass up.
Maybe Jensen Huang, co-founder and chief executive officer of Nvidia, and Musk can do it together? Made you laugh. But stranger things have happened in this IT racket. That one probably will not. But imagine if they built competing clouds. . . .