And you thought the toilet paper shortages at the beginning of the coronavirus pandemic were bad, or that board and plywood prices were getting insane at the local hardware depot where you already spend too much money. The laws of supply and demand – and the resiliency of the global economy – are going to be severely stress tested by the semiconductor shortage.
An eventual rise in chip costs was already expected because the rate of shrinkage in transistor sizes – and therefore the proportional lowering in the cost of the transistors – is slowing as Moore’s Law runs out of gas. That slowdown is in large part due to the very small sizes of these transistors relative to the size of a copper or silicon atom, but it is also driven by the increasing costs of extreme ultraviolet (EUV) lithography equipment and the number of steps it takes to etch a chip using EUV techniques.
To put it bluntly, chips were always going to get more expensive on their own, even if enough fabs could be built to ensure supply. That’s what the aging of Moore’s Law – the old man ain’t quite dead yet – means. If the transistors can’t get smaller – they already can’t get faster thanks to the death of Dennard Scaling a decade ago – then the chips will have to get bigger and hotter eventually, and a larger chip has inherently higher costs because the odds of getting a known good part out of any given area of the wafer decrease as the chips get larger. Yields are by necessity lower on bigger chips. Which is why you will see the industry eventually embrace chiplet architectures, first in 2D and then in 3D configurations. We have no choice.
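To make the yield argument concrete, here is a minimal sketch using the simple Poisson yield model, in which the fraction of good dies falls off exponentially with die area. The model choice, the defect density of 0.1 defects per square centimeter, and the die sizes are all illustrative assumptions on our part, not figures from any foundry.

```python
import math

def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies with zero defects under a simple Poisson model."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-area_cm2 * defects_per_cm2)

def silicon_per_good_die(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Wafer area (mm^2) consumed, on average, to get one known good die."""
    return die_area_mm2 / poisson_yield(die_area_mm2, defects_per_cm2)

# Hypothetical defect density, chosen only for illustration.
D0 = 0.1

for area in (100, 200, 400, 800):
    print(f"{area:4d} mm^2 die: yield {poisson_yield(area, D0):.1%}")

# One big monolithic die versus four chiplets covering the same total area.
monolithic = silicon_per_good_die(800, D0)
chiplets = 4 * silicon_per_good_die(200, D0)
print(f"silicon per good 800 mm^2 monolith:   {monolithic:.0f} mm^2")
print(f"silicon per four good 200 mm^2 parts: {chiplets:.0f} mm^2")
```

Under these assumptions the four small dies burn far less wafer area per working product than the one big die, which is the economic pull toward chiplets. The sketch ignores packaging, interconnect, and test costs, all of which eat into that advantage in real designs.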
But the coronavirus pandemic gave us some price hikes no one was yet expecting. First of all, the pandemic upset factory production at a number of fabs, which caused some shortages or at the very least longer lead times between orders and deliveries of the more than 1 trillion chips that are made and consumed each year.
In the case of the auto industry, which is being hit very hard right now, the wounds have been largely self-inflicted. Car sales plummeted at the beginning of the pandemic, and the car makers cut back on orders for the chips that they use in vehicles for all kinds of microcontrollers and electronic devices – which tend to be made with much older 40 nanometer and 28 nanometer technologies (or their equivalents). When car sales skyrocketed as the pandemic started to wane, the auto makers had lost their places in the foundry lines, which are now making other chips full tilt boogie, and now there are car chip shortages that are causing auto production slowdowns. This, as you might expect, is also causing car prices to rise. Somewhere between 5 percent and 10 percent over prices a year ago as best as we can figure from reading a bunch of stuff on the Internet.
Even without the coronavirus pandemic, we were already living in a world with far fewer foundries and where each generation of lithography was getting more and more expensive. As demand has risen for the most advanced types of compute engines, supply has had trouble keeping up, and CPU and GPU and FPGA makers have had to walk a fine line between making too much of their chips and wasting money or making too few of their chips and leaving money on the table. The good news for them is this: When there is too much demand chasing supply, you can get some of that money back off the table by charging more than you might otherwise for a CPU or GPU or FPGA.
This is precisely what we think has been happening for years, and we think that it is going to get worse in the coming years. It is hard to measure inflation rates on advanced chips because the increased costs of making the chips and the extra cost that comes from being conservative on chip manufacturing volumes are already built into the list prices, which cascade down to higher street prices. That’s why a high-end CPU, GPU, or FPGA costs more than $10,000 a pop already. It is not just the value of extra performance that justifies that higher cost, it is the additional value of the relative scarcity of the level or capacity of performance that a particular generation, or particular SKUs within a generation, can deliver. You could plot out generational price/performance curves for compute engines, for instance, and the cost per unit of performance might actually be flatlining or even rising, as was the case for the “Skylake” and “Cascade Lake” Xeon SP processors from Intel. Intel was doing that because it had a virtual monopoly on X86 server compute in the datacenter. It can no longer do that, of course, with AMD offering such compelling performance and price/performance with its Epyc processors.
We think chip price inflation above and beyond the level that is already built in, as illustrated above, is going to happen as too much demand chases too little supply. Hopefully it will not be as bad as what we saw with main memory and flash prices. A lot will depend on the overall demand at the advanced fabs – meaning mostly Taiwan Semiconductor Manufacturing Co with a smattering from Samsung and GlobalFoundries and Intel doing its own 10 nanometer Xeon SP thing – and how the demand plan with the foundries maps to the actual demand from customers. Just as the auto manufacturers inflicted their own wounds, more than a few compute engine suppliers or network ASIC suppliers or storage controller suppliers or whatever are going to get their demand wrong and they won’t be able to negotiate cheap supply to fill in a gap. They will have to ride it out.
That’s precisely what happened to DRAM and flash memory starting in 2017. The DRAM and flash makers were seeing huge demand from makers of client devices, which use relatively small chunks of memory and for which they could charge a premium per bit compared to the memory that they make for servers, and so they just didn’t make enough server memory. It got ugly, with memory prices doubling over the course of a year, and the memory makers were a whole lot happier than the server makers, who booked more revenues but probably the same or less profit because they had to eat some of the costs to move product. In fact, DRAM prices are back on the rise again. Samsung was anticipating a 10 percent to 20 percent increase in DRAM memory prices and a 30 percent to 35 percent increase in flash prices in 2021, which is still a big jump. But it is nothing like the doubling of DRAM prices we saw in 2018 and 2019, which radically impacted the cost of systems. If all of the DRAM and flash memory makers, which just rode down a crash, cut back on supply or even hold their capacity steady, prices will necessarily rise and here we go again. People will be deciding between compute capacity and memory capacity again. (You need both for the compute to work right.)
The thing that we all need to get used to is the idea that we will live in an inflationary environment until more capacity can be brought to bear in making chips based on old technologies as well as the most advanced ones, and that the situation will be made worse by the fact that more and more products have chips embedded in them and each product has an increasing number of chips, too, as their generations go by. Expect to see a story where some obscure controller chip causes shortages in complex systems. It is going to happen. And keep happening.
Intel might be wishing it had left more of its 32 nanometer and 22 nanometer chip making equipment up and running, which would help it to ramp up Intel Foundry Services faster and therefore help it pay for 7 nanometer and 5 nanometer foundries. That would have been nice, but the fact is that chip demand is not going to abate, and Intel is going to be able to get market share just because TSMC is really the only game in town at the high end – Samsung is trying, we know, but until we see Power10 launch and others use Samsung as their compute engine foundry, we remain understandably skeptical.
In a normal inflationary environment, we increase supply to meet demand and prices drop, or we find substitutes, for instance eating more chicken and less beef. But an advanced foundry costs between $10 billion and $12 billion and takes two to three years to bring online. And no one is going to build sub-advanced capacity when they also have to build advanced capacity – well, excepting GlobalFoundries, which has based its business on that idea in the wake of pulling the plug on its own 7 nanometer lithography back in August 2018.
When you can’t quickly increase supply and there are no substitutes because everyone is already running at capacity and it is not trivial to move lithography masks from foundry to foundry while also maintaining yields, this is a recipe for higher prices. So, brace yourself, everyone. The price/performance curves are going to bend up instead of down for a while. Maybe for a long time. And maybe forever.
And lead times are getting crazy, too. Sometimes it takes a quarter for chips to be delivered, and sometimes it takes a year. Yes, a year. And since time is also money, as John Maynard Keynes proved while sharing a whisky with Albert Einstein, you are going to pay, one way or the other.
Get some supply chain buffers into your contracts. Start playing the game like a hyperscaler.
This might cause accelerated EOL for some niche products and a huge mess for small and medium-sized companies that missed out on the opportunity to stock up or redesign. Luckily there are low-cost FPGAs with sufficient SerDes and IO capabilities, like the Lattice ECP5, that can act as substitutes for the number of PCIe1 and LPDDR1 bridging products that we might see going EOL soon. Interesting times ahead of us. Thanks for the insightful article.