
What Do We Do When Compute And Memory Stop Getting Cheaper?

The IT industry, like every other industry we suppose, is in a constant state of dealing with the next bottleneck. It is a perpetual game of Whac-A-Mole – a pessimist might say it is Sisyphean at its core, while an optimist probably tries not to think about it too hard and just deals with the system problem at hand. But what do you do with the hammer when all of the moles pop their heads up at the same time?

Or, to be more literal, as the title above suggests: What do we do when processors and their SRAM, as well as main memory, stop getting cheaper as they have for many decades? How do system architects and those building and buying datacenter infrastructure not get depressed in this post-Moore's Law era?

In the past month, we keep seeing the same flattened curve again and again, and it is beginning to feel like a haunting or a curse. We have been aware of the slowing of Moore’s Law for years and the challenge it represents for system architecture on so many fronts – compute, memory, storage, networking, you name it – and we have been cautioning that transistors are going to start getting more expensive. But what is more striking, at least during late 2022 and early 2023, is that we have seen a series of stories showing that the cost of a unit of compute, fast SRAM memory, and slow DRAM memory is holding steady and not going down as it ought to.

The first time we saw this flattening curve pop up in recent weeks was in a report by WikiChip Fuse from the International Electron Devices Meeting in December, which outlined the difficulties of shrinking SRAM cell area:

Now, this chart doesn’t talk about costs explicitly, but with SRAM accounting for somewhere between 30 percent and 40 percent of the transistors on a typical compute engine, SRAM represents a big portion of the cost of a chip, and we all know that the cost of transistors has been going up with each successive generation of processes, starting somewhere around the 7 nanometer node a few years ago. It remains to be seen if SRAM transistor density can be improved – the folks at WikiChip Fuse seemed to think that Taiwan Semiconductor Manufacturing Co would pull a rabbit out of the hat and get a density increase better than the projected 5 percent with its 3 nanometer N3E process. Even if it can do that, we still think that the odds are that the cost of SRAM will go up, not down. And if it does go down, by some miracle, it won’t be by a lot.
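To see why we think that, consider a bit of back-of-the-envelope arithmetic. The sketch below is a toy model, not TSMC's actual economics (the wafer prices, usable wafer area, and SRAM cell sizes are all assumptions for illustration), but it shows how a roughly 5 percent cell shrink paired with a pricier wafer pushes the cost per bit of SRAM up, not down:

```python
# Back-of-the-envelope cost per bit of SRAM across a node transition.
# Every number here is an illustrative assumption, not a real TSMC figure.

def sram_cost_per_mb(wafer_cost_dollars, cell_area_um2, usable_area_mm2=60_000):
    """Dollars per megabit of SRAM, ignoring yield and array overhead."""
    cells_per_mm2 = 1e6 / cell_area_um2          # 1 mm^2 = 1e6 um^2
    megabits_per_wafer = cells_per_mm2 * usable_area_mm2 / 1e6
    return wafer_cost_dollars / megabits_per_wafer

# Assumed "old node": $16,000 wafer, 0.021 um^2 high-density SRAM cell.
old = sram_cost_per_mb(16_000, 0.021)
# Assumed "new node": wafer 25 percent pricier, cell only 5 percent smaller.
new = sram_cost_per_mb(20_000, 0.021 * 0.95)

print(f"old: ${old:.4f}/Mb  new: ${new:.4f}/Mb  change: {100 * (new / old - 1):+.1f}%")
```

Plug in whatever wafer prices you believe; unless the cell shrink outruns the wafer price increase, the cost per bit rises.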

Then today, Dan Ernst, formerly an advanced technology architect at Cray and now part of the future architecture team at Microsoft Azure, put a chart out on his Twitter feed as a response to an article by Dylan Patel at SemiAnalysis. Here is the chart Ernst put out about DRAM pricing:

And here is the one Patel put out that prompted Ernst:

These two charts popped up a day after we had done our own analysis of the performance and price/performance of the Intel Xeon and Xeon SP server CPU line, in which the standard, mainstream parts have followed this pricing curve where the cost of a unit of work has flattened out:

And the chips with high core counts followed this much less aggressive price curve, where a combination of opportunistic pricing, higher packaging costs, and lower yields has forced Intel to still charge a premium for the Xeon SP processors with the most cores:

Depending on where you draw the lines, and depending on chip architecture and implementation, Moore’s Law price scaling stopped somewhere around 2016 or so. It is hard to tell with the Intel data because the company had an effective monopoly on X86 compute until 2018 or 2019. But clearly, Intel is not trying to drive down the cost of compute as it did when it moved from the 45 nanometer “Nehalem” Xeon 5500s in 2009 to the 10 nanometer “Sapphire Rapids” Xeon SPs here in 2023.

As we are fond of pointing out, Moore’s Law was not really about transistor shrinking so much as it was about reducing the cost of transistors so that more of them, running progressively faster, could be brought to bear to solve more complex and capacious computing problems. Economics drives the IT industry – density alone does not. Creating compute or memory density in and of itself doesn’t help all that much, but creating cheaper and denser devices has worked wonders over the six decades of commercial computing.
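That economic framing can be reduced to a toy model. In the sketch below, every scaling factor is an assumption made for illustration, not measured foundry data; the point is simply that once per-wafer cost grows as fast as transistor density, the cost per transistor, which is the thing Moore's Law actually kept cutting in half, goes flat:

```python
# Toy model of Moore's Law economics. Every factor is an assumption for
# illustration; the shape of the curve, not the numbers, is the point.
nodes           = ["28nm", "16nm", "7nm", "5nm", "3nm"]
density_gain    = [1.0, 2.0, 2.0, 1.8, 1.3]  # transistors/mm^2 vs prior node
wafer_cost_gain = [1.0, 1.2, 1.5, 1.6, 1.3]  # wafer price vs prior node

density, wafer_cost = 1.0, 1.0
for node, d, w in zip(nodes, density_gain, wafer_cost_gain):
    density *= d
    wafer_cost *= w
    # Relative cost per transistor: cumulative wafer cost over cumulative density.
    print(f"{node:>5}: relative cost per transistor = {wafer_cost / density:.2f}")
```

When wafer cost growth matches density growth, as it does at the tail end of that list, shrinking buys you nothing on cost, which is exactly the flattened curve we keep seeing.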

What do we do now that this cost decline has stopped in some places, and may stop on all fronts? What happens when the cost per bit goes up on switch ASICs? We know that day is coming. What do we do when compute engines have to get hotter, bigger, and more expensive? Do we just keep getting more and more parallel and build bigger and ever more expensive systems?

It is the money that matters, and we want to talk with some smart people about this topic in an upcoming series of video roundtables, which we will publish soon. So if you are interested and you know something, send me an email.
