With cloud computing, HPC, and now AI driving enterprise computing, and with the technical challenges and costs of semiconductor design and manufacturing mounting, demand for chiplet architectures continues to grow.
For several years, Intel, AMD, and other system-on-a-chip (SoC) makers have been building out their chiplet capabilities, combining smaller, reusable chiplets into modular architectures that improve efficiency, flexibility, and customization. That said, these chiplet-based semiconductors use proprietary interconnects to tie the pieces together.
The Universal Chiplet Interconnect Express (UCI-Express) Consortium – driven by a collection of semiconductor firms like Intel, AMD, Qualcomm, and Taiwan Semiconductor Manufacturing Company (TSMC) and hyperscalers including Google Cloud, Meta, and Microsoft – launched in 2022. The goal was to create a standard interconnect that would enable chiplets from different vendors – created in different fabs with different functions and put into a single, workable package – to communicate, driving even more flexibility, efficiency, and customization. The consortium released the UCI-Express 1.0 spec around the same time.
Fast forward to this week, and the UCI-Express Consortium – now with more than 140 members – is rolling out version 3.0 of the open chiplet standard, with a range of enhancements around power efficiency and management and continued backward compatibility. But the headline is about performance, with the new spec supporting 48 GT/s (gigatransfers per second) and 64 GT/s data rates – up to double the 32 GT/s delivered by UCI-Express 2.0, which was rolled out a year ago.
The significant boost in performance addresses what the consortium calls an “insatiable demand” for higher bandwidth, particularly in rapidly expanding fields like AI, HPC, and data analytics, all of which contend with limited “shorelines” – the physical die-edge space available for chip-to-chip connections.
The doubled data transfer rates apply to the UCI-Express-S (2D Standard Package) and UCI-Express-A (2.5D Advanced Package) designs. Applications like AI require increasingly high throughput, all within a confined footprint.
“We are shoreline-constrained in a lot of the applications, and this is much more pronounced in the AI and HPC space, but others are not far behind,” Debendra Das Sharma, an Intel Senior Fellow and chair of the UCI-Express Consortium, told The Next Platform. “You need higher linear bandwidth density. In other words, you need to deliver more bandwidth on a given shoreline because the chip size is not going to change just because you happen to need more bandwidth. It might change for other reasons, but not for delivering chip-to-chip bandwidth. That’s the reason we are increasing the data rate from 32 Gb/sec to 48 Gb/sec and 64 Gb/sec.”
Nothing changed with the 3D design, according to Das Sharma.
“It is still a very low frequency and the reason is that with the lower bumpage, we already have a very high bandwidth,” he said. “We have hundreds of terabytes-per-second-per-square-millimeter, more bandwidth than we know what to do with, so no need on the 3D side. That can be very power-efficient. In the 2D, 2.5D, there is a demand for delivering higher bandwidth within a fixed shoreline.”
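Das Sharma's shoreline argument can be made concrete with a quick back-of-the-envelope calculation. In the sketch below, the lane count and module width are invented placeholders for illustration – they are not figures from the spec – but the underlying relationship holds: with the die edge fixed, raising the per-lane data rate is the lever that raises bandwidth per millimeter of shoreline.

```python
# Rough sketch: linear bandwidth density for a parallel chiplet link.
# Lane count and module width below are illustrative assumptions,
# not values taken from the UCI-Express specification.

def linear_bw_density(data_rate_gts: float, lanes: int, module_width_mm: float) -> float:
    """One-direction raw bandwidth (GB/s) per mm of die edge ("shoreline")."""
    total_gbytes_per_s = data_rate_gts * lanes / 8  # GT/s per lane -> GB/s total
    return total_gbytes_per_s / module_width_mm

# Hypothetical module: 64 lanes across 2.0 mm of shoreline.
# Only the data rate changes across generations; the shoreline does not.
for rate in (32, 48, 64):  # GT/s: the 2.0 rate and the two new 3.0 rates
    print(f"{rate} GT/s -> {linear_bw_density(rate, 64, 2.0):.1f} GB/s per mm")
```

Running this shows the density climbing from 128 GB/s/mm at 32 GT/s to 256 GB/s/mm at 64 GT/s for the same assumed 2 mm of edge, which is exactly the “more bandwidth on a given shoreline” effect the new data rates target.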
Just as important as boosting the data rates is keeping 3.0 backward compatible with previous versions of the spec.
“This is a critical consideration, as it ensures that existing systems and infrastructure can seamlessly integrate with the new standard,” the consortium wrote in a white paper. “The specification preserves existing sideband, valid, track, data, training, and signaling protocols, providing a smooth transition for system designers and developers while ensuring interoperability with older chiplets designed to prior generations of the specification.”
It’s also important as the reach of the standard expands. Its use in datacenter, HPC, and AI systems is well-documented, but the aim is for it to be universal.
“There are other places,” Das Sharma said. “Today, everything is chiplets. It doesn’t matter. If you look at your handheld devices or your PCs, those are all constructed as chiplets. UCIe goes across the board. Automotive is another one of the uses. We think of this as the equivalent of PCIe [which got its own speed bump in June]. It’s like PCIe as a board-level interconnect. It works for your handheld all the way to your datacenter. For example, with UCI-Express A, you might say that [would make sense for] the more higher-end chiplets, like AI. It would make use of that because my handheld ones don’t need this kind of bandwidth demand. For that reason, we have the 2D, right? You’ll see that it’s a continuum that we want. It is for the across the entire compute continuum.”
That continuum includes digital signal processors and applications like wireless infrastructure and radar systems, he said, adding that “of course, we want to be in AI and HPC and the datacenter and all of these things, but also, our goal has been for the other segments as well. That’s the way we have positioned ourselves. Those are just three swim lanes.”
There also are other enhancements in 3.0, including runtime recalibration, which reuses initialization states to allow for power-efficient link tuning during operation; support for more flexible system-in-package (SiP) topologies via an extended sideband channel that stretches up to 100 millimeters; and support for continuous transmission protocols through mappings that allow uninterrupted data flow in the spec’s Raw Mode for newer applications like connectivity between SoC and DSP chiplets.