A system is more than its central processor, and perhaps at no time in history has this been more true than right now. Except, perhaps, in the future spanning out beyond the next decade until CMOS technologies finally reach their limits. Looking ahead, all computing will be hybrid, using a mix of CPUs, GPUs, FPGAs, and other forms of ASICs that run or accelerate certain functions in applications.
Obviously, connecting these disparate, distinct, and discrete compute components together requires some kind of data interconnect – a bus, in the old lingo, but also sometimes called a link or an interconnect – so they can share data and, in many cases, offload work from the CPUs to the accelerators and consolidate answers back onto the CPUs when they are done.
IBM started opening up the bus on its Power8 and Power9 chips with the CAPI protocol riding atop PCI-Express, and then created its own “Bluelink” SERDES for running the NVLink and OpenCAPI protocols to link to Nvidia Tesla GPU accelerators and other types of accelerators or even flash storage, respectively. Nvidia added NVLink to gang up its own GPUs into a shared memory cluster of sorts, and Xilinx with a bunch of friends (notably AMD and the Arm collective) put forth the CCIX protocol as another protocol with memory-style operations to glue accelerators to processors and, in the case of several Arm server chips, glue CPUs to CPUs in NUMA fashion within a chassis. The Gen-Z memory-centric fabric is more about linking multiple nodes across racks, rows, and datacenters, but there is certainly some overlap here as well in the way it can be implemented to link elements within a single system.
Over the past several years, as these protocols all emerged and were specified by their promoters, we have been wondering if the Bus Wars from days gone by would be fought all over again. There were several of them back in the late 1980s and early 1990s, when each system maker controlled its own system bus and there were several alternatives for linking compute complexes to peripherals, including the ISA, MCA, EISA, VLB, and PCI buses. InfiniBand, as originally intended as a switched bus fabric, as well as PCI-X and PCI-Express emerged from the carnage.
There was a lot of fighting back in those days, and Jim Pappas, director of technology initiatives at Intel, remembers them all because he was in the trenches, fighting. If there had already been a universal interconnect for linking two CPUs together, none of this would have been necessary, but each processor has its own NUMA interconnect scheme and there is no changing that now. We will point out, though, that there is a universe where CXL could have, technically speaking, become a standard that vendors all implemented in future chips for both CPU NUMA and accelerator interconnects.
This seems unlikely to happen, but as we have pointed out in detailing the Compute Express Link, or CXL, protocol that Intel has put together to link processors to accelerators, there is definitely a rapidly evolving consensus with regard to processor-to-accelerator links. We talked about this quite a bit during The Next I/O Platform event we held last September in San Jose, when all of the key people behind these protocols were on stage with us and, in fact, had just that week formed an independent consortium and expanded its board of directors to include not only chairman Pappas, but Barry McAuliffe of Hewlett Packard Enterprise as president and Kurtis Bowman of Dell as secretary. Other notable board members include Nathan Kalyanasundharam of AMD, Steve Fields of IBM, and Gaurav Singh of Xilinx, who head up their respective Infinity Fabric, OpenCAPI, and CCIX initiatives. Dong Wei of Arm Holdings is also present among chip designers, as is Alex Umansky of Huawei Technologies. Facebook, Alibaba, Microsoft, and Google are also present, and what we have heard through the grapevine is that these hyperscalers and cloud builders have been leaning on Intel pretty heavily to provide something akin to CCIX and OpenCAPI and to open it up so the entire industry would get behind it – and relatively quickly at that.
Pappas tells The Next Platform that a total of 96 companies are now members, and this includes some pretty important additions such as Nvidia, Cisco Systems, Fujitsu, Inspur, Lenovo, Marvell, Supermicro, Wistron, Jabil, H3C, and Broadcom. Those are key OEMs and ODMs plus compute engine makers Marvell, Nvidia, and Fujitsu.
“People were really expecting a reprise of the Bus Wars, and they were not expecting singing around the campfire,” says Pappas. “But this has come together very well, and we don’t need all of these other initiatives to fail for CXL to succeed. This is about getting the ecosystem together to make CXL grow.”
The CXL 1.1 specification has been available since July last year, and it is for directly attached devices running over the PCI-Express 5.0 bus, which is not expected to come out in processors until either late this year or early next year. The PCI-Express 5.0 protocol was just finalized in early 2019, and the PCI-Express 6.0 specification is moving through its subreleases toward its 1.0 release, with ratification expected maybe in early 2021. While CXL is on an annual cadence, more or less, it seems likely that it will eventually slide into phase with the PCI-Express roadmap, which itself is trying to get into a reasonable and steady cadence. At seven years in the field, PCI-Express 3.0 was the top bus speed for far too long, and now that systems are going hybrid, the PCI-Express bus and all of these protocols really matter for performance. In any event, both the target and host sides of the CXL 1.1 interface have been published and companies are building to that specification now, according to Pappas. The 2.0 specification will come out in the second quarter of 2020, and the expectation is that a 3.0 specification will soon be under discussion by all the new consortium members.
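To put the bus speed point in rough numbers, here is a minimal back-of-the-envelope sketch of per-direction link bandwidth by PCI-Express generation. The per-lane transfer rates (8, 16, and 32 GT/s) and the 128b/130b line encoding are the published figures for PCI-Express 3.0 through 5.0, not numbers from Pappas. The function names are hypothetical, and the result ignores packet and protocol overhead, so real throughput for CXL or anything else riding on the link will be lower.

```python
# Rough per-direction PCI-Express bandwidth by generation.
# Transfer rates are the published per-lane raw rates in GT/s;
# function names are illustrative, not from any library.
TRANSFER_RATE_GT_S = {"3.0": 8.0, "4.0": 16.0, "5.0": 32.0}

# Generations 3.0 through 5.0 carry 128 payload bits in every 130 bits sent.
ENCODING_EFFICIENCY = 128 / 130


def lane_bandwidth_gb_s(generation):
    """Usable bandwidth of one lane, one direction, in GB/s."""
    raw_gbit_s = TRANSFER_RATE_GT_S[generation]
    return raw_gbit_s * ENCODING_EFFICIENCY / 8  # 8 bits per byte


def link_bandwidth_gb_s(generation, lanes=16):
    """Usable bandwidth of a multi-lane link, one direction, in GB/s."""
    return lane_bandwidth_gb_s(generation) * lanes


if __name__ == "__main__":
    for gen in ("3.0", "4.0", "5.0"):
        print(f"PCI-Express {gen} x16: {link_bandwidth_gb_s(gen):.1f} GB/s per direction")
```

Each generation doubles the one before it, which is why a seven year stall at PCI-Express 3.0 mattered so much for accelerator-heavy systems.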
The important things, as far as industry adoption and innovation are concerned, are that CXL rides on top of PCI-Express and that the PCI-Express roadmap is back on a proper iterative cycle after the long delay in bringing PCI-Express 4.0 into the field. Being based on PCI-Express means that system makers will have more flexibility so long as processor makers keep goosing the PCI-Express controllers they embed on their chips or, more likely, put into the I/O hubs of multichip modules that will comprise the processor socket of the future. There are coherent and non-coherent ways to use CXL as well, and this also provides flexibility because sometimes cache coherency is overkill for the job. That’s why Intel intentionally created an asymmetric coherent protocol for CXL, but does not require that it be used.
“Some customers won’t need any kind of coherent interface,” explains Pappas. “Maybe they are building cold storage devices and all that they want is as many PCI-Express lanes as they can get to attach as many SSDs as they can.”
The interesting bit to watch is how CXL and Gen-Z, which is a very different kind of interconnect beast, will interplay in system designs. The way that Pappas sees it, a CXL port, which will support a kind of memory semantics as do OpenCAPI and CCIX, potentially gives Gen-Z fabrics a universal mounting point inside of systems.