For several years, work has been underway to develop a standard interconnect that can address the increasing speeds in servers driven by the growing use of accelerators such as GPUs and field-programmable gate arrays (FPGAs), as well as the pressure that massive amounts of newly generated data puts on memory and on the bottleneck between the CPUs and that memory.
Any time the IT industry wants a standard, you can always expect at least two, and this time around is no different. Today there is a cornucopia of emerging interconnects, some of them overlapping in purpose, some working side by side, to break that tight link between the CPU and its (very possessive) memory – and one that supports a wide range of storage media types. These include such fabric technologies as CAPI, OpenCAPI, NVLink, and CCIX, all of which have high-profile champions within the tech industry, including IBM and Nvidia. Another of these efforts has been the work around Gen-Z, which is backed by a consortium launched in 2016 that now has more than 50 members, ranging from chip makers like AMD, IBM, Samsung, Broadcom, Arm, and Micron to systems makers like Hewlett Packard Enterprise, Lenovo, Cray, and Dell EMC, and storage and memory vendors including Seagate, Toshiba Memory, Western Digital, and SK Hynix.
These protocols are being developed in part to respond to the inability of the PCI-Express bus to keep pace with the data speeds that are being enabled by such technologies as accelerators and emerging storage and memory. Gen-Z holds the promise of a higher-throughput, lower-latency fabric that can drive faster connections between chips, accelerators, main memory, and fast storage, enabling more powerful servers that pack even more cores and accelerators and process more data.
During a presentation at the SC17 conference in November, Michael Krause, vice president and Fellow Engineer at HPE and one of the creators of the Gen-Z Consortium, said an impetus for developing the protocol “was the fact that the compute and memory balance was getting out of whack. We’re getting to the point where even though we can add more compute to any system, we actually can’t feed it. So we want to be able to provide very high bandwidth and very low latency and restore that balance to our compute systems.”
All that will take time. For more than a year, the consortium has released drafts of the specification to an industry anxious to get its hands on the protocol. The group has now delivered, making the Gen-Z Core Specification 1.0 publicly available on its website. Tech companies like chip makers, server and storage OEMs and ODMs, and networking companies can now begin developing products based on the standard. Chip makers and device makers will have to start supporting Gen-Z in offerings they develop throughout the year, which means it most likely won't be until 2019 that the industry begins to see a ramp-up in the number of products that include Gen-Z controllers.
That said, the facts on the ground won't change during that time, so demand for such fabric-based offerings will most likely only grow. The rise of such trends as the Internet of Things (IoT) and bring-your-own-device (BYOD) in the enterprise is driving the rapid growth of data, particularly at the network edge, and while more compute, storage, and memory capabilities are being pushed out to the edge, there is still a good amount of that data that needs to come back into the datacenter. As organizations push to analyze that data in near real-time, the demand for faster speeds in transferring and processing the data is increasing.
We at The Next Platform have taken a deep look into Gen-Z, which essentially is a fabric that brings together bus technologies and the interconnect and enables chips to become memory-agnostic, allowing them to access multiple current memory technologies like DRAM, NAND, and storage-class memory such as Intel's 3D XPoint, as well as future memory technologies. A key is breaking the interlock between the memory and the processor, in part so that innovations in the two don't have to happen in lockstep. Such a move gives systems and component makers greater flexibility in creating new designs. That will be important for vendors like HPE, which is building its massive system of the future – The Machine – which the company says is a memory-driven system (rather than one where compute is front and center) aimed at addressing the big data workloads of the future. Among the supercomputer's attributes: the initial system holds 160TB of main memory, though the company says that will grow in future versions of the system. The prototype's memory was DRAM, but HPE is looking to other offerings like memristors and 3D XPoint, tied together with a silicon photonics interconnect.
Other OEMs also are putting a greater emphasis on memory in their designs, so having a protocol that decouples the memory from the processor should help speed up innovation around these efforts.
The first spec supports speeds of up to 56 gigatransfers per second (GT/s) and reduces memory latencies to less than 100 nanoseconds. Gen-Z is designed to scale from point-to-point topologies up through the switch and into rack-scale topologies, and includes built-in security such as hardware-enforced isolation and packet authentication, with privacy coming to the architecture this year. Flexibility comes not only through support for multiple media types, but also for multiple component types, including X86, Power, and Arm chip architectures and the multiple accelerators being used in modern systems. It can also be used in single enclosures, rack-scale enclosures, and eventually clusters.
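To put that 56 GT/s figure in rough perspective, a back-of-the-envelope calculation can compare it with a familiar PCI-Express 3.0 x16 link. Note the lane count and encoding efficiency applied to the Gen-Z rate below are illustrative assumptions for the sake of the arithmetic, not figures taken from the Gen-Z Core Specification:

```python
# Back-of-the-envelope signaling-rate comparison.
# The Gen-Z lane count and encoding efficiency used here are
# illustrative assumptions, not values from the Gen-Z Core Specification.

def raw_bandwidth_gbps(gt_per_s, lanes, encoding_efficiency):
    """Approximate one-direction link bandwidth in GB/s.

    gt_per_s: transfers per second per lane, in GT/s (1 bit per transfer)
    encoding_efficiency: fraction of raw bits carrying payload
                         (e.g. 128/130 for 128b/130b encoding)
    """
    bits_per_s = gt_per_s * 1e9 * lanes * encoding_efficiency
    return bits_per_s / 8 / 1e9  # convert bits/s to GB/s

# PCI-Express 3.0: 8 GT/s per lane with 128b/130b encoding, x16 link
pcie3_x16 = raw_bandwidth_gbps(8, 16, 128 / 130)

# A hypothetical link at the spec's 56 GT/s ceiling, assuming the same
# width and a similarly efficient encoding (our assumption)
fast_x16 = raw_bandwidth_gbps(56, 16, 128 / 130)

print(f"PCIe 3.0 x16:   ~{pcie3_x16:.1f} GB/s per direction")
print(f"56 GT/s x16:    ~{fast_x16:.1f} GB/s per direction")
```

Under those assumptions the 56 GT/s rate works out to roughly seven times the raw throughput of a PCIe 3.0 x16 link, which gives a sense of why the consortium positions Gen-Z as the fabric for feeding memory-hungry accelerators.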
Whether all this is enough to enable Gen-Z to rise above the other fabric technologies is unclear. Tech companies have multiple options to choose from, and all are backed by major industry players. However, the challenges related to data, memory, accelerators, and speed continue to grow, so the drive to embrace innovative fabric technologies won't slow.