Selling hardware into the modern datacenter is no easy feat, particularly when the needs of hyperscalers and enterprises have long since diverged.
Hyperscalers and traditional enterprises share some overlapping demands, but each leans more heavily on certain aspects of a system than the other. Enterprises want a certain amount of bandwidth, but they value longevity and consistency with prior systems. Hyperscalers need the cost of bandwidth to keep falling because their bandwidth needs are always rising at a steep rate, and they need power consumption per bit sent over the network to fall as well. There are fewer hyperscalers, but they buy as much gear as the significantly larger base of enterprises.
“Hyperscalers are looking to build extremely dense, extremely high-scale AI and ML networks,” Gurudatt Shenoy, vice president of product management for the Mass-scale Infrastructure Routing Group at Cisco Systems, tells The Next Platform. “Then pretty much the rest of the base includes the full swath of datacenters. There are enterprises, telcos, media providers, and they’re setting up datacenters for all these different applications; streaming that’s increasingly going to richer formats, IoT, edge, compute, all of that. In a nutshell, the portfolio that we are bringing is intended to address this wide swath of our datacenter buildouts.”
At the center for both hyperscalers and enterprises is the need for greater energy efficiency. Hyperscalers and cloud providers – think Google, Facebook, Microsoft, and Amazon, to name a few – when planning their datacenters tend to begin with the power budget they have and then work back from there, figuring out how much an individual device should consume, Shenoy says.
Foundational to what Cisco is doing with its datacenter networking gear is its three-year-old programmable Silicon One converged switch and router ASICs, which can be optimized for particular workloads. Cisco has deployed the Silicon One chips throughout its Catalyst and Nexus families of switches and routers and now is leveraging its capabilities for switches that were introduced at this week’s OCP Global Summit.
Cisco unveiled two high-end 800 Gb/sec routers, the Nexus 9232E and Cisco 8111. Both devices come in a 1U chassis and are powered by the 7 nanometer Silicon One G100 chip. They can support 32 ports at 800 Gb/sec and can be broken down to do 64 ports at 400 Gb/sec or 256 ports at 100 Gb/sec, with even higher radix delivered in tighter spaces by using breakout cables. The Silicon One G100 has up to 25.6 Tb/sec of aggregate bandwidth.
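The arithmetic behind those breakout configurations is straightforward: the G100's 25.6 Tb/sec of aggregate bandwidth divides evenly into the port radixes the article cites. A minimal sketch, using only the figures quoted above:

```python
# Aggregate bandwidth of the Silicon One G100, per the article.
AGGREGATE_TBPS = 25.6

def port_count(port_speed_gbps: int) -> int:
    """How many ports of a given speed the aggregate bandwidth supports."""
    return int(AGGREGATE_TBPS * 1000 // port_speed_gbps)

for speed_gbps in (800, 400, 100):
    print(f"{port_count(speed_gbps)} ports at {speed_gbps} Gb/sec")
# 32 ports at 800 Gb/sec
# 64 ports at 400 Gb/sec
# 256 ports at 100 Gb/sec
```

The same division shows why the 100 Gb/sec SerDes matters: 25.6 Tb/sec over 100 Gb/sec lanes implies 256 SerDes lanes feeding the chip.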
We detailed the G100 switch ASIC back in March 2021, and explained how Cisco has one modular architecture covering both switching and routing – something that the hyperscalers and cloud builders have insisted upon as Cisco has re-entered the competition with Broadcom among these customers. Incidentally, even Broadcom has different Ethernet ASICs for different jobs, although the hyperscalers and cloud builders might push it to converge as well at some point. Broadcom has “Trident” ASICs for enterprises that need maximum protocol support, “Tomahawk” for hyperscalers focused on bandwidth and cost and needing less coverage on protocols, and “Jericho” with deep packet buffers aimed at routing and some switching.
Here are the new 800 Gb/sec Nexus switches from Cisco:
These new switches can deliver a 77 percent reduction in power and an 83 percent reduction in the space needed compared to alternatives, according to Cisco.
“It’s a singularity system,” Shenoy said. “What’s different about our 25.6T versus 25.6T from others who have announced is the 100 Gb/sec SerDes. We are the first ones shipping a 100 Gb/sec SerDes on a chip that does 25.6T and offers native 100 Gb/sec SerDes. That’s a differentiator, and once we start talking about 100 Gb/sec SerDes, it means a lot of good things. It means you have a lot more power efficiency. It means you can do 800 Gb/sec optics when the standards emerge.”
The Silicon One G100 also comes with a broad array of capabilities aimed at the AI and machine learning workloads that are key to hyperscalers and cloud providers.
Cisco also is leaning into offering systems optimized to the respective needs of both hyperscalers and enterprises. Both have identical hardware, but the Cisco 8111 is a disaggregated system aimed at organizations that want to run third-party network operating systems. The Nexus 9232E is a highly integrated switch that comes with Cisco’s Nexus OS and its Application Centric Architecture (ACI) management stack.
“The Cisco 8111 is for a very small set of customers. It’s really for the hyperscalers,” Shenoy said. “It’s the likes of Microsoft that want to put SONiC on it or the likes of Facebook that want to put FBOSS on it. It is a smattering of customers. They’re very high volume and they have deep pockets, but they’re small in number and they care about disaggregation. So we are building that portfolio to certain standards. Then the vast majority of the market – enterprises, telcos, everybody else – they want a fully integrated solution. That’s how we are looking at everything we’ll get to the market. But again, the fundamentals are the same: for differentiators you have 100 Gb/sec SerDes and efficient and sustainable platforms, but with these two different models.”
Less than ten years ago, as we were looking at starting The Next Platform, there was a rush by network vendors to create disaggregated, open devices that could run a range of network operating systems, and a large number of NOSes appeared, both proprietary ones and new ones that were open sourced. The idea was to smash the network appliance, much as the server appliance and storage appliance had been smashed decades earlier. This disaggregated networking is giving enterprises more flexibility in running switches that best fit their environments and workloads – something the hyperscalers and cloud builders demanded. Nonetheless, many enterprises have opted not to spend the money or resources to develop, run, or manage disaggregated systems, according to Thomas Scheibe, vice president of product management for cloud networking at Cisco.
“That seems to have completely shifted to interest by enterprises in getting hybrid cloud deployments going,” Scheibe tells The Next Platform. “How do we connect cloud provider instances to datacenters that these enterprise customers run? The focus has really shifted for the majority of those customers around the operations tools and how to really do software-defined networking in terms of stitching things together, basically really hybrid cloud on-ramps and off-ramps. That seems to be much more of interest to them in terms of what value they’re trying to unlock going forward. It really depends on what type of customers you are.”
At the same time, Cisco introduced two high-density QSFP-DD800 form-factor optical transceivers that the company says double the bandwidth of previous offerings and enable organizations to leverage bandwidth on new 800 Gb/sec platforms.
They also offer high-density breakouts to 400 Gb/sec and 100 Gb/sec interfaces and are backward-compatible with QSFP transceivers. Shenoy said that right now there is not a standard for a single-flow transceiver at 800 Gb/sec, but they are being developed, adding that “we are actively engaged with the standards bodies there, so as that emerges, we will support that, too.”
Market research firm Communications Researchers expects the market for 800 Gb/sec and faster transceivers to reach $245 million in revenue worldwide by 2025 and to grow by more than 10X to $2.5 billion four years later, driven by workloads such as video, 5G, and IoT applications.
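Those forecast figures imply a striking compound annual growth rate. A quick back-of-the-envelope check, using only the numbers the firm cites:

```python
# Forecast figures from the article: $245M in 2025, $2.5B four years later.
start_revenue = 245e6
end_revenue = 2.5e9
years = 4

multiple = end_revenue / start_revenue          # overall growth multiple
cagr = multiple ** (1 / years) - 1              # compound annual growth rate

print(f"{multiple:.1f}x overall, roughly {cagr:.0%} per year")
# 10.2x overall, roughly 79% per year
```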
The Cisco transceivers, which will be available in the first quarter of 2023, also will connect single-mode fiber links in the datacenter of up to 2 kilometers.