Prying The Lid Off Black Box Switch SDKs
February 1, 2018 Timothy Prickett Morgan
It would be hard to find a business that has been more proprietary, insular, and secretive than the networking industry, and for good reasons. The sealed boxes that switch vendors sell, and that are the very backbone of the Internet, have been wickedly profitable – and in a way that neither servers nor storage have been.
There are so many control points in the networking stack that it is no wonder the hyperscalers and cloud builders have been leaning so heavily on switch ASIC vendors to open up their entire stack. The only reason they don’t build their own switch chips is that they can get the same results, as a group, by putting the pressure on incumbents, like Cisco Systems and Broadcom, and fostering upstarts, like Mellanox Technologies, Barefoot Networks, and Innovium.
Over the past decade, with the rise of the hyperscalers and cloud builders, this has been changing, with companies like Facebook calling for busting up switches into disaggregated hardware and software parts so they can not only acquire these components and mix and match them as they see fit, but equally importantly so they can understand their full stack and therefore provide a higher quality of service to their customers. Because these companies drive so much volume, and at the bleeding edge, there is fierce competition to win the business, despite the fact that we suspect when Google, Amazon, Facebook, Microsoft, Baidu, Alibaba, Tencent, and China Mobile buy the components that go into their switches, there is not much profit left over. (That is certainly the case in servers and storage.) The reason is that a design win from one or more of the Super 8 is exactly like winning a competitive bid to build a next-generation supercomputer from one of the national labs. The big HPC centers influence computing, storage, and networking technology, and by setting the pace they get the best price and the earliest access, and so do the hyperscalers and cloud builders. This is ultimately to everyone else’s benefit when these technologies trickle down, but you can see who always has the edge.
It is with this backdrop that we ponder the forces that must have been at work and that will be unleashed on the networking business as Broadcom, the dominant maker of ASICs for datacenter switches for more than a decade now, is open sourcing the software development kit for its “Tomahawk” family of switches.
To be fair, all of the major switch ASIC vendors already provide the source code to their SDKs to the vendors of switches as well as to their biggest customers, and Eli Karpilovski, director of product marketing for switch software, SDN, and cloud solutions at Broadcom, is perfectly honest with The Next Platform about this. But that is a very different thing from making the SDK open source and therefore open to change through a community of networking peers.
It is tough to let go. But this is what the industry wants, and now that Broadcom has done it, others will follow suit.
While this may not sound like a big deal, the switch SDK is the penultimate control point to fall, the last being actually open sourcing the VHDL programs that describe how to etch the actual chips. We don’t think that HPC, hyperscale, cloud, or high-end enterprise customers want to design their own chips, but they certainly want the providers of their open source switch operating systems and in some cases their own software engineering teams to have full access to the SDK that in turn gives full access to the underlying hardware. The SDKs have been the most important black box, and now Broadcom is getting out in front of its peers and ripping the top off.
Or, more precisely, it has come up with an entirely new, yet backwards compatible, SDK that has an architecture that best fits the way companies – particularly the hyperscalers and cloud builders that represent somewhere north of 50 percent of the datacenter switching ports sold each year these days – want to use switch SDKs going forward. The difference is analogous to creating a Linux kernel and opening it up with the hope of building a community around it rather than just dropping a code base for an existing Solaris, HP-UX, or AIX variant of Unix.
The SDKLT that Broadcom has come up with takes a different approach from the previous SDK, which has roots that extend back nearly two decades now and which will still be supported.
“We believe that a new approach needs to be taken,” says Karpilovski. “Twenty years ago, we had very basic switches and the way to program it was through direct register programming. These were very basic switches in the entire industry at the time, and it was very proprietary in how you programmed the registers. The switches started to get more features, and the vendors took it upon themselves to develop a set of APIs with the expected network behavior, such as set VLAN. But the reality is that if you look at fixed, known functions today, you are looking at more devices and more functions and more protocols, and any time you add more, you are adding more APIs. Now, you are talking about having thousands of APIs in the SDK, and it is not a consistent way in how all APIs are handled.”
So Broadcom, with input from the hyperscalers and cloud builders, decided to wipe the slate clean and take a logical table approach, where complex functions can be assembled from tens of APIs, very roughly analogous to the way RISC processors can emulate the complex functions of CISC processors by ganging up a bunch of less complex functions. And as with RISC machines, the logical table API approach allows for more work to get done by the switch ASIC than is possible using the old SDK. How much of a performance difference this makes remains to be seen. But in early tests with Tomahawk chips, Broadcom is seeing a 6X improvement in packet I/O performance.
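To make the logical table idea a little more concrete, here is a minimal sketch in Python of how a handful of generic table operations can stand in for a long list of function-specific calls. This is our own illustration and not Broadcom's code: the LogicalTable class, the EXAMPLE_VLAN and EXAMPLE_L2_FWD table names, and every field name are hypothetical and are not taken from SDKLT.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalTable:
    """Hypothetical logical table: named key fields plus named data fields."""
    name: str
    key_fields: tuple
    data_fields: tuple
    entries: dict = field(default_factory=dict)

    def _key(self, fields):
        # Build the lookup key from the key fields, in declared order.
        missing = [k for k in self.key_fields if k not in fields]
        if missing:
            raise KeyError(f"{self.name}: missing key field(s) {missing}")
        return tuple(fields[k] for k in self.key_fields)

    def _check(self, fields):
        # Reject any field the table does not declare.
        unknown = set(fields) - set(self.key_fields) - set(self.data_fields)
        if unknown:
            raise ValueError(f"{self.name}: unknown field(s) {sorted(unknown)}")

    def insert(self, **fields):
        self._check(fields)
        key = self._key(fields)
        if key in self.entries:
            raise KeyError(f"{self.name}: entry {key} already exists")
        self.entries[key] = {f: v for f, v in fields.items()
                             if f in self.data_fields}

    def update(self, **fields):
        self._check(fields)
        entry = self.entries[self._key(fields)]  # KeyError if absent
        entry.update({f: v for f, v in fields.items()
                      if f in self.data_fields})

    def lookup(self, **keys):
        # Return a copy so callers cannot change table state behind its back.
        return dict(self.entries[self._key(keys)])

    def delete(self, **keys):
        del self.entries[self._key(keys)]


# Hypothetical tables. The same four verbs serve both of them.
vlan = LogicalTable("EXAMPLE_VLAN", ("vlan_id",),
                    ("member_ports", "untagged_ports"))
l2 = LogicalTable("EXAMPLE_L2_FWD", ("vlan_id", "mac"), ("dest_port",))

# What an older API might expose as a dedicated "set VLAN" call becomes an
# insert and an update on the VLAN table.
vlan.insert(vlan_id=100, member_ports=(1, 2, 3))
vlan.update(vlan_id=100, untagged_ports=(3,))

# A forwarding entry uses exactly the same verbs on a different table.
l2.insert(vlan_id=100, mac="00:11:22:33:44:55", dest_port=2)
```

The point of the sketch is that adding a new feature means declaring another table rather than adding another family of calls, which is one way to keep the API count in the tens rather than the thousands.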
The most important thing is that with SDKLT, there are no hidden mechanisms to access or change APIs that in turn control the switch. Moreover, the APIs are designed to support the common client/server model and allow access through remote procedure calls and full device provisioning and control through the kinds of access control that are used in server stacks, including command line interfaces as well as YAML or XML interfaces.
The initial release of SDK Logical Table, or SDKLT for short, is being open sourced under an Apache 2.0 license and is tuned specifically for Broadcom’s Tomahawk line, which is aimed at the hyperscalers and cloud builders and importantly includes the 100 Gb/sec Tomahawk-2 line from October 2016 and the just-announced 200 Gb/sec and 400 Gb/sec Tomahawk-3 that will be shipping later this year.
SDKLT can, in theory, be ported to the “Trident” and “Jericho” switch ASICs that Broadcom sells, but thus far, Karpilovski says that the company is not committing to doing so. It will depend on whether or not enterprises, which by and large use Trident chips, and service providers, which largely use the Jericho chips, want to have such low level access. We think they will, and certainly the switch makers who create gear based on one or more of these ASICs from Broadcom will want a consistent SDK. They may even want to port the logical table approach of SDKLT to other switch chips. Time will tell.