Anyone operating a network at scale almost has to, by definition, hack together their own network operating system. Having access to a software development kit, as the major proprietary and merchant switch and router chip makers offer to selected customers, certainly helps hyperscalers and cloud builders to better understand and control their datacenter networks.
But this is not quite the same thing as having an operating system that can span multiple switch or router ASICs, or better still, support converged switching and routing on a collection of ASICs that can be competitively ground against each other.
This kind of competitive pressure is one that hyperscalers and cloud builders that have created their own network operating system – Facebook, Google, Amazon, and Microsoft have all done it – enjoy and that large enterprises and perhaps more than a few HPC centers would love to have as well. Thus far, the open source network operating system efforts – including Cumulus Networks, OpenSwitch (originating at Hewlett Packard Enterprise), Open Network Linux (from Big Switch Networks), PicOS (from Pica8), OS10 (from Dell, and now including elements of OpenSwitch), MLNX-OS (from Mellanox Technologies), OcNOS (from IP Infusion), and Metaswitch (routing software from the company of the same name) – have made great strides towards creating an analog in switching to the dominance of Linux on servers, but that is admittedly still a work in progress.
But Arrcus, which launched its closed source ArcOS as it dropped out of stealth last July, is bucking that open source trend and it explicitly doesn’t want to be the Linux of networking. Rather, it wants to be the Windows Server of networking, which it thinks is a better economic model for a startup than the open source code and enterprise support model preferred by Linux distributors and those creating applications and tools for Linux.
Not everyone will agree with that assessment, particularly with IBM shelling out $34 billion to acquire Red Hat two decades after the world’s largest open source software packager and supporter went public. But Red Hat is the exception, not the rule, when it comes to creating profitable businesses based on open source software. Various projects – and we are thinking specifically of those who created Hadoop data analytics stacks, but various databases and adjunct tools for clusters come to mind – sopped up tremendous amounts of venture backing, which we think will be very hard to get back through profits, going public, an acquisition by a bigger entity, or some combination of these approaches. We like the Linux model when it comes to development, but it is hard to make money at it when the largest customers can self-support and it takes a long time for these technologies to trickle down to the masses who don’t have the technical chops to self-support and have to pay for help.
“We would like to be to the networking industry what Windows Server is to compute,” says Devesh Garg, co-founder and chief executive officer at Arrcus and a one-time executive at switch chip maker Broadcom, among other things, as we discussed when Arrcus dropped out of stealth last year with ArcOS, its cross-platform but closed source switching and routing network operating system. “That’s how we think about it because it speaks to this ubiquitous ability to be deployed while having an appropriate business model that allows us to continue to invest in the resources that innovate and bring value to the marketplace on an ongoing sustained basis.”
The Arrcus team knows what it is talking about. Keyur Patel, who is chief technology officer and who was a distinguished engineer at Cisco Systems for 14 years, and Derek Yeung, who is chief architect and who spent 25 years at Cisco, are both co-founders along with Garg, and they are among the world’s experts on various routing protocols, particularly the Border Gateway Protocol (BGP) that is favored by hyperscalers and cloud builders for their hybrid switch/routing gear.
“I can tell you, people don’t want to deploy open source in the core of their network,” Garg continues. “No one is going to deploy FRR, which is rebranded Quagga, at the core of their network. Keyur routinely gets calls from the open source FRR community trying to get help debugging their software.”
The quality of the software, says Garg, is more important than who has the ability to change it and redistribute it. So Arrcus started from scratch, making a network stack that has 64-bit memory addressing from top to bottom and in which all features and functions are multithreaded, so that it can not only run functions on merchant or custom switch/routing silicon, but also offload functions to either X86 or Arm adjunct processors.
Last July, when ArcOS was announced, Arrcus said that its network stack was ported to the “Trident-3” ASIC from Broadcom, which was announced in August 2017 and which can be used to create switches with 10 Gb/sec, 25 Gb/sec, and 100 Gb/sec ports, and the Broadcom “Jericho+” switch chips, which have deeper packet buffers and other features and which made their debut back in 2016. Rather than retrofitting ArcOS on the “Jericho-2” or “Tomahawk-2” ASICs, which have a pretty substantial installed base, Arrcus is hoping to catch the next upgrade wave and be concurrent on Broadcom’s “Tomahawk-3” ASIC, which was announced a year ago and which will be available in switches providing 200 Gb/sec or 400 Gb/sec ports later this year.
As we have previously reported, Arista Networks has adopted the Tomahawk-3 chip in its 400 Gb/sec Ethernet switches, running its own Linux-based Extensible Operating System. Juniper Networks is going with its own Penta, Q5, and ExpressPlus ASICs for its 400 Gb/sec Ethernet gear running its JunOS operating system, and Cisco has tapped merchant chip maker Innovium for its Teralynx chips to create its line of Nexus 3400 and Nexus 9000 gear running its own NX-OS operating system. The interesting bit here is that Arrcus has worked with two ODMs – Canadian firm Celestica and Taiwanese firm Edge-Core (a division of Accton Technology) – to get its ArcOS ported to the impending devices using Broadcom’s Tomahawk-3 chips so they come to market with the Big Three. This is the first time, claims Garg, that whitebox switching with an agnostic (but not open source) network operating system will be available in the market at the same time as products from Cisco, Arista, and Juniper, which account for the bulk of sales to enterprises, telecommunication companies, big service providers, and more than a few of the hyperscalers and big public cloud builders. (The ODMs have a pretty big chunk of the business among the latter two, and have for years.)
Among these customers, Broadcom still has somewhere north of 90 percent of the switch shipments in large datacenters, and hence Arrcus is focusing there. But as we pointed out last summer, the ArcOS operating system can be ported to Tofino chips from Barefoot Networks, Teralynx chips from Innovium, Spectrum chips from Mellanox, or Aries chips from Nephos (a spinoff from MediaTek), which are the switch chip upstarts that count these days – or even FPGAs from Intel and Xilinx, for that matter, which can also emulate networking devices.
“We have created something that is production worthy, proven, and has the appropriate research and development investment focus behind it to build a sustainable, long lasting network stack,” says Garg. “And the way we are going to get there is similar to the way Microsoft got there with Windows. So we are taking the predominant architectures today – X86 and Broadcom chips – and we are giving optionality value to as many of our customers based on as many different ODMs as possible. And we have designed the architecture in such a way that we can lay in Barefoot when we start getting end customers saying to us they want ArcOS ported to the Tofino chipset because they are deploying it in volume. Then we can make a rational business case to put the dollars and the energy into that, and we can do the same for Mellanox or Innovium or Nephos or for even a custom ASIC. It could be an FPGA. We are completely hardware agnostic and we built it that way from day one so you have that flexibility.”
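To make the idea of a hardware-agnostic network stack a little more concrete, here is a minimal sketch in Python of the general pattern: the routing logic is written once against an abstract driver interface, and each chip family gets its own backend behind it. This is not ArcOS code, whose internals are closed and not described here. The class names, the `program_route` method, and the `BACKENDS` registry are all hypothetical names we made up for illustration.

```python
from abc import ABC, abstractmethod


class SwitchBackend(ABC):
    """Hypothetical driver interface that each chip port would implement."""

    @abstractmethod
    def program_route(self, prefix: str, next_hop: str) -> str:
        """Install a route in hardware and return a description of the action."""


class BroadcomBackend(SwitchBackend):
    # Stand-in for a vendor SDK call; a real port would talk to the ASIC here.
    def program_route(self, prefix: str, next_hop: str) -> str:
        return f"broadcom: {prefix} -> {next_hop}"


class TofinoBackend(SwitchBackend):
    # A second chip family is added without touching the routing logic.
    def program_route(self, prefix: str, next_hop: str) -> str:
        return f"tofino: {prefix} -> {next_hop}"


# Hypothetical registry mapping a chip name to its backend class.
BACKENDS = {"broadcom": BroadcomBackend, "tofino": TofinoBackend}


class RoutingStack:
    """Chip-independent control plane: keeps a route table, delegates to a backend."""

    def __init__(self, backend: SwitchBackend) -> None:
        self.backend = backend
        self.rib: dict[str, str] = {}

    def add_route(self, prefix: str, next_hop: str) -> str:
        self.rib[prefix] = next_hop
        return self.backend.program_route(prefix, next_hop)


def build_stack(chip: str) -> RoutingStack:
    """Pick the backend for the named chip; raises KeyError for unknown chips."""
    return RoutingStack(BACKENDS[chip]())
```

In this sketch, supporting a new chip means writing one more `SwitchBackend` subclass and registering it, which is the kind of incremental porting effort Garg describes justifying with a business case per chipset.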
Given this, the questions now are as follows: Is ArcOS as good as or better than these proprietary network operating systems, and will it be more widely available than the open source alternatives outlined above? The race is on to prove it is as good or better. Assuming it can meet or beat them, then the server market is probably a good guide, with room for both a Linux-based network operating system and a proprietary and different one, based on its own kernel, like ArcOS.
What we do know for sure is that the crowbar of disaggregation in networking will continue, from the proprietary days represented by Cisco and Juniper, to the shift to merchant silicon and a Linux-based, extensible OS represented by Arista, to hardware-agnostic network operating systems that can run across a variety of silicon and give enterprises the kind of hardware flexibility that only the hyperscalers and cloud builders that write their own NOS can have today. There is room for two approaches, and given how markets don’t like just one choice, there will probably be at least two choices – but probably not more – in the long run.
ArcOS looks like solid work. I’m very glad to see a team build a clean-sheet OS from scratch. We need more OSes, especially secure ones.
What would make ArcOS even more attractive is a third-party security audit of the code.
“..clean sheet” … interestingly, they claim to be Linux based. https://www.arrcus.com/arcos/#benefits they say “Arrcus has taken a revolutionary approach to architect and deliver an independent, Linux-based network operating system, ArcOS”