Getting With The Program On Software Defined Networks

If the profit margins are under pressure among the switch and router makers of the world, their chief financial officers can probably place a lot of the blame on Nick McKeown and his several partners throughout the years. And if McKeown is right about what is happening as the network software is increasingly disaggregated from the hardware – what is called software defined networking – they will either have to adapt or be relegated to the dustbin of history.

McKeown cut his teeth after university in the late 1980s at Hewlett Packard Labs in Bristol, England, one of the hotbeds of networking in the world. As the dot-com boom was getting under way in the mid-1990s, McKeown was one of the architects of the GSR 12000 router from Cisco Systems, which helped build the backbone of the commercial Internet. Over the years, McKeown was co-founder at a number of startups while also being a professor of computer science and electrical engineering at Stanford University. The startups were sold to big names like PMC-Sierra and Cisco, and McKeown was one of the founding members of the OpenFlow project that came out of Stanford. OpenFlow was an offshoot of a PhD thesis by one of his students, Martin Casado, and soon after it was published in 2007, Casado, McKeown, and Scott Shenker took their software-defined networking ideas and founded Nicira. Server virtualization juggernaut VMware bought Nicira in 2012 for a whopping $1.26 billion to get virtual networking to underpin its virtual compute and fledgling virtual storage. A year later, not content to rest on his laurels, McKeown teamed up with Pat Bosshart and Martin Izzard to start up Barefoot Networks, which has created a fully programmable switch chip and the related, and yet independent, P4 programming language to code software-defined switch hardware.

The progression that McKeown has moved through in his career, shifting more functionality out of hardware and into software, is one that is reflective of the networking industry as a whole. One might even say it is predictive, reflecting from the future into the present.

The Next Platform had a chat with McKeown about the transformation under way in networking, and why the programmable revolution is not going to stop.

Timothy Prickett Morgan: You have seen a lot of change in the networking industry, and caused your fair share of change. How did we get here, and where are we going?

McKeown: My team at Stanford was the originator of what is called software-defined networking. Martin Casado and I were the co-founders of Nicira along with Scott Shenker, which really put forward this notion of moving the software from the hands of those who make network equipment to those who own and operate big networks, whether that was through enabling disaggregation or allowing them to write their own code or having it written for them by others. This represented quite a transformation in the industry.

But one of the things that always concerned me a little bit as we were walking down that path is that as those who own and operate big networks take more control they were going to run into this roadblock in that they are not really changing the behavior of the network unless they can change the way that packets are processed. At the end of the day, a network is just a way to transfer packets from one place to another and unless you are Google doing it one way and Amazon doing it another way, or Cisco doing it one way compared to Arista, to differentiate, then you are really not in control of the network.

So we started out about seven years ago with a collaboration between my team at Stanford and Texas Instruments to look at how you would build a domain specific processor – built for networking in a way that the GPU was for graphics and what the DSP was for signal processing and what the TPU is for machine learning – so that the user is writing programs in a high level language that is independent of the target, that are compiled down to run on that target, and that has no compromise in performance relative to a fixed function ASIC solution.

We were trying to figure out whether this was technically possible, when it might happen, and what it would take in order for it to happen. When this was attempted in the past, it failed because those programmable devices – network processing units, or NPUs – were always 10X to 100X slower than the fixed function devices.

As you are well aware, there was a tipping point in the graphics industry when it was shown you could build a programmable device that was fast enough at a low enough power at a low enough price and allowed all of the functionality to move to software. So graphics became about software, not about hardware. We speculated on what it might take to do this in networking, and for the last few years we have been trying to bring that to reality.

The result is the Tofino switch chip, which at 6.5 Tb/sec is the fastest switch in the world, and it is programmed by users, not by us, using a compiler that we provide. This is the first time that there is a domain-specific processor for networking that has the same power, performance, and cost as the fixed function network processor. I think of this as a real tipping point.

TPM: I get it that the network, or rather the hierarchies of networks, is the important bit, and increasingly so in systems. And it is increasingly becoming the most costly component, too, of a clustered system. At least until pricing for main memory and flash went nuts this year.

McKeown: The interesting thing here is not so much that there is a chip or Barefoot Networks is a semiconductor company or anything like that. The bigger trend is much bigger than Barefoot, and that is that as all of the definition of the infrastructure moves up into software – we have all heard the mantra that software is eating the world as espoused by Marc Andreessen – that is actually the big story. The fact that it gets delivered on a Barefoot chip or another one that is equally programmable is the important thing. We love our Tofino chip, but the point is that this transformation is actually irreversible at this point. I have been publicly saying that, within five years, we will look back and scratch our heads, wondering why networking was defined by a fixed function chip where the features were figured out by chip designers. Was that really how it used to be?

This is the same as happened with signal processing and graphics, and we will be wondering why it took so long. One of the reasons is due to technology. The technology to build these programmable chips is now such that there is no difference in terms of power, performance, and cost. You can take that Tofino chip, which is as programmable for networking as a GPU is for graphics and parallel compute, and lay it over the Broadcom “Tomahawk 2,” which has almost the same capacity at 6.4 Tb/sec, and they have the same power and the same area, and because they have the same area they have the same cost. We are not doing any magic – we think we are clever but we are not doing magic – it is just that we are at this point with the technology that the programmability comes for free.

Making this happen in an open way and on a level playing field, as happens in other areas, is why we have been driving this P4 programming language. There are 70 companies now that are on board as members of the P4 language organization, which is independent of Barefoot Networks and which is run as a non-profit company. It has experts from Cornell University and Princeton University, and folks from Google and Intel, who have driven this language, and the language is totally independent from Barefoot in the sense that there is nothing that is tied to the Tofino target. It is being used by Xilinx for programming FPGAs, by Netronome for programming NICs, by VMware for programming the forwarding and kernel behavior in Open vSwitch, and by others who are not public about what they are doing. P4 is for forwarding what the original C language was for programming computer systems. It is target neutral, but it is designed in a way that it exposes just enough hardware that you can write bleedingly fast programs.

One of the cool things about the relationship between the P4 language and Tofino chip is that if the program compiles and it fits on the chip, it will run at line rate. There is no mucking around with the microcode to try to make sure the paths are right. If it fits, it runs at line rate.

We have customers that have been using P4 since the end of 2015, and they write the programs and the very cool thing is, they don’t share them with us. We think this is fantastic, and this is what we set out to do. It means that they get it. The intellectual property of the features and the protocols and the new things they are doing to differentiate themselves from their competitors, they keep to themselves. This has never happened before in networking.
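As a rough illustration of the match-action style of forwarding that a language like P4 is meant to express, here is a minimal sketch written in Python rather than P4. The names `Packet`, `LpmTable`, `forward`, and `drop` are hypothetical and invented for this example; nothing here reflects the Tofino pipeline or any customer program, and a real P4 program is compiled to the target rather than interpreted like this.

```python
import ipaddress


class Packet:
    """Hypothetical parsed packet: only the header fields this sketch needs."""

    def __init__(self, dst_ip, ttl):
        self.dst_ip = dst_ip
        self.ttl = ttl
        self.egress_port = None  # None means "no port chosen" (dropped)


def forward(port):
    """Build an action that decrements TTL and sends the packet out `port`."""
    def action(pkt):
        pkt.ttl -= 1
        pkt.egress_port = port
        return pkt
    return action


def drop(pkt):
    """Action that leaves the packet without an egress port."""
    pkt.egress_port = None
    return pkt


class LpmTable:
    """Match-action table keyed on destination IP, longest prefix wins."""

    def __init__(self, default_action=drop):
        self.default_action = default_action
        self.entries = []  # list of (network, action) pairs

    def add_entry(self, prefix, action):
        self.entries.append((ipaddress.ip_network(prefix), action))
        # Keep the most specific prefixes first so the first hit is the longest match.
        self.entries.sort(key=lambda entry: entry[0].prefixlen, reverse=True)

    def apply(self, pkt):
        addr = ipaddress.ip_address(pkt.dst_ip)
        for network, action in self.entries:
            if addr in network:
                return action(pkt)
        return self.default_action(pkt)


if __name__ == "__main__":
    table = LpmTable()
    table.add_entry("10.0.0.0/8", forward(1))
    table.add_entry("10.1.0.0/16", forward(2))
    pkt = table.apply(Packet("10.1.2.3", ttl=64))
    print(pkt.egress_port, pkt.ttl)
```

The point of the sketch is only the division of labor: the table entries and actions are data and code supplied by whoever operates the network, while the lookup machinery stays generic.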

TPM: Or they could decide to share what they create and totally screw up the entire datacenter networking industry.


McKeown: True, some of them will, but that then becomes under their control and it empowers them. This is handing them the keys, and that has been my mantra for the past ten years. Those who own and operate the networks should hold the keys and should drive what happens.

We actually had an internal celebration back in 2016 when an early adopter customer came to Barefoot and told us that they had a bit of a problem and they wanted some help with debugging some P4 code, but they would not give us the code because they had created features in the software that they believed gave them such differentiation. We are not just providing programmability, but the ability to differentiate.

TPM: This is an interesting phenomenon. We talk to a lot of networking vendors, and we have seen how Broadcom, Mellanox Technologies, and others have added programmability to their switches. They are not, however, using P4. While their own methods might work well, and just as well as P4, it seems odd to me that they don’t support P4 directly given its backing. Why is this? It is like trying to fight Java with Smalltalk, which was arguably a better object oriented programming language and runtime. Now that I think on it, Java was not entirely open either because Sun Microsystems and then Oracle were afraid to let go and the community had limited input and the licensing was draconian. So maybe that is not a great example. . . .

Linux is a better example. It is like trying to fight Linux with Solaris or AIX or HP-UX. Open APIs that were compatible across many different Unixes were a great thing, but that is not the same thing as running the same open source code across any CPU, as is the case with Linux. There was no way that Linux was not going to keep getting broader and deeper, and therefore it was unstoppable.

McKeown: Absolutely. This is a classic innovator’s dilemma, which is when you have a wonderfully profitable product and control of the market, and when you are making money hand over fist, you don’t want to see that money being undercut. And so when you are shown right to your face an alternative that is open, you still can’t see it because you don’t want to see it. To their credit, we are talking about people who are super-smart, who know our industry really well, and who have a fantastic execution record. They will see it and they will come around.

If we put ourselves in their shoes for a moment, they really do have a dilemma. It is slightly different from the innovator’s dilemma in that they are past that and they see this is almost unstoppable. P4 is an open system, and Google is very public about their commitment to it, not only as the lead investor in our last round of funding, but also for this P4 runtime, which is a way of programming their networking infrastructure. They have been very open about it, and I think they are going to be more and more open about it.

The legacy, fixed function semiconductor companies are now at a point where they see that Barefoot has gained enormous mindshare in the industry, with P4 being the common, open programming language. So they have to choose: Do they get behind it and make it stronger, or do they fragment it and try to bring it down? The incumbents will usually try to bring it down, because by embracing the new technology they are giving credibility to the thing that is challenging and threatening them. And they can’t possibly afford to do that. The cynical approach, which is the way most companies operate, is to fragment it. And they will come out with something proprietary and half baked, and try to get people behind it. There is a bit of that going on, as you would expect. There is this thing called NPL that Broadcom has been promoting. Everybody is telling them: Just adopt P4 and don’t fragment it.

Now is the time to adopt P4, because that will accelerate our industry. Everything will move incredibly fast, and we will all benefit. Those who are the largest incumbents now are the ones who stand to gain the most from P4. They are in a dominant position and they must not be afraid of new technology.

Cisco used to have a name badge that said: “Don’t be afraid of new technology.” It was wonderful, because it is a classic mistake to be afraid of new technology and then get blindsided by it. Cisco has been the master of embracing new technologies, and it took them from routing to switching to systems. They even embraced SDN in the end.

TPM: It’s funny. Java is in fact an example of a server technology that the entire industry got behind almost immediately.

That worked out. I remember all of the server chip makers who had their own operating systems lining up behind Intel’s 64-bit Itanium, and that turned out to be a fiasco of sorts and did not pan out as intended. Next Gen I/O and System I/O merged to make the InfiniBand fabric, and that seemed like a wonderful idea, and everyone got behind it and it was never used as intended; it has been relegated to a high-end, low-latency interconnect fabric for compute and storage clusters. Here we are talking about Gen Z and CCIX and OpenCAPI and EMIB and it is about the same thing all over again. Maybe we will get a single coherent interconnect and protocol out of this mess someday.

How does P4 get from something that Barefoot and a bunch of academics and hyperscalers are excited about to the way that networking is done? When does everybody support P4? There are years between network ASIC development cycles, so this can’t happen instantaneously.

McKeown: That is a great question. These changes always take a while to happen because someone figures the new thing out and the incumbents are reluctant because it is challenging to the way they do things. In the switching ASIC business, there has unfortunately been a very successful lock-in strategy around closed, proprietary APIs. The upside is that this provides backwards and forwards compatibility of code. The downside is that it is locked down under very restrictive NDAs.

These things never work in the end, because they get complacent. And complacency means they get blindsided by a whole new technology. And in this case, the new technology is that you don’t need a closed API, and in fact, the API on a programmable switch is undefined and you can use an implementation of the old API if you want. You can use the API any way you want. So you are opened up, and instead of thinking of the way you use the chip being programmed from the bottom up by a datasheet, you look at it from the top down. We don’t have a datasheet for Tofino; we give them a pinout and a programming language. You don’t need any more.

TPM: The irony is that some of the switch ASIC vendors don’t even show you a datasheet or a block diagram anymore. If you can get them to tell you the SERDES count and the process node, you have won. . . .


McKeown: That’s right.

Our view of this is that if we are building something that is the CPU for networking, then we better operate the way the CPU companies do. You remember the CPU wars around RISC processors? They would proudly show you block diagrams, tell you how they are doing it, and tell you what the instruction set is and what languages to use to program it. We did the same thing. Even before we had chips in hand, we were showing block diagrams in that spirit. It should be open.

Just like the CPU does not have an API – it has an instruction set – we won’t have a single API, but you can make one to suit your needs, just like Microsoft created SAI to suit its needs or you can recreate an older, proprietary one that matches the way you want to work. Google can program a chip differently from Amazon or Alibaba or Tencent, and they can make it what they need. We work with all of them to help them figure out how to work with an existing API – that’s easy, we can do it – or create a new one.

The thing is this: As soon as it becomes a software mindset, it is amazing. These companies have way more software engineers than hardware engineers. As soon as this is all about software, their imaginations go to town, and they start thinking of new things they can put into the network – the classic thing is much more visibility and telemetry, which they have been asking for for years. And that has been a driving force behind this change. Then they start doing things that we have not really thought of, such as key/value stores in the network, or DNS caches that run at multiple terabits per second, or load balancing in the network, where something you would traditionally run on hundreds and hundreds of servers gets replaced by one chip because it naturally belongs in that place.

It is very exciting to watch, and I love it when people do things with this platform that we would have never thought of. That is what moving the network to software is all about.
