What layer of software is ultimately going to be in control of the fabric of compute, storage, and networking that organizations all over the world have been gradually building? Will it be OpenStack, which even server virtualization juggernaut VMware has adopted? Or will it be the application schedulers like Mesos, which comes out of the AMPLab at the University of California at Berkeley, or Kubernetes, which is inspired by the Borg and Omega job schedulers from Google?
The answers to these questions are not as obvious as we might expect, but they are interesting to contemplate as OpenStack gains traction in the datacenter and Mesos and Kubernetes, while in their relative infancy, draw attention to the higher layers of the software stack. This shifts the conversation in the datacenter away from the virtualization and orchestration of infrastructure to the co-mingling of software on server nodes in the cluster using containers, both to drive up utilization and to radically simplify the process of deploying and maintaining applications.
It would be natural enough to think that OpenStack would end up being the center of gravity for all control functions, at least for those parts of the datacenter where open source software is preferred. Microsoft and VMware have very large installed bases of virtualized servers and have their own stacks of management tools – System Center with Azure Pack plug-ins for Microsoft and vSphere and vCloud extensions for VMware – to create and manage clouds. And they are both working to adopt Docker container formats on top of this virtualized infrastructure to compete with the combination of Linux, Kubernetes (which is a scheduler and container management system open sourced by Google), and Docker containers.
When OpenStack was formed, software containers had been around for a long time but had faded for a time as virtual machines and the server consolidation they enabled took the datacenter by storm. When NASA and Rackspace Hosting launched the OpenStack project five years ago, the lofty goal was to create a central controller for massive-scale clouds that could span over 1 million server nodes and up to 60 million virtual machines. This would have been a massive undertaking, and one that would have been overkill. While scale is an important issue today for clusters, many companies want bare metal machines as much as they want virtual machines, the former better suiting some workloads. Practical concerns from industry and end user companies have shifted the OpenStack project’s priorities, and OpenStack has become a kind of center of gravity for orchestration of virtualized compute, storage, and networks – a much more difficult task in some ways than the original plan.
With the advent of software containers, the need for OpenStack has not necessarily abated, but the Mesos and Kubernetes schedulers and the idea of containerized application development and deployment have come to the forefront, and this naturally raises the question of which kind of controller will ultimately be in charge of the application clusters of the future.
“There are two camps that are openly fighting for world domination,” Boris Renski, co-founder and chief marketing officer at OpenStack distributor Mirantis, tells The Next Platform. “There is overlap between the things that OpenStack does and things that Mesos does, but it is fairly small. The Kubernetes folks and the Docker clique will make the claim that the problem that they are solving is ultimately the most important and hardest problem, and all of the other problems are supplementary and they will evolve their stacks over time to do things that OpenStack does. The OpenStack clique is claiming the exact opposite thing, and say that containers have been around for a long time and that container orchestration is technically an easier problem and that OpenStack is solving the orchestration of the physical infrastructure and the virtual machines and that we will just add in Kubernetes and add container orchestration into our stack.”
This integration between the two worlds is being done through an effort called Magnum, which we told you about back in May and which will provide hooks between OpenStack and Google’s Kubernetes, Docker Swarm, CoreOS Tectonic, and perhaps other container management systems as they arise. Magnum just started in November 2014 and its code is not ready for primetime yet, so those who want to integrate OpenStack with either Mesos (which allows Kubernetes to run atop it, to make the situation even more nested) or Kubernetes for now run the two side by side, somewhat manually. They use OpenStack to provision bare metal or virtual machines and then expose these to their container management system. Exactly how the compute capacity is orchestrated is not yet clear.
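The side-by-side arrangement described above can be pictured with a toy model. This is only a sketch of the division of labor, not Magnum or any real OpenStack, Mesos, or Kubernetes API: the `provision` function, the `ContainerScheduler` class, and the first-fit placement rule are all hypothetical stand-ins invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    kind: str        # "bare_metal" or "vm"
    free_cpus: int


def provision(count: int, kind: str, cpus: int) -> List[Node]:
    """Hypothetical stand-in for the cloud controller step:
    hand back a batch of machines of one kind."""
    return [Node(f"{kind}-{i}", kind, cpus) for i in range(count)]


@dataclass
class ContainerScheduler:
    """Hypothetical stand-in for the container management layer.
    It only knows about the nodes that were exposed to it."""
    nodes: List[Node] = field(default_factory=list)

    def register(self, nodes: List[Node]) -> None:
        self.nodes.extend(nodes)

    def place(self, cpus: int) -> Optional[str]:
        """First-fit placement: return the chosen node name,
        or None if no registered node has enough free CPUs."""
        for node in self.nodes:
            if node.free_cpus >= cpus:
                node.free_cpus -= cpus
                return node.name
        return None


# Step 1: the infrastructure layer provisions a mixed pool.
pool = provision(2, "bare_metal", cpus=4) + provision(1, "vm", cpus=2)

# Step 2: the pool is exposed, by hand, to the container layer.
scheduler = ContainerScheduler()
scheduler.register(pool)

# Step 3: containers are placed without regard to how nodes were provisioned.
placements = [scheduler.place(cpus) for cpus in (3, 2, 2, 4)]
print(placements)
```

The point of the sketch is that the container layer never calls back into the provisioning layer. When a request cannot be placed, nothing asks for more machines, which is the kind of gap a tighter integration would presumably have to close.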
Strictly speaking, if you want to run the Mesos cluster controller, with or without Kubernetes container management on top, you don’t need a cloud controller like OpenStack, says Matt Trifiro, senior vice president in charge of marketing at Mesosphere, the commercial entity behind the Apache Mesos project that offers commercial support for the controller.
“The reality in the enterprise is that they invest in the people and the technologies and they prove them out and they have investments and inertia in this stuff,” explains Trifiro. “We have a lot of customers who are using VMware or OpenStack, and those systems are very good at provisioning machines. There is a very realistic deployment scenario where customers will use VMware, or OpenStack, or some other tool to deploy machines that become part of a DCOS cluster. What tends to happen over time is that as they move the DCOS cluster into production, they remove the unnecessary layer underneath. We see that evolution and we don’t have an axe to grind with the virtualization providers, but that is a common evolution path because they are not strictly necessary.”
To make the case, Benjamin Hindman, one of the founders of the Mesos project at the AMPLab who, along with techies from Twitter and Airbnb, helped turn it into a cluster controller with application frameworks and a sophisticated two-level scheduler, brings out the example of Apple, which is now using Mesos on the thousands of server nodes that back-end its Siri service.
“The Apple Siri team was running on VMware and ultimately moved to bare metal,” explains Hindman. “When Apple moved to bare metal with Mesos, one of the big reasons why they did it was, first, they did not need the virtual machines and, second, they got a big performance improvement. The virtualization tax that we often talk about is very real and for Apple it was on the order of 30 percent. Removing it meant Apple could run Siri jobs 30 percent faster, which is a really big deal.”
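How a 30 percent tax translates into a speedup depends on what the tax is a percentage of, and Hindman does not say how Apple measured it. As a rough back-of-the-envelope sketch, the two function names below are our own and each encodes one assumed reading of the figure:

```python
def speedup_if_tax_is_lost_throughput(tax: float) -> float:
    """Assumed reading 1: a virtualized node delivers only (1 - tax)
    of bare-metal throughput, so bare metal is 1 / (1 - tax) times faster."""
    return 1.0 / (1.0 - tax)


def time_saved_if_tax_is_extra_runtime(tax: float) -> float:
    """Assumed reading 2: a job takes (1 + tax) times as long in a VM,
    so removing the VM saves tax / (1 + tax) of the virtualized runtime."""
    return tax / (1.0 + tax)


tax = 0.30
print(f"{speedup_if_tax_is_lost_throughput(tax):.2f}x")   # 1.43x
print(f"{time_saved_if_tax_is_extra_runtime(tax):.1%}")   # 23.1%
```

Under the first reading the gain from going to bare metal is closer to 43 percent than 30 percent; under the second, jobs finish in about 23 percent less time. Either way it is, as Hindman says, a big deal.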
Mesosphere does not want to give the impression that there is some sort of contention between the Data Center Operating System, or DCOS as it calls its production-grade Mesos tools, and cloud controllers like OpenStack and VMware vCloud, even if, as the Apple case shows, virtualization for many workloads is not always necessary any more.
“We are very much collaborators with the OpenStack and VMware communities,” says Trifiro. “For instance, a lot of the management and security infrastructure that VMware provides is not something that today we are looking to provide. There are a lot of capabilities that are not part of our stack that an enterprise might want, and this is not just a legacy albatross around their neck – it is providing real business value. Virtualization will not help you better orchestrate containers, but for many enterprise customers, DCOS plus VMware to manage containers is 1+1=5.”
Hindman more or less concurs with this statement, but still seems to think that in the long run, virtualization may not be necessary. “We are complementary to virtualization, and we can run inside of virtual machines, but the evolution for a lot of folks who already have an infrastructure as a service cloud is to just run Mesos on bare metal. But if you want to run a Windows application on a Windows virtual machine atop a Linux instance, you can.”
This belief is probably a by-product of the relatively homogeneous infrastructure that Airbnb and Twitter have compared to the enterprise, according to Renski.
“There are purists, like the founders of CoreOS and Mesosphere, who come from the webscale world where they have uniform clusters and then the biggest problem is managing the containers and applications,” he explains. “If you look at the enterprise, that world is naturally heterogeneous and solving for heterogeneity is a big, complicated puzzle. So in an enterprise stack, unlike a Twitter or Google stack, you will see some VMware vSphere and Microsoft Hyper-V, storage from multiple vendors, and so on and these customers want a single interface from which they can control bare metal and virtual servers. I don’t know which camp is going to win, I don’t know if enterprise is ever going to go the purist route and their stack will become completely monolithic.”
This seems very unlikely, given how long applications live in the enterprise datacenter. But it can happen in greenfield sites, among those who are building new clouds, and even those standing up new supercomputers.
Thus, says Renski, so long as enterprises are diversifying rather than homogenizing their infrastructure, OpenStack will have a place and Kubernetes and Mesosphere will be “extremely complementary” to OpenStack.
“OpenStack is about simplifying the management of datacenter heterogeneity,” Renski continues. “The value of OpenStack today is that hundreds of vendors have written and are actively maintaining thousands of drivers for physical infrastructure, and OpenStack exposes them as a fabric of APIs and allows for the orchestration of bare metal and virtual infrastructure, whether it is compute, storage, or network. Kubernetes and Mesos are the underlying components of what people typically refer to as a platform as a service. They are great at managing and scaling applications wrapped in containers; OpenStack takes care of datacenter heterogeneity, but has a poor story when it comes to application and container management. You put Kubernetes or Mesos on top of OpenStack, and you get technical nirvana.”
CoreOS has just inked a deal with Mirantis to collaborate on the integration of the Tectonic container framework from CoreOS and the OpenStack distribution from Mirantis, which is arguably the last of the independent OpenStack suppliers after Cisco Systems snapped up Metacloud and Piston Cloud Computing in the past year. Alex Polvi, CEO at CoreOS, tells The Next Platform that the work to get Tectonic running atop OpenStack is really about meeting customers where they are, and OpenStack is one of a bunch of different virtualized infrastructure stacks that Tectonic can be deployed upon. But Polvi doesn’t mince words, either.
“Our view – and it is biased towards our products, which is why we are building them the way we are – is that we think CoreOS plus Kubernetes is a really good base layer for everything,” says Polvi. “There are ways to put Kubernetes on Mesos, and you can even put that on OpenStack, but we think the appropriate lowest level is CoreOS plus Kubernetes and you can put other applications on top of that – including Mesos, if you wanted to. I know Mesos did it the other way, but there is no technical reason you can’t do it this way. We believe that the way companies like Google and Facebook manage their infrastructure is where we will end up. If the people that built Google Borg are now building Kubernetes as the ideal version of what they built internally, then this is the one that will win out over time.”
The real fight, then, might be Mesos versus Kubernetes. As Polvi points out, Mesos has some features that Kubernetes does not yet have, but he thinks it is just a question of maturity. “Kubernetes is just a lot younger,” he says. “For the next year or two, there will be use cases where Mesos has an edge. But we believe that Kubernetes will become the Linux of the distributed datacenter. It will be just common threading that everything is built on, just like Linux is.”
It seems likely that enterprises will continue to use virtualization where they need to, perhaps with containers on top, and default back to containers on bare metal wherever they can. This will come down to money and performance, as the Apple Siri case shows. But even bare metal orchestration requires something like OpenStack in the mix. Whether Mesos will add its own, or borg parts of OpenStack itself, remains to be seen.
none of the above. AWS. people will tinker around for a few years until they discover *Ops* is the key to managing large numbers of servers…orchestration software is incidental and nearly irrelevant in comparison. soooo many people blogging in this space have never managed a large cluster on a live service 24x7x365. so obvious. all the bit players have one thing in common: a misplaced belief that open orchestration software puts them on par with AWS
Ops is hard. AWS knows this. They just have to wait until all the tinkerers throw in the towel…
HA! Yes, there is always that option, too. And Bezos & Co certainly are counting on it. But don’t forget that the Windows crowd will naturally gravitate towards Azure.