CoreOS Hyperscales Linux By Making It Invisible
February 25, 2015 Timothy Prickett Morgan
Operating systems matter and they always will matter on the systems that run the applications of the world. It would just be a whole lot better if they were essentially invisible and were self-maintaining, especially at massive scale. That, in a nutshell, is the goal of the CoreOS distribution of Linux, and it looks like this upstart operating system is starting to get traction in the enterprise.
Having a customized Linux operating system that can run at scale is not a new idea. The largest systems at the major supercomputing centers of the world have heavily customized distributions of Linux, generally based on a variant of the operating systems from SUSE Linux or Red Hat that have been tweaked and tuned to run on specific interconnects and with special libraries to accelerate number-crunching jobs.
The largest hyperscale datacenter operators, like Google and Facebook, have their own homegrown Linuxes, too. Of course, they have the resources–in terms of both capital and programmer skills–to maintain their own operating systems and also to tune them for their specific hardware and applications. The hyperscalers also have tools, also developed in house, to update and patch those Linux operating systems automatically, greatly reducing the management burden of running applications at scale.
CoreOS was founded with the idea of bringing the minimalist operating system embodied in Google’s ChromeOS for client computers and the application container system that Google developed internally for its massive server clusters together to create a simpler and more secure platform for running server workloads on distributed systems.
There are some misconceptions about what CoreOS is, and the first one is that it is based on Google’s ChromeOS. It is, in fact, its own build of a Linux operating system and it is most definitely made for server applications. While CoreOS co-founders Alex Polvi and Brandon Philips did use the build system employed by Google for ChromeOS, CoreOS is not based on that variant of Linux, which was created for client devices. CoreOS does, however, have a minimalist philosophy that is similar to ChromeOS, and it is inspired, in part, by the homegrown Linux and container system that Google long since created to run its own internal workloads at massive scale.
The basic premise behind CoreOS is simple enough. Minimalist Linux operating systems like JEOS (short for Just Enough Operating System) were created years ago to support embedded–and usually virtualized–application software, creating what was in essence a virtual appliance. This JEOS implementation made it easier to patch a particular Linux operating system running inside of a virtual machine, but in the end, each Linux instance still had to be individually patched and updated.
CoreOS is similarly a Linux that has been streamlined to support server workloads. But the difference between CoreOS and those JEOS variants of SUSE, Red Hat, and Ubuntu is that every time you log into CoreOS, you are logging into an updated operating system. Another difference is that applications are certified to a container, not the operating system, and as long as the specifications and runtime for the containers do not change, you can change the underlying parts of the operating system without disturbing those applications or inadvertently adding incompatibilities.
“It is a continuous stream of tiny updates,” explains CoreOS co-founder Alex Polvi. “It is treating an operating system like a Web application with continuous deployment. It is not treating it like a Linux server operating system, where you do updates every X months or X years, but treating it like an OS as a service.”
Mashing Up Cloud And Linux
The CoreOS project was founded in May 2013 by Polvi and Brandon Philips and has quickly gained attention in part thanks to its early adoption of the Docker container as a central piece of its operating system. CoreOS also created a lot of buzz late last year when it announced App Container, a new container format, and Rocket, a new container management tool, that will compete against Docker as well as co-exist alongside of it on the CoreOS variant of Linux and perhaps on other Linux distributions as well.
To a certain extent, Red Hat’s adoption of Docker containers and its Project Atomic streamlined Linux for running those containers are a reaction to the growing popularity of CoreOS. Ditto for Canonical’s LXD Linux Container Daemon effort, which is an effort to bring some of the features of hypervisors to LXC Linux containers.
Polvi and Philips met at Oregon State University, where they were software developers in the Open Source Lab. After college, Philips worked as a kernel developer for SUSE Linux and then moved over to Rackspace Hosting to build infrastructure to support the developers at the cloud provider. Polvi worked for three years as a project manager on the Mozilla open source browser project, then founded a company called CloudKick to manage virtual server infrastructure on clouds; CloudKick, which Rackspace acquired in 2010 for $40 million, eventually ramped to over 1,500 customers with over 1 million virtual servers under management in two years. Polvi, who is the CEO of CoreOS, ran the business unit around CloudKick at Rackspace for a while and then founded CoreOS with Philips, who is its CTO.
“When you take a cloud guy and a Linux guy, you end up with a cloudy OS,” Polvi tells The Next Platform.
Polvi and Philips got some help on the CoreOS effort, which launched in May 2013. Michael Marineau, also from the Open Source Lab at Oregon State and importantly a former site reliability engineer at Google, is one of the co-founders of CoreOS. Greg Kroah-Hartman, who has been a software engineer at IBM and Novell and who is now the second-most important person in the Linux development community after Linux founder Linus Torvalds, is also an advisor on the CoreOS project. The first CoreOS release came out in August 2013, when the company came out of stealth mode, and the first CoreOS Managed Linux release was announced in July last year.
CoreOS raised two rounds of seed funding, with the amounts undisclosed but in the range of millions of dollars, including money kicked in by Andreessen Horowitz, Sequoia Capital, and Fuel Capital, and in June last year the company raked in $8 million in Series A funding from Fuel and Sequoia with Kleiner Perkins also investing. The company now has 28 employees, drawn from the ranks of Google, Twitter, and Amazon, and is growing fast. In August last year, CoreOS used some of the venture capital it has raised to acquire Quay.io to layer some container management tools on top of CoreOS; that software is now sold as a service, called Container Registry, and as an on-premises tool, called Enterprise Registry.
Oddly enough, making a cloudy OS was not the initial impetus for the CoreOS project. Security was.
“When Brandon and I were chatting, the opportunity that we saw that was a good social mission as well as a business opportunity was around security,” Polvi says. “Our goal was to fundamentally change the security paradigm of the Internet, and our observation is that the key to good security revolves around updates, and if you can make patching simple, you can make huge step functions in the improvement in security. Server infrastructure is notoriously fragile, and the mantra in most datacenters is get it running and don’t touch it. We thought if we could make a server resilient, then we could patch a server at any time without any user interaction whatsoever.”
So how do you make a server secure? Polvi says that you have to separate the applications from their server host operating system dependencies, and that naturally leads to putting applications inside of containers. So with CoreOS, containers are the only way to run applications. CoreOS started out as an early adopter of Docker containers, but has also created its own App Container format because, among other things, Polvi and Philips believe the security model for Docker containers is not robust enough. The CoreOS team also thinks that the aspects of a system for running and managing containers should be composable–meaning that they are independent even if they are integrated–and contend that Docker is trying to build a single, monolithic platform (those are Polvi’s words from the Rocket project launch) that is not just a container format, but a system for building and running containers and a system for launching and managing containers as they run on clusters. CoreOS did a lot of this work already.
Aimed At Customers Who Don’t Want To Care About OSes
The initial customers who are adopting CoreOS, whether they are startups with a handful of machines or big tech firms with tens of thousands of machines, are focused on production Web applications, as you might expect. And as Polvi puts it, the rise of mobile computing, with people constantly wanting to access applications relating to their jobs and their lives outside of work, makes such production Web workloads relevant to every company. Polvi says that the number of steady state machines running CoreOS and using the company’s hosted update service (rather than a service that runs behind the corporate firewall) is on the order of tens of thousands. As is common with much open source software, there is no phone-home function in CoreOS, so the company cannot do accurate counting. But Polvi adds that one big customer that he has heard about has put CoreOS on more than 90,000 machines in its clusters.
“Our technology can apply up and down the scale, from three servers to 3 million machines,” Polvi says. “Not that we have a customer with 3 million machines.” Not yet, anyway. If CoreOS is on a hockey stick curve, as it most likely is, then adoption is still on the blade touching the ice and it is only now moving up to the handle.
CoreOS runs on 64-bit X86 processors at the moment, but there is no reason why the software cannot be ported to ARM, Power, or other architectures should the customer demand arise. Polvi says that CoreOS is positioned to move quickly to ARM should the various chips from AMD, Applied Micro, Broadcom, and Cavium take off in the datacenter in 2015 and 2016, as many expect or hope they will.
The thing to remember is that CoreOS is different from the Linuxes that enterprises are used to.
“As you know, IT people hate changing anything, so if Red Hat is working well for them, then they should stick with Red Hat,” Polvi says with a laugh. “The people we get are the ones that don’t want to care about the operating system anymore. If we are successful, then customers won’t need to care what kernel version they are on and they can focus higher up the stack on their applications.”
The CoreOS stack is not limited to this minimalist Linux, which offers only Docker containers as its application runtime environment (and soon App Containers, courtesy of the CoreOS development team, when they are ready for production). It also includes the companion CoreUpdate service, which automatically patches the Linux underbelly of CoreOS, and the Enterprise Registry container management software, which allows containers to be hosted and shared across a cluster of CoreOS-based machines. Interestingly, the CoreUpdate dashboard uses Google’s own “Project Omaha” automatic updating software, which the search giant created to update Windows desktops and which it open sourced, and updates are applied to one of two partitions that underpin the Docker containers. After updates are done to the Linux plumbing, the container moves from the old to the new partition and the application is none the wiser. Most importantly, CoreUpdate can be used to update the applications running inside of the containers as well as the underlying Linux and Docker infrastructure.
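The two-partition update idea can be illustrated with a small sketch. This is not CoreUpdate or Omaha code; the class name, partition labels, and version strings below are all hypothetical, chosen only to show how an update can be staged on the idle partition while the running one stays untouched.

```python
from dataclasses import dataclass, field


@dataclass
class DualPartitionHost:
    """Hypothetical host with two OS partitions; names and versions are illustrative."""
    partitions: dict = field(default_factory=lambda: {"A": "1.0", "B": "1.0"})
    active: str = "A"

    def passive(self) -> str:
        # The partition that is not currently running
        return "B" if self.active == "A" else "A"

    def stage_update(self, version: str) -> None:
        # Write the new OS image to the idle partition; the running one is untouched
        self.partitions[self.passive()] = version

    def switch(self) -> None:
        # Flip to the freshly updated partition
        self.active = self.passive()

    def running_version(self) -> str:
        return self.partitions[self.active]


host = DualPartitionHost()
host.stage_update("1.1")
before = host.running_version()  # still the old version while the update is staged
host.switch()
after = host.running_version()   # the new version once the switch is made
print(before, after)
```

In this toy model the previous image remains on the other partition after the switch, which is one reason a two-partition layout is attractive; whether and how CoreOS uses that for rollback is not covered here.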
One of the secret sauces in the CoreOS stack is called etcd and the other is called fleet. In Linux, the /etc directory contains all of the configuration information about the Linux system, and etcd is a distributed implementation that gathers all of the configuration information together from the CoreOS host machines and puts it in a centralized key-value store.
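To make the idea of a centralized key-value store for configuration concrete, here is a minimal in-memory sketch. It is a stand-in for illustration only and does not use or imitate the real etcd API; the class name, key layout, and values are assumptions.

```python
class ToyConfigStore:
    """In-memory stand-in for a centralized configuration store (not the etcd API)."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: str) -> None:
        # Store one configuration value under a path-like key
        self._data[key] = value

    def get(self, key: str, default=None):
        # Look up a single key, returning default if it is absent
        return self._data.get(key, default)

    def list_prefix(self, prefix: str) -> dict:
        # Return every entry whose key starts with prefix, in sorted key order
        return {k: self._data[k] for k in sorted(self._data) if k.startswith(prefix)}


store = ToyConfigStore()
# Hypothetical keys: each host publishes its settings under its own path
store.put("/hosts/core-1/os_version", "1.0")
store.put("/hosts/core-2/os_version", "1.1")
store.put("/services/web/port", "8080")

host_config = store.list_prefix("/hosts/")
print(host_config)
```

The path-like keys echo the way /etc organizes files on a single machine, except that here the settings of many hosts live in one place.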
This vital data, which is used to keep track of the updates to CoreOS, is generally stored in a cluster with one master and four slaves that are in turn ringed by proxies that keep all of the CoreOS host machines in a cluster updated. It is designed so the etcd function is resilient in the face of crashes. The scheduling layer on top of etcd, which places containers and applications onto the CoreOS clusters, is called fleet, and it is only one of a number of schedulers that make use of etcd. Google’s own Kubernetes Docker container management tool runs atop etcd, as it turns out. The Mesos management tool, which mimics the bare metal and container management systems inside of Google, also makes use of etcd to store configuration data. The Cloud Foundry platform cloud also uses etcd, and more than 500 projects on GitHub are making use of it in one form or another, says Polvi.
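What a scheduling layer does when it places containers onto a cluster can be sketched in a few lines. The placement rule below (send each unit to the host with the fewest units, breaking ties by host name) is an assumption made for illustration and is not a description of how fleet actually decides; the host and unit names are hypothetical.

```python
def schedule(units, hosts):
    """Assign each unit to the currently least-loaded host (illustrative rule only)."""
    placements = {host: [] for host in hosts}
    for unit in units:
        # Pick the host with the fewest units so far; ties go to the lowest name
        target = min(hosts, key=lambda h: (len(placements[h]), h))
        placements[target].append(unit)
    return placements


hosts = ["core-1", "core-2", "core-3"]
units = ["web@1", "web@2", "web@3", "db@1"]
placements = schedule(units, hosts)
print(placements)
```

In a real cluster the placement table would live in a shared store such as etcd, so that every host sees the same assignments; here it is just a dictionary returned to the caller.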
CoreOS can be deployed directly on servers or remotely booted using PXE network booting; it has been certified by CoreOS to run on the infrastructure clouds from Amazon Web Services, DigitalOcean, Google, and Rackspace, and community support is available for KVM and ESXi hypervisors and for the Eucalyptus and OpenStack cloud controllers as well as on a number of other cloud providers. CoreOS is available as a download where customers do their own support through the open source community and is also available as Managed Linux and Premium Managed Linux, where CoreOS does all the patching and progressively more hand-holding as customers pay more. Managed Linux costs from $100 per month for 10 servers to $125,000 per month for up to 100,000 servers. Premium Managed Linux costs from $2,100 per month for up to 25 servers to $210,000 per month for up to 100,000 servers, and it includes on-premises installation of CoreUpdate and Enterprise Registry in the CoreOS licenses. The Container Registry is available as a service from CoreOS and is also available as an internally run tool within datacenters as Enterprise Registry. Container Registry pricing runs from $25 per month with 10 code repositories to $200 per month with 125 repositories; Enterprise Registry is priced based on the number of containers deployed, ranging from $10 per month for up to 20 containers to $299 per month for up to 100 containers.
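The support pricing above is easier to compare on a per-server basis. The short calculation below uses only the Managed Linux and Premium Managed Linux figures quoted in this story and assumes each tier is filled to the stated server count; the variable and function names are made up for the example.

```python
# (monthly price in dollars, server count) for the smallest and largest tiers quoted above
tiers = {
    "Managed Linux": [(100, 10), (125_000, 100_000)],
    "Premium Managed Linux": [(2_100, 25), (210_000, 100_000)],
}


def per_server(monthly_price, servers):
    # Monthly cost per server, assuming the tier is used at its full server count
    return monthly_price / servers


per_server_costs = {
    name: [per_server(price, servers) for price, servers in levels]
    for name, levels in tiers.items()
}

for name, costs in per_server_costs.items():
    print(f"{name}: ${costs[0]:.2f} to ${costs[1]:.2f} per server per month")
```

Under that assumption, the per-server cost falls sharply between the smallest and largest tiers of both offerings.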
In a future story in The Next Platform, we will compare and contrast the functionality of this CoreOS stack with that of other Docker container management systems as well as virtualization management systems that support Linux. Stay tuned.
While CoreOS is aimed squarely at production Web applications running at scale–as Polvi puts it, CoreOS is in play where containers are in play, and they are starting here–that does not mean that CoreOS will not see uptake in other markets such as supercomputing or financial services.
Polvi says that CoreOS is not cutting any corners on Linux kernel updates and that by being current it can deliver more performance and reliability. But such institutions often have their own kernel developers to squeeze the maximum performance out of their machines. As for these other markets, Polvi concedes that CoreOS could be the basis of a Hadoop data analytics or MPI-based simulation cluster, and it could even be used to run heavy enterprise workloads like Oracle databases if customers wanted. The only issue is having the software certified to run inside of Docker containers on top of Linux. Up until now, Hadoop and HPC workloads, to name two, have remained on bare metal machines because of the heavy performance penalty of virtual machines and their hypervisors (and often high costs for licenses or support), but with a lightweight container like the ones embedded in Docker, the CoreOS stack could see adoption in these other areas soon.
It is hard for CoreOS, the company, to figure out how many customers are deploying its software, and as enterprises move from dabbling with a few servers that are patched directly by CoreOS to deploying the management tools internally to run on hundreds or thousands of machines, CoreOS loses visibility on precisely how many machines are running its operating system. So it will be hard to track success in this regard. But these customers will also be paying support contracts, so CoreOS will be able to measure its success directly in dollars and anecdotally in happiness.