Before there were Internet-based search engines that anybody could use to look for anything, one of the toughest jobs in computing was helping people work through travel agencies to book flights, cars, and hotels when they travel. These reservation systems required their own specialized operating systems to provide high throughput and low latency, and they generally ran on the most expensive mainframe systems of their time. Travel was expensive, mainframes were good at I/O and transaction processing, so it all worked out in the end.
The deregulation of air travel in the late 1970s, which began in the United States but eventually loosened around the world, allowed airlines to expand their route coverage and more aggressively compete with each other. Labor laws and other regulations were loosened a bit to help airlines make ends meet at a time when the global economy was suffering from the aftershocks of the oil crisis. The ensuing competition lowered the cost of airfares, which meant more of us could go gallivanting and the airlines could make it up in volume. By the late 1980s, the airlines were under stress to find more travelers and cut costs, and they began to look for ways to monetize their reservation systems and eventually moved them off of mainframes and onto Unix systems. A decade later, the jump was from Unix to Linux. And today, if the moves being made by Amadeus, one of the major reservation system operators, are any guide, the jump is from bare metal or hypervisors to containers and platform clouds – in this case, Docker containers and Red Hat’s OpenShift platform cloud.
This path – mainframes to Unix to Linux to clouds – is one that many application stacks have taken in the financial, manufacturing, distribution, and public sectors over the decades. So it is not surprising that it is a similar path that Amadeus, which was formed in 1987 by Air France, Iberia Airlines, Lufthansa, and Scandinavian Airlines, has taken as it has evolved and competed against fierce rivals.
These rivals included Sabre (a spinout of American Airlines that owned Travelocity), Apollo (which was created by United Airlines), and Galileo (which was formed the same year as Amadeus by a group of European airlines). In the 1970s and 1980s, the airlines were fiercely competitive and used their reservation systems to try to best one another, and by the 1990s, with the commercialization of the Internet, the technical arms race accelerated as it became possible to compare fares and buy tickets without travel agents. To keep the revenue streams growing, those peddling services from reservation systems expanded to hotel room and car rentals and in some cases to train tickets. And today they are front ended by myriad web sites that help them push product and that also continually hammer their backend systems.
As we said, Amadeus was founded as a mainframe shop, and Unix was not really an option at the time, according to Dietmar Fauser, vice president of architecture, quality, and governance at Amadeus IT Group. While Sun Microsystems and Hewlett-Packard were selling commercial Unix machines, those systems did not yet have the scale of IBM and Unisys mainframes and companies were only just then starting to experiment with distributed computing as we know it.
Fauser says that IBM helped at the very beginning by letting Amadeus run its operations inside IBM’s facilities, but since then Amadeus has run inside its own facilities in Erding, Germany, just outside of Munich. The company has a single datacenter plus a backup on the other side of town but does not, as yet, distribute its processing around the globe or on public clouds. Fauser says that Amadeus anticipates that it will eventually need to provide more processing locally, and is rejiggering its infrastructure so it can be easily replicated if necessary in remote datacenters or on the cloud.
The core of the Amadeus platform ran on IBM’s Transaction Processing Facility (TPF), a specialized operating system expressly created by Big Blue for online reservation systems but also used by big financial institutions like MasterCard and VISA. The initial price quotation systems for Amadeus were based on Unisys mainframes, which came from Air France, and the core reservation system was provided by a spinout of Eastern Airlines (now long defunct) and was called System One.
“We were created as a mainframe shop with essentially IBM technology. Most of these systems were decommissioned as the applications moved to Unix,” says Fauser. Amadeus was a big Hewlett-Packard HP-UX and Sun Microsystems Solaris shop for a long time – and then Unix was shifted to Linux. Amadeus uses Red Hat Enterprise Linux for databases and SUSE Linux Enterprise Server for application servers. As more applications move into OpenShift, they will shift to RHEL from SLES. At the moment, Amadeus is planning to use just the straight RHEL license and is not using the Atomic Host distribution of Linux tailored specifically for Docker containers.
“For the core systems, we started pretty early, in 1998, to build the first bits and pieces on Unix-based systems,” recalls Fauser. “Pretty quickly we created our own enterprise service bus and our own application servers, all written in C++ on Unix, and we have gradually built significant know-how in large distributed systems. By large, I mean something like our seat availability and pricing application, which is running on over 55,000 cores.”
The reason why this application is so large – on the scale of a supercomputer or a chunk of a datacenter at Facebook or Google – is that online travel agencies like Priceline.com, Kayak, and so forth are constantly hammering their systems.
“A single query multiplies enormously internally into very many potential solutions over a date range over a geographical area with many airports,” explains Fauser. “This is why there is such extreme transactional load on these machines, and why we understood pretty early on that we needed to go to distributed systems. We very early introduced advanced in-memory caching like NoSQL and Memcached to shield the databases from these extreme volumes. We have gradually built deep know-how about distributed Linux environments and have gradually moved mainframe features onto those open systems. We have our last mainframe applications – residual TPF stuff, passenger name records – still running and expect to shut them down by the end of next year. It was a very long technical project to do this.”
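The fan-out and cache shielding that Fauser describes can be sketched in a few lines of Python. This is a minimal illustration, not Amadeus code: the `CountingStore` class, the airport lists, and the pricing formula are all hypothetical, and a plain dictionary stands in for an in-memory cache such as Memcached.

```python
from datetime import date, timedelta
from itertools import product


class CountingStore:
    """Hypothetical stand-in for a fares database; counts how often it is hit."""

    def __init__(self):
        self.hits = 0

    def lookup(self, origin, destination, day):
        self.hits += 1
        # Made-up deterministic price, for illustration only.
        return 100 + (len(origin) + len(destination) + day.toordinal()) % 50


def expand_query(origins, destinations, start, days):
    """Expand one traveler query into every origin/destination/date combination."""
    dates = [start + timedelta(days=i) for i in range(days)]
    return [
        (o, d, day)
        for o, d, day in product(origins, destinations, dates)
        if o != d
    ]


def search(cache, store, origins, destinations, start, days):
    """Answer a query, going to the store only for combinations not yet cached."""
    results = {}
    for key in expand_query(origins, destinations, start, days):
        if key not in cache:
            cache[key] = store.lookup(*key)  # cache miss: hit the database once
        results[key] = cache[key]
    return results


store = CountingStore()
cache = {}  # plain dict standing in for an in-memory cache
first = search(cache, store, ["CDG", "ORY"], ["JFK", "EWR", "LGA"], date(2015, 6, 1), 7)
second = search(cache, store, ["CDG", "ORY"], ["JFK", "EWR", "LGA"], date(2015, 6, 1), 7)
print(len(first), "combinations from one query;", store.hits, "database lookups for two queries")
```

In this toy setup, two departure airports, three arrival airports, and a seven-day window turn a single query into 42 lookups, and a repeat of the same query is served entirely from the cache. A real system would also need cache expiry and invalidation, which this sketch leaves out.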
Fauser was not at liberty to say how many X86 cores Amadeus has to play with in total, but it is multiple tens of thousands of cores running on many thousands of servers, he did confirm. Virtually all of the airline applications have been ported to Linux and all hotel applications are native to Linux. Pieces of the global distribution system are still on the mainframe as well, but by the end of next year, Amadeus expects to turn off its last mainframes. Amadeus takes its time with platform transitions – like a decade or so.
“We have strong Linux skills, and we have written a lot of our own infrastructure software because open source solutions were not available when we started development. Hence, the next step in our evolution is using emerging technologies like Kubernetes and OpenShift that will, to quite a great extent, simplify our operational environment and give us more capabilities to manage it in a much more homogenous way.”
The big applications at Amadeus run on bare metal, but the company is also a big user of VMware’s virtualization tools, just like the other 500,000 organizations worldwide that use ESXi to virtualize their systems. Docker containers will not necessarily displace all of the VMware virtual machines, so don’t think of this as an either-or situation. Just like Google Cloud Platform runs Docker containers atop KVM virtual machines, Amadeus can deploy OpenShift instances on top of VMware ESXi hypervisors.
“We like to have our eggs in several baskets,” says Fauser, without detailing precisely what the plan might be. A lot depends on how VMware develops its own OpenStack, Cloud Foundry, and Docker products, he says. Red Hat is getting some of the business now with OpenShift, but Amadeus is not using the Atomic Host minimalist variant of RHEL to provide Docker containers for its software.
While the core reservation systems at Amadeus are running on Linux (with a touch of mainframe and a few Solaris machines soon to be decommissioned), there is a stack of applications that are written mostly in Java and that run in a Microsoft Windows Server environment in conjunction with Oracle’s WebLogic or Red Hat’s JBoss middleware. Over time, these Java applications will be moved to Linux, containerized in Docker, and plunked onto the OpenShift platform cloud, which will be the central scheduler and controller in the Amadeus datacenter.
“We know this will take time and that it is not a simple endeavor,” Fauser explains. “We are not necessarily moving at the fastest pace because we really want to ensure the next platform is capable of hosting more complex applications.”
The airline industry is careful and cautious, and we all like it that way. The one thing Amadeus doesn’t want to do is disrupt its business, which has been growing at a compound annual growth rate of over 7 percent in the past five years, resulting in €3.42 billion in revenues in fiscal 2014 and €681 million in adjusted profits. The company spent €527 million on research and development, which is about what you would expect from a software and services company that is heavily investing in itself, and that investment is one of the reasons why Amadeus has around 40 percent market share of air bookings.
Staying On The Enterprise Service Bus
One thing that is not going to be replaced any time soon is the homegrown enterprise service bus that Amadeus created many years ago. That is because it has to be able to support a wide variety of protocols that its partner ecosystem has deployed. Believe it or not, the organization still has customers that use IBM’s Systems Network Architecture (SNA) and the X.25 packet-switched WAN protocols, just to name two.
“It is not just a typical Web 2.0 environment where everyone is just speaking HTTP,” says Fauser. In fact, Amadeus created its own protocol to run over TCP networks, which was invented ahead of HTTP/2 but shares many of its features.
That enterprise service bus, which is at the heart of the Amadeus reservation system, manages the configuration of connections between Amadeus and its customers and it will be moved to OpenShift over time; it is expected to require many hundreds of thousands of containers, all managed in pods using Kubernetes. This enterprise service bus is hosted on a cluster of 140 machines and pushes around 250,000 transactions per second on a normal day.
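For a sense of scale, the figures quoted for the service bus work out to roughly 1,800 transactions per second per machine. The back-of-the-envelope sketch below assumes the load is spread evenly across the cluster, which the article does not state; the function name is made up for illustration.

```python
def average_per_machine(total_tps, machines):
    """Average transactions per second per machine, assuming an even spread."""
    if machines <= 0:
        raise ValueError("machines must be positive")
    return total_tps / machines


# Figures quoted for the enterprise service bus on a normal day.
avg = average_per_machine(250_000, 140)
print(f"about {avg:,.0f} transactions per second per machine")
```

Peak days and uneven routing would push individual machines well above this average, so the number is a floor for capacity planning rather than a target.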
“The work that we are doing with Red Hat is to ensure that the orchestration layer with Kubernetes is capable of being plugged into our environment and that we can publish our service endpoints and make it work for existing enterprise-scale applications. A lot of the work we are doing is to adapt the Kubernetes orchestration layer with our own management and configuration capabilities.”
Amadeus will be open sourcing parts of what it is developing with Red Hat into the OpenShift community, but the elements of the service bus are so particular to Amadeus that opening up would be largely useless to others. (This is precisely the reason why the Kubernetes container scheduler is inspired by Google’s Borg cluster and container scheduler but is not literally the code that comprises Borg.)
The company was an early adopter of the services-oriented architecture (SOA) approach to application creation and has more than 5,000 microservices (they were not called that back then). Amadeus has a few large applications and then lots of smaller applications, and it makes economic sense to move the big ones first. “As for the rest, it is a matter of funding and opportunity. Sometimes we do it when we have to evolve an application and that is the right moment to move to a new computing environment. In a nutshell, for sure this will take us a small number of years.”
The other factor in deciding which applications to move first to containers running on OpenShift is which of them could be moved to a public cloud. “We want to be able to decouple from the hardware. What we push to OpenShift can run on Amazon Web Services, it can run in a co-located environment, and it can run on VMware. From a design point of view that is a very important capability because it gives us the freedom to operate.”
Amadeus has a number of broader goals as it implements Docker containers and the OpenShift platform cloud to abstract and encapsulate its software and to automate the deployment of that code. It is not just about lowering costs, however, which is what everyone always thinks it is.
The one thing that a move to OpenShift and containers will provide Amadeus is consistency across a wide variety of platforms. So a developer working on a laptop should be able to deploy those applications on public clouds or the internal OpenShift cloud seamlessly, relatively effortlessly, and consistently.
“The goals are manifold, and for many applications, the goal is higher availability, and not to forget decreased costs and flexibility,” says Fauser. “We want to, as much as possible, take humans out of incident management situations so the next platform will react faster and be capable of choosing resources in a more dynamic way than we currently do. For sure we want flexibility and a higher degree of uniform operations. There is a certain degree of divergence between the e-commerce environment and the open systems. Having a single platform means less training for operations and the people who look into incidents. It is a higher efficiency, and from that, we also expect lower costs. But lower cost is not necessarily the key driver, but it is an expected outcome. And then there is a fundamental readiness for potential future business demands, say, if we have to host parts of our computing environment closer to the source of extremely large transactional environments. We want the capability of applications that can run geographically distributed.”
Fauser is not sure how many containers Amadeus will eventually host on its cluster, but he does say that it will be a very large number given the fact that the company will not host many processes together inside of containers. “When you take applications that run on over 50,000 cores and package them up on a per-process basis inside of containers, you will very quickly end up with a very large number. We are far from this number as we speak, but it will eventually be a very large number of containers. Also, to give you an idea, the master agents that manage the deployment units on the enterprise service bus is in the range of 350,000 units now and I would expect that we will reach an equivalent number of containers in the future.”
This will make Amadeus one of the largest commercial deployments of Docker containers in the world, and one of the innovators driving its adoption in the datacenter.
It’s worth pointing out that TPF was an early adopter of “fail fast” methodologies. The cut down mainframe O/S had little in the way of fault tolerance (so would crash without taking lots of diagnostics), but could be booted back up in seconds. Pretty much the model we’re looking at today for containers & webscale architectures.