If relational databases had just worked at scale to begin with, the IT sector would be a whole lot more boring and we wouldn’t be having a conversation with Andrew Fikes, the vice president and Engineering Fellow at search engine, application, and cloud computing giant Google who has been instrumental in the creation of many of its databases and datastores since joining the company in 2001.
Luckily for us, then, the database sector is just as interesting as it has always been because the applications that drive them keep changing, getting more complex and more demanding over time. Google’s Spanner is one of the most sophisticated, flexible, and scalable databases ever created, and it has spawned a clone called CockroachDB, which is also getting traction among enterprises and which is competing against Google’s Cloud Spanner service on its public cloud. There is also a certain amount of competition with the SQL database layers atop the Hadoop stack, various NoSQL datastores, and the ever-improving, enterprise-grade relational databases with columnar stores and in-memory processing.
It is an exciting time – still – in the database world.
TPM: Many of the technologies developed by Google are focused on doing something at scale and at the same time insulating people, who are trying to process data or query it, from the complexities of the underlying data store or database. Whenever I think about all of these databases and datastores, I always want there to be just one database, and I’m sure everyone is tempted to think that there can be just one. But every time you think you have got something that can do everything, there’s a new thing it can’t do.
Andrew Fikes: Let me back up a little bit and walk you forward to how we got here. I’ve been here at Google a really long time – it’s almost 18 years now – and I’ve been able to observe over time as Google itself has grown. When you sort of look at all the systems created by Google, the first thing you have to realize is the initial set of engineers that joined the company were what I would call primarily distributed systems engineers. They have a system engineering background, not a database background.
So as we went into data storage, a lot of it was from the systems perspective, and it was always done with a particular goal in mind. When we went out to build BigTable, our first use case was to store the Web. And it was a very simple thing drawn on a whiteboard: The URL is the key, and for each one of those URLs we wanted to store document contents and we wanted to be able to annotate those document contents with things like PageRank. These were very simple things at the time, and it drove this idea of we needed something that would scale really big.
And what was interesting about it at the time was that most of those items by themselves were standalone and could fit on a single machine. You could imagine a row sitting on a machine, and the rows and the machines were relatively independent of each other. This is around the time that MapReduce showed up to process this large dataset. But over time, Google itself has changed. We’ve gone from being a search business to an ads business to a geo business to an apps business. In the context of Spanner, when we first started looking at the problem after realizing that BigTable is not enough – it was generally in the context of people trying to build apps.
There were a couple of different problems. You’re trying to solve the horizontal partitioning – being able to take parts of the row space and place them in different parts of the world. This is one of the features you see with the multi-region capabilities in Cloud Spanner today. And we also started to see that the application logic that was built on top of these databases and datastores was significantly more complex than what you can rationalize with eventual consistency. We went through this period of time where smart people were trying to help apps people build things on top of an eventually consistent system, BigTable, and realizing we needed more.
That helped us bust through a whole bunch of dogma. At that point in time, I was a huge believer in eventual consistency. I was a huge believer that transactions weren’t something that we should have in distributed systems. And when we actually tried to use these systems, we started to realize that that consistency was a power we actually needed. What’s interesting about that is that Spanner has taken us on a journey that’s pulled our distributed systems closer and closer to a database. And so we start to pull a little bit on that database culture and lore and experience.
For instance, we weren’t strong believers in SQL at Google. And now I’m very happy to have typed keys, and SQL has become sort of the lingua franca inside of Google for accessing data and we’re seeing it pervade our systems more and more. And we added SQL into Spanner, which we did with our brethren, the F1 team in ads, where we ported their entire business critical database, which was in MySQL, onto Spanner, and worked with them to build an SQL engine. You have seen us pull on that even further, right, re-examining all parts of our stack to sort of understand how to implement SQL better.
TPM: I’m dying to know why CockroachDB, which is an open source database inspired by Spanner and created by some former Googlers, should have all the fun. And I wonder why Google can’t pull a Kubernetes and just open source Spanner. It would be nice to have one version of this, not two.
Andrew Fikes: I used to share a cube with Peter Mattis and Spencer Kimball, two of the founders of Cockroach along with Ben Darnell, many years ago and it has been exciting to watch them build Cockroach. We actually talk with them quite a bit and see what they’re up to. The way I approach this particular question is this: I think it’s important for us to get to a point in this space where we have really open, compatible APIs and we are able to leverage those APIs with multiple implementations, and those implementations can be specialized for certain things. I do think data islands are not something that our customers necessarily want. They want to be able to query over multiple data sources and bring them together. I see us all fitting together kind of in that ecosystem.
TPM: Do you have to specifically do anything more than supporting SQL in Spanner to do that? Is that enough?
Andrew Fikes: When we started implementing SQL in Spanner, I came out as a systems person and I thought, well, there’s an SQL standard, right? So clearly what we should do is we should go implement the SQL standard and we will be able to communicate back and forth. And this was a clear learning moment for me. I spent hours in front of big PDF documents, scratching my head at how underspecified SQL could be, and then booting up different database engines to learn how each one behaved.
This is what gave rise internally to the SQL language that is used both in Spanner and in BigQuery. Within Google, we now have a standardized single SQL language that has common expression evaluation, common algebra, typing, and those sorts of things. So the first goal that we had was to consolidate all of the different SQL dialects in use within Google around this common language, and that’s what we’ve been able to bring to the cloud.
I think the next question in that journey is where does that language go? I think we’re seeing a number of other folks actually start to look at, you know, what does it mean to provide Postgres or MySQL compatibility on top of larger data engines. I think that’s really interesting because it does allow people to work with existing frameworks. One of the things that I go back and forth on in my head is that even if we match syntax compatibility, byte for byte in some sense, there’s a lot of the way applications interact with databases where there are hidden semantics. I think this primarily comes down to performance and latency. If you take a database that is two or three times as slow and you put a Postgres layer on top of it and it doesn’t perform in exactly the same way that the user expected, this is not good. And so I don’t think it’s quite enough to specify just the SQL. I think you’ve got to think a little bit more about those hidden semantics as well.
TPM: How close is that Google SQL to the ANSI standard – you pick one? [Laughter.]
Andrew Fikes: You know, all languages are a consequence of their environment in some sense. This one grew up because Google internally uses protocol buffers, which is a data format we’ve externalized. And so a lot of the typing and other sorts of things are made to marry well with it. You will find that the set of types more closely resembles the set of types in protocol buffers, and you see these expressed in BigQuery as well as in Spanner. But in general, we tried very hard to look at other engines, understand their behavior, be more compatible with the ones that were most in use within Google at the time – which would be MySQL for example – and really make it so that there aren’t any surprises. There are some things like more natural support for arrays, too.
TPM: What is Spanner used for internally at Google – is it everything at this point? Do you still have BigTable and other things running?
Andrew Fikes: The short answer is that Spanner is used for darn near everything. But Google is all of these separate entities, and this creates a variety of workloads. So when you look at Spanner’s user base, we have things that look more like traditional enterprise use cases. This would be the ad business, although I would say that their data scale is probably larger than most for an OLTP database. We have workloads that I would call traditional indexing workloads. So these are the big workhorses that fill whole datacenters. We have workloads that are very focused on low latency for end users, such as our G-Suite portfolio. We have things that are sort of in that high availability, criticality space, such as all of the control planes that run Google Cloud, which are using Spanner. So when you get Cloud Spanner, you get a battle-hardened, available database that Google is going to put all of its weight behind because it’s the most important thing to the company.
TPM: Now, you know as well as I do that this may not be the case in the long run, but in the short run, companies are thinking in hybrid modes. And the only way to run Cloud Spanner today is on your Google Cloud Platform public cloud. CockroachDB is not precisely the same thing as a local version of Spanner – it’s close, I guess. Are you getting any kind of pressure from customers to let Cloud Spanner run locally in their datacenters?
Andrew Fikes: There are a couple of things to consider here. One is, I haven’t looked in a while where CockroachDB is and its ability to support different workloads, but I would venture to guess that we have quite a bit of diversity, especially on the scale spectrum. Second, I do think the hybrid question is one that we’ve got to figure out. To go back to your original point – is there one database to rule them all – my guess is probably not.
TPM: But Google actually demonstrates the principle that one database could do most things. To be fair, you have the luxury of controlling your own applications stack, completely and thoroughly. You have the luxury of having I don’t even know how many PhDs and software engineers. You have the luxury of a very profitable search engine and advertising business that fuels that need and investment in Spanner.
Andrew Fikes: To me, one of the great things about bringing Spanner to cloud is that we’ve actually been able to pick up whole new types of enterprise workloads. They have, for instance, a factory floor with a database, Google has a database in the cloud, and we have to make these work together. We figure out how to get data collected on premise if that’s the right way to do it, figure out what gets stored in the cloud, and figure out how to create federated queries across those two. Then we figure out what data could be brought to the cloud to do machine learning. As for what does Google want to do, long term, in storing data on premise, to me this is still kind of an open discussion.
TPM: I think it’ll probably come to a head. I think it will take a while. But not too long. This hybrid thing is happening and my initial take on it was it will be hybrid until they see the economics of this all and grow frustrated with running their own infrastructure. In other words, like the outsourcing wave in the mainframe market in the late 1980s and the early 1990s where it just looked like for a while that everyone was going to pull the plugs and close down their datacenters. But it didn’t happen that way. People have data sovereignty issues and other issues.
The cloud seems to be swinging back harder towards hybrid; that is the sentiment that I get from talking to people – and I know that Amazon has all these customers who say they are all-in and I always feel like if you dig down they’re not all-in. Born in the cloud is different, you know they are all-in. It’s hard for me to envision certain kinds of workloads moving. I mean first of all, the cost of moving data to the cloud is zero. But the cost of moving it back is really, really steep. The computing is damn near free, but the storage and the networking are not. And you know one of the benefits of having your own datacenter is you’re moving it within your datacenter and it doesn’t cost anything but the price and the time to move the data.
Andrew Fikes: We are exploring this, right. We are trying to understand what that bridge looks like. And being a storage guy, I certainly agree with your sentiment, right. Storage is a challenge. You know there is a value to locality and understanding that locality.
There are a couple of trends to think about: how the growth in data volume has changed over time, and how that impacts what the cloud’s role is in this. Think of those traditional machine learning use cases where we are harvesting large data sets. Security is also a big thing on my mind, helping to be part of the solution to keep a good chunk of the world’s data secure. I think Google offers – and cloud offers – a really good way for us to get some consistency around practices and making sure that we’re able to handle those things. It is a world of tradeoffs.
Consider this. Why did we make a big shift from eventual consistency to strong consistency? Why did we make the shift from untyped data to typed data? Where does SQL play in all of this? These changes were made to give developers better tools. To get them back to work. I think we are going to have to figure out how to take that infrastructure discussion to on premises locations and figure out how to create these environments and get them working well.
TPM: I could not agree more. In the long run, you are going to have something that looks and smells like Spanner running on premises – and it may or may not make Spencer and Peter happy. I remember times when people tried to offer compatibility, and did it badly, and it didn’t work. I’m thinking about the Eucalyptus cloud controller, which was billed as being sort of compatible with the API stack of Amazon Web Services before OpenStack, which initially promised the same compatibility and then changed its mind. The point is, Eucalyptus could never keep up, bless them for trying.
I just think in the long run what people will really want is the exact same thing scaled down as appropriate, running in their datacenters, just like Google allowed by open sourcing Kubernetes. It’s an interesting thing to ponder.
Andrew Fikes: I do it daily on my drive to work. My head is abuzz with all sorts of things, and that is definitely one of them.
TPM: Is eventual consistency dead? Nobody really talks about Cassandra, the NoSQL database created by Facebook and inspired by BigTable, as being all that useful, at least among the people I talk to and particularly with all of these other databases on the rise.
Andrew Fikes: So the metaphor I like to use here, which I actually think is the right one and the one that I have arrived at after years of thinking about this, is that oftentimes when you write code you write it in the most straight line way possible. Your goal is to figure out how to get from A to B. And then you run that code a whole bunch of times and you figure out that a particular method or particular piece of it is worth optimizing. And that’s really where I think eventual consistency belongs, in that sort of optimization phase, right.
What I suggest to people is, you know, our goal is to start with leveraging strong consistency, transactions, and all of that power, which are relatively hard to do at the scale that Spanner brings to light. And then when you find places where it’s appropriate, where you have very specific use cases, pop off into eventual consistency. Go ahead and take the choice of that harder maintainability at that time, but treat it as a pop out instead of the default.
TPM: High availability clustering is like this. You can do asynchronous or synchronous replication, and there are times when it must be synchronous and there are times when asynchronous is fine. I would use synchronous replication locally within one datacenter or region, but when going across regions or geographies, use asynchronous. I think it’s a similar kind of scenario.
Andrew Fikes: Yes. Here is another thing. We built BigTable and we didn’t have transactions in it. There was a splitting and merging protocol in there, which all of us – and you know with all of our amazing backgrounds – probably wrote five times. And we wrote it five times because no matter how many times we handled those transactions and thought we knew what we were doing, we actually still generated bugs. We have got them all worked out. But one of the things that happened in Spanner is that by building a strong primitive, which was transactions, and then building things upon that, our ability to build more complex applications and more complex distributed systems increased. So the strong consistency in transactions can be a real multiplier on your ability to produce more interesting things.
TPM: Is there any application you want to run or have thought about running on Spanner that you can’t?
Andrew Fikes: We have this blend between OLTP and OLAP, and we are still, as you said, asking: Is there one database to rule them all? We sort of go back and forth on that. I think the pure OLAP workloads, from an efficiency point of view, do work better on things like BigQuery, which have been more purposely designed for them. You do see things like external databases out there that do take advantage of some tricks to blend the two of them. But within Google, we have seen pretty strong adoption across the board for Spanner, from the big to the small to the expensive to the not expensive to the SQL to the point queries. We have been working for quite a while filling in all of the gaps that show up in the different workloads.
TPM: Is there a scale difference in terms of capacity, performance, latency between BigTable and Spanner? If you gave them the same work, will one run faster than the other, or will they require the same resources for a given dataset size? How do they compare and contrast?
Andrew Fikes: The people that built BigTable went off to build Spanner, so there are parts of them that are very similar. BigTable has a longer track record of supporting the state indexing workloads that I talked to you about, and so it has a few more optimizations towards that large data end of the spectrum. But at Google, as Spanner use has grown internally, that’s been a focus and, you know, that gap continues to close.
TPM: So is the search engine job running on a mix of BigTable and Spanner? Or is Spanner just running the control planes for search?
Andrew Fikes: The larger search engine is a big machine, the parts of which I used to understand and no longer understand. And I think the right way to say it is, today many, many parts of that big machine run a combination of the two.