Where Latency Is Key And Throughput Is Of Value

If hardware doesn’t scale well, either up into a more capacious shared memory system or out across a network of modestly powered nodes, there isn’t much you can do about it. But when software runs out of scale, there always seems to be a new crop of techies who take a new look at a problem and figure out a way to make the software handle the load. Usually across distributed systems, but not always.

That, in a nutshell, was the inspiration that Srini Srinivasan, co-founder and chief product officer at Aerospike, brought to bear after wrestling with Oracle relational databases in the earlier days of the commercialized Internet. In the 2000s, Srinivasan was senior director of engineering at Yahoo when the iPhone debuted with a slew of Yahoo apps on it. He was a speaker at our recent The Next Database Platform 2020 event and talked a bit about the history that impelled the creation of Aerospike, which got its start in adtech but has since expanded its homegrown key-value store into financial services, telecommunications, and other industries and use cases where low latency and high throughput are, well, key and of value.

“The iPhone launched in 2007, and at that time, I ran into all sorts of problems for workloads that required real-time access, both for reads and writes,” explains Srinivasan. “That resulted in the idea of founding Aerospike, where our main goal was essentially to build a new transactional database that was able to handle very high throughput and low latency, but also to provide high uptime for mission-critical, consumer-facing applications. And that is indeed what we have done. We have systems that have run at scale for almost ten years now, at millions of reads and writes per second with hundreds of terabytes of data.”

Aerospike, along with the hyperscalers that had created their own databases, was quick to see the value of using flash memory as a sort of extension of main memory rather than as just a fast cache for slow disk storage. The Aerospike platform was built by stripping down the key-value store and tuning it up to make use of multicore processors, main memory, and networking across a cluster to present a fast database engine. The Aerospike hybrid memory architecture stores database indexes in DRAM while data is read directly off flash. The flash allowed for what Srinivasan called a new kind of scale up for database memory, and even with only a few hundred gigabytes of DRAM, the initial Aerospike database could store terabytes of data in a single node and deliver real-time access to data for both reads and writes. The shared-nothing architecture of the database allows it to scale nearly linearly – in terms of both capacity and performance – as multiple nodes are networked together.

“I think our original decision to leverage flash was a good one, and we have seen further compression of cluster sizes with the advent of Intel’s Optane persistent memory,” says Srinivasan. “What that has allowed us to do, instead of storing a limited amount of index data in DRAM, we can now have up to 6 TB of persistent memory where we can store the index. So we extended our hybrid memory architecture from using DRAM and flash to using persistent memory and flash. We continue to make more progress on the networking with new hardware, and new kinds of storage architectures that will be hopefully arriving in the future.”

We spoke to company chief executive officer John Dillon, along with Srinivasan, about the Aerospike installed base and its business overall back in September, so we are not going to get into all of that again here. It is worth reiterating, however, that a central value proposition of the Aerospike Database is that if you have hundreds of nodes of a database or datastore supporting real-time applications – think Cassandra, Couchbase, Redis, Memcache, and kdb+, just to name a few – it only takes tens of nodes to support the same workloads on Aerospike. This is a tremendous consolidation of hardware and the removal of a large number of software licenses to boot. The savings for many customers are millions of dollars per year, and that helps make the TCO argument a whole lot easier for the Aerospike salespeople. Moreover, the performance of these databases is really only good when all of their data is stored in DRAM, says Srinivasan. With the DRAM and flash hybrid memory mix, the compression ratios are on the order of 5X to 10X compared to these other datastores, and with Optane persistent memory on the Aerospike nodes, the compression goes up by another 3X to 4X. So, if you combine the two and average it all out, customers could see a 20X compression factor using Aerospike over alternatives.
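The arithmetic behind that combined figure is just the product of the two reduction ranges, which the rough sketch below works through using the article's numbers (the variable names are our own):

```python
# Rough arithmetic behind the consolidation claims, using the figures
# cited in the article; variable names are our own.
dram_flash_factor = (5, 10)   # footprint reduction vs. all-DRAM datastores
optane_factor = (3, 4)        # further reduction from Optane persistent memory

low = dram_flash_factor[0] * optane_factor[0]    # combined low end: 15X
high = dram_flash_factor[1] * optane_factor[1]   # combined high end: 40X
# The roughly 20X figure sits toward the low end of this 15X-40X range.
```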

People will do a lot for a 50 percent or 100 percent performance shift. Being able to get rid of most of your servers and still handle the workload is almost unimaginable, and it just goes to show you how much improvement can come through software, even here in the 21st century.

Aerospike essentially creates a key-value store using a distributed hash table, which is designed to hold hundreds of billions to trillions of objects, and the idea is to be able to access any object in a fixed amount of time – generally, in under a few milliseconds in the case of Aerospike. (For many applications, predictable and consistent latency is more important than having low latency most of the time and terrible latency some of the time, which happens with a lot of databases.) Consider a fraud detection use case, such as at PayPal, which is a big Aerospike customer. To score a transaction in the fraud detection system, the application has to access hundreds to a thousand or more disparate bits of information to determine if the people doing the transaction are who they say they are, and it has somewhere between 100 milliseconds and 200 milliseconds to score the transaction, which is about the limit of human patience in the Internet era.
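The fixed-time property comes from the hash table structure itself: hashing a key costs the same regardless of how many objects the cluster holds, and the resulting partition maps straight to an owning node. A minimal sketch, with an illustrative partition count and function names that are our own, not Aerospike's internals:

```python
import hashlib

NUM_PARTITIONS = 4096  # illustrative fixed partition count, our assumption

def partition_for(key: bytes) -> int:
    # Hashing is constant-time per key, so locating any one of billions
    # of objects takes the same fixed number of steps.
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

def node_for(key: bytes, partition_map: list) -> str:
    # partition_map[i] names the node that owns partition i, so routing
    # a request is one hash plus one array lookup: no scan, no search.
    return partition_map[partition_for(key)]
```

Because every lookup costs the same, latency stays predictable under load, which is exactly the consistency the parenthetical above says matters more than a fast average.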

“The millionth customer has to get the same experience as the first customer and as the hundred millionth customer,” as Srinivasan puts it. “And that is the challenge for these applications, and that is the challenge Aerospike has fulfilled.”

To get this kind of performance with a traditional relational database would require caching on its front end, which is difficult to scale across databases and keep synchronized, and the beautiful thing is that Aerospike is basically as fast as these caches and is in fact the actual database. You can throw that whole front end and the back end from the past out the datacenter door. People have tried to make do with these approaches, and Srinivasan says that he was no different when working at Yahoo way back when – and that was because it was the only choice. But companies have lots of other choices today and all they need to do is make the right one.

With the modern Aerospike Database 5, the company is moving out from its core business of being the system of record for transactions, crossing over to the data analytics systems that sit beside them and pushing out to the edge where a certain amount of processing needs to be done. If you want to know more about that strategy, you will have to listen in to the conversation above.
