The Unlikely Marriage of Databases and Object Storage

The vast swathes of unstructured data that now reside in the cloud have changed the nature of information technology in many ways. One of the more significant is the widespread use of object storage as a repository for things like video, images, and audio. And thanks to the ubiquity of this type of storage in cloud environments, it has more recently become the object of affection for database applications.

One of the knocks against object storage is that while it’s fairly good at delivering throughput, it’s not so proficient at supplying IOPS. That basically ruled it out for high-flying database analytics, not to mention machine learning and other types of I/O-demanding applications.

Enter MinIO, a company that has developed an open-source object storage system for private clouds that offers high levels of both throughput and IOPS. (If you’re wondering about specifics, the company is not shy about publishing performance benchmarks.) And since it’s built around Amazon’s S3 API, arguably the de facto standard for cloud-based object storage, the platform allows you to move your application to pretty much any type of cloud environment without a whole lot of effort.
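To make the portability point concrete, here is a minimal sketch of why a shared API eases migration: with path-style S3 addressing, an object lives at endpoint/bucket/key, so moving between clouds mostly means swapping the endpoint. The hostname `minio.example.internal`, the bucket and key names, and the helper `object_url` are hypothetical and chosen for illustration; real requests would also need to be signed with credentials, which this sketch leaves out.

```python
from urllib.parse import quote


def object_url(endpoint: str, bucket: str, key: str) -> str:
    """Build a path-style S3 object URL: <endpoint>/<bucket>/<key>."""
    base = endpoint.rstrip("/")
    # quote() leaves "/" unescaped by default, so nested keys stay readable.
    return f"{base}/{quote(bucket)}/{quote(key)}"


# Hypothetical endpoints: a public cloud region and a private MinIO deployment.
ENDPOINTS = {
    "public": "https://s3.us-east-1.amazonaws.com",
    "private": "http://minio.example.internal:9000",
}

if __name__ == "__main__":
    for name, endpoint in ENDPOINTS.items():
        # Same bucket and key; only the endpoint differs between environments.
        print(name, object_url(endpoint, "analytics", "logs/2020/events.parquet"))
```

The application logic that names buckets and keys stays the same in both cases, which is the property the S3-compatible design is meant to provide.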

At our Next IO Platform event last month, MinIO CEO and cofounder Anand Babu (AB) Periasamy dug into why databases are migrating to object storage and his company’s role in that growing trend. Periasamy, you may remember, was the initial developer of GlusterFS, a cluster file system that was designed to bring unstructured data into a more conventional POSIX-style platform.

At some point, Periasamy concluded that POSIX was not going to be the optimal technology for the internet age, having come to realize it was ill-suited to performing file manipulations across these vast storage networks. He considered incorporating an S3 gateway into the file system as a way to offer a unified platform. Soon, though, he realized that the two models were fundamentally incompatible, since the S3 gateway added too much baggage to the file system. “You would end up building a mediocre file system and a terrible object storage system,” he explained. GlusterFS did, however, find some adherents in the cloud space, and was acquired by Red Hat in 2011.

Eventually Periasamy went back to the drawing board and decided to develop a distributed storage system built exclusively around S3. Thus was born MinIO. According to Periasamy, though, the company did not set out to capture the enterprise database market when it launched in 2014. As it turned out, that was just a fortunate confluence of events and trends.

From his perspective, object storage was just for unstructured data, while databases were about storing “mutable metadata.” And next to the enormous amounts of unstructured data in the cloud, these enterprise databases were comparatively small. “What I learned in the last three years was that it’s not actually photos and videos that’s consuming petabytes of storage in these organizations,” he told us. “It’s actually metadata at petascale.”

What he’s referring to is the ever-increasing store of financial transactions, event logs, and other types of journaling that businesses are accumulating and now analyzing. As these databases grew, their keepers discovered they did not scale well with the traditional file and block model. At that point, they began to turn to object storage, a shift that started in the public cloud. According to Periasamy, most of the analytics engines today – Snowflake, Azure ML, Power BI, SageMaker, BigQuery, and a host of others – are now tapping into object storage.

More recently, this database-object model has spilled into private clouds, said Periasamy. That’s when he noticed that a number of these engines were being positioned on top of MinIO. Apparently, the company first became aware of this with banking customers. “The ones that surprised me the most was transactional databases started looking inside object storage,” he said.

By bringing performance to the object storage table, MinIO is likely to keep accumulating users as more enterprises embrace the object storage model for their datasets. The MinIO server, client, and software development kit are available under the Apache license and can be downloaded for free. Of course, if you want additional software support services, MinIO is more than happy to sign you up for a subscription.