The Ever-Embiggening Humongous Document Database

Moving its eponymous NoSQL document database to the cloud and running it as a managed service has been a watershed event for MongoDB, which, like a number of its peers in the broader database market, is growing at the expense of relational databases that can’t scale as well for certain workloads. These companies are also growing at the expense of each other as they bring different approaches to storing and querying large amounts of data.

Just because the relational databases from Oracle, Microsoft, and IBM are easy targets because of their high costs of ownership, and despite the open source MySQL and Postgres databases growing up to scale pretty well for a lot of transactional workloads, that doesn’t mean that makers of alternative NoSQL databases have an easy time growing their businesses. As we have pointed out before, databases are perhaps the stickiest technology in the datacenter, given that they hold the vital information that runs the business. Companies are understandably risk averse when it comes to moving off their databases onto something new. But new applications – particularly those that have to suddenly scale, and do so cost effectively – allow for new companies like MongoDB to get their foot in the glasshouse door. And, from the looks of things, this is precisely what is happening for MongoDB, which is hosting its MongoDB.live virtual event this week from its headquarters in New York.

MongoDB is, of course, one of the innovators in the NoSQL movement and like other futuristic databases, it came out of the ad serving business that has funded the commercial Internet buildout. Specifically, MongoDB has its roots in the DoubleClick ad serving business that was founded by Dwight Merriman and Kevin O’Connor way back in 1995, with Kevin Ryan becoming a co-founder shortly thereafter, to create a centralized way of serving up banner ads across the Internet. DoubleClick went public in 1998, was acquired by private equity firm Hellman & Friedman for $1.1 billion in 2005, and sold to Google in 2007 for $3.1 billion. It is the heart of Google’s ad serving business, even to this day.

Eliot Horowitz was a software developer in the research and development arm at DoubleClick for several years when he came up with the idea for a “humongous” database that could scale well on DoubleClick-class workloads, and he formed a company called 10gen as chief technology officer with Ryan, who became the company’s chief financial officer, and Merriman, who became its chief executive officer. In 2009, the year of the Great Recession when so many technologies were changing, the humongous database was fired up as a platform as a service offering by 10gen, and eventually the company and the product were renamed MongoDB and the code was open sourced so companies could deploy it in their own datacenters. MongoDB raised a dozen rounds of funding for a total of $311 million, led by Union Square Ventures, Sequoia Capital, and New Enterprise Associates, and achieved unicorn status – having a valuation larger than $1 billion – well ahead of its initial public offering in late 2017, when it raised $192 million by floating some of its shares on NASDAQ.

Today, MongoDB, the company, has a market capitalization of $12.1 billion and it has been adding customers on the public clouds like crazy with its Atlas database service, which was launched in June 2016 and which brings the company full circle to where it started, as a platform as a service (PaaS) document database, when it was formed more than a decade ago. Dev Ittycheria, who was a co-founder of configuration management software maker BladeLogic back in 2001, was named CEO at MongoDB in late 2014, and with the current team in place, Horowitz stepped down as CTO in March of this year – the last of the co-founders still working in the day job – and remains an advisor to the company.

As is the case with many ambitious software companies driving distributed computing technologies with open source software, it is relatively easy to build a vast base of users who have downloaded the software for on-premises testing or even use or who are running it in the cloud in an on-demand fashion. But it is difficult to build a base of customers on enterprise subscriptions sufficient to not only pay the bills, but to turn a profit. To make things even harder, Amazon Web Services last year launched its own document database, called DocumentDB, that is compatible with MongoDB but not based on the MongoDB code. It is no surprise, then, that Google Cloud is the new best friend of MongoDB and is the fastest growing cloud for deploying the Atlas variant.

In the first quarter of fiscal 2021 ended in April of this year, MongoDB had 18,400 customers, of which 16,800 were running its Atlas service on the cloud. Back when Atlas was launched four years ago, MongoDB had somewhere around 2,000 customers, we estimate, and from its filings ahead of its IPO we can see that it was getting an average of $11,356 per customer for its MongoDB Enterprise Distribution. That number is still holding at around $10,323 per quarter for customers who are only using the MongoDB Enterprise distribution (either on a cloud or on premises), but the revenue for the average Atlas customer has grown by an order of magnitude over the past four years and was just under $4,000 per quarter per customer in the trailing twelve months. Again, there are many customers who run MongoDB Enterprise as well as use the Atlas service, and we cannot tease their behavior out of the financials that the company provides to Wall Street. Part of that relatively high number for Atlas customers could be MongoDB Enterprise use married to Atlas use. We don’t know, and when we talked to Richard Kreuter, chief customer officer at MongoDB (the company) this week, he was not at liberty to elaborate beyond the public financials.

During its fiscal third quarter of 2017, the first one for which full numbers are available, MongoDB had $26 million in revenues and posted a $19 million loss. While the number of customers running either Atlas alone or MongoDB Enterprise plus Atlas has grown (we have to infer the last one since it is not reported), the number of MongoDB customers who are not running Atlas has shrunk a bit to 1,600 in the latest quarter. But MongoDB has grown considerably over those four years, and in the first quarter of fiscal 2021, the company had $130 million in revenues and had a $977 million pile of cash and equivalents. MongoDB might be spending more money than it is making as it expands the company, but it has plenty of cash, relative to its revenue stream, in the bank with which to do this.
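For those who want to check the arithmetic, here is a minimal sketch in Python that uses only the figures quoted above. The variable names are ours, and the "quarters of revenue in the bank" ratio is simply one illustrative way to size the cash pile against the revenue stream, not a metric that MongoDB reports.

```python
# Figures quoted in the article for the first quarter of fiscal 2021.
total_customers = 18_400
atlas_customers = 16_800

# Customers not running Atlas are what is left over.
non_atlas_customers = total_customers - atlas_customers

quarterly_revenue_millions = 130
cash_millions = 977

# Illustrative ratio: how many quarters of current revenue the cash pile equals.
quarters_of_revenue_in_cash = cash_millions / quarterly_revenue_millions

print(non_atlas_customers)
print(round(quarters_of_revenue_in_cash, 1))
```

The subtraction reproduces the 1,600 non-Atlas customers cited above, and the cash on hand works out to roughly seven and a half quarters of revenue at the current run rate.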

Here are the financials for the past six fiscal years for the company:

As you can see, the company has consistently and methodically grown revenues at around 60 percent, with gross profits more or less keeping pace, and MongoDB has kept the losses from ballooning at anywhere near the rate of revenue growth until the fiscal 2020 year when it started pushing extra hard, driving up sales, marketing, research, and development costs. It takes money to make any new technology go mainstream, particularly when there are as many options in the database market as there are today.

The important thing for MongoDB – and the big threat that the DocumentDB service from AWS represents – is how subscriptions and fees for the Atlas service have grown over time. Atlas revenue was peanuts four years ago, and in the trailing twelve months, it represented 42.2 percent of subscription sales, or $186 million out of a total of $441 million. That percentage keeps inching up, even as there is direct competitive pressure from AWS with its MongoDB lookalike.
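The Atlas share of subscription sales follows directly from the two trailing twelve month figures quoted above. A quick sketch of that division, with variable names of our own choosing:

```python
# Trailing twelve month figures quoted in the article, in millions of dollars.
atlas_ttm_millions = 186
subscription_ttm_millions = 441

# Atlas as a fraction of all subscription sales.
atlas_share = atlas_ttm_millions / subscription_ttm_millions

print(f"{atlas_share:.1%}")
```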

At this point, new functions in the MongoDB document database are being introduced first in the Atlas cloud service version, with the expectation that they will eventually be pulled into the MongoDB Enterprise edition that customers buy and manage themselves. Two features that have been in beta testing since last year and that are generally available starting this week are a case in point.

The first is called Atlas Search, which is an integrated Lucene text search engine that can chew on the documents stored in the MongoDB database directly. This is a whole lot easier than moving data into a Lucene cluster (or another search engine that is external to MongoDB) to make it searchable. Another new feature debuting on the MongoDB cloud service operated by the company is called Atlas Data Lake, and it allows the MongoDB Query Language, or MQL, an SQL-like query language that is proprietary to MongoDB, to query any data that is stored in an object storage system that is compatible with the S3 protocol from AWS, just as if that data had been pulled into the document database. The company is also previewing Atlas Online Archive, which will automagically stage cold data in MongoDB to S3 storage to minimize the size of the database clusters running the hot documents. Federated queries will work across the hot and cold storage using the Atlas Data Lake feature. New features such as Auto-Scale, which, as the name suggests, automatically provisions new compute and storage for MongoDB clusters, are now also available, and so is a database schema wizard to help companies better tune their document databases.
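To give a flavor of what searching in place looks like, Atlas Search is exposed to developers as a `$search` stage in the MongoDB aggregation pipeline. The sketch below only builds such a pipeline as plain Python data. The helper name `build_search_pipeline` and the `description` field are hypothetical, and actually running the pipeline assumes an Atlas cluster with a search index defined on the collection and a driver call such as PyMongo's `collection.aggregate(pipeline)`.

```python
def build_search_pipeline(query_text, field, limit=5):
    """Build an aggregation pipeline that runs a full-text query on one field.

    Hypothetical helper for illustration; the field name is supplied by the caller.
    """
    return [
        # Full-text match against the named field using the text operator.
        {"$search": {"text": {"query": query_text, "path": field}}},
        # Keep only the top few hits.
        {"$limit": limit},
        # Return the searched field along with the relevance score.
        {"$project": {"_id": 0, field: 1, "score": {"$meta": "searchScore"}}},
    ]


pipeline = build_search_pipeline("humongous", "description")
print(pipeline)
```

The point is that the search is just another stage in a query against the same documents, with no separate search cluster to load and keep in sync.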

MongoDB Enterprise 4.4, the latest release of the packaged software stack, is also out this week, and it has a number of features, which Kreuter says were driven by MongoDB users, to improve performance. One of the neat ones is that the database now allows users to refine their database sharding keys, thus modifying the way data is distributed across the cluster as they need to change it. Hedged reads are also an interesting new feature, which allows queries to submit multiple read requests over database shards replicated on nodes and take the fastest answer instead of just submitting a request to a single node and waiting for it to answer. You can check out all the new features here.
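The idea behind hedged reads can be sketched in a few lines of Python. This is a toy simulation of the general technique, not MongoDB's implementation: the replica names and latencies are made up, and `read_from_replica` just sleeps to stand in for a network round trip.

```python
import concurrent.futures
import time


def read_from_replica(name, delay_seconds):
    """Pretend to read a document from one replica; the sleep stands in for latency."""
    time.sleep(delay_seconds)
    return {"served_by": name, "doc": {"_id": 1, "status": "ok"}}


def hedged_read(replicas):
    """Send the same read to every replica and return the first answer to arrive."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(read_from_replica, name, delay) for name, delay in replicas]
    done, _ = concurrent.futures.wait(
        futures, return_when=concurrent.futures.FIRST_COMPLETED
    )
    # Do not block on the slower replicas; their answers are simply discarded.
    pool.shutdown(wait=False)
    return next(iter(done)).result()


# Hypothetical replicas with simulated latencies in seconds.
replicas = [("node-a", 0.30), ("node-b", 0.02), ("node-c", 0.15)]
result = hedged_read(replicas)
print(result["served_by"])
```

The tradeoff is extra load on the cluster in exchange for trimming the slow tail of read latency, since one sluggish node no longer holds up the answer.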

The important thing is not to get the wrong idea. MongoDB and its customers very much believe in a hybrid world, and without them confirming this, we think there are a large number of its customers who are running both the Atlas service and the MongoDB Enterprise database either on a public cloud (even AWS) or on their own gear in their own datacenters.

“We stay very close to our on-premises customers – and that term is loose and something of a misnomer – to see what they are interested in and what capabilities they want locally,” Kreuter tells The Next Platform. “But we have found over the past few years that by launching new features in the cloud first, we get a ton of valuable feedback and it is a lot easier for people to adopt a feature in a public cloud context than it is for a traditional company running software on premises.”

But some features, like the Kubernetes operations manager integration with MongoDB, are going to start out in the on-premises offering, although Kreuter quickly adds that there is a way to make that Kubernetes operator span from on premises up into the public clouds.
