Datera Bets on Massive Middle Ground for Block Storage at Scale
April 12, 2016 Nicole Hemsoth
For those who wonder what kind of life is left in the market for elastic block storage beyond Ceph and the luxury all-flash high rises, Datera, which emerged from stealth today with $40 million in backing and some big name users, has a tale to tell. While it will likely not end with blocks, these do form the foundation as the company looks to reel in enterprises who need more scalable performance than they might find with Ceph but aren’t looking to the high-end flash appliances either.
The question is, what might the world do with an on-premises take on elastic block storage that it couldn’t do with the numerous options out there, including using Amazon’s cloud-based EBS?
We’ll get to the performance, scalability, integration, and other matters in a moment, but from the higher view, what kicks the story into high gear is the who’s who list of folks behind the company. Investors include Andy Bechtolsheim and Pradeep Sindhu, along with Samsung Ventures and Khosla Ventures. Advisors include Eric Baldeschwieler, co-founder of Hortonworks; VMware genius Carl Waldspurger; and Martin Buhr, Kubernetes project manager and strategic initiatives lead at Google. The Next Platform talked with Datera’s CEO, Marc Fleishmann, another known quantity in cloud and open source circles, about what shifts in the market are making room for Datera’s approach to elastic block storage on-premises and what might change in the years ahead as their strategy expands to include file and object protocols.
Fleishmann has watched for twenty years as the storage and orchestration layers have evolved. He started in the 90s at HP Labs with John Wilkes, an HP fellow who went on to help build Borg (and Omega) at Google and to flesh out Kubernetes. He was also co-founder of the Open Source Business Alliance (a European effort to push open source cloud tooling) and of the Open Cloud Initiative, efforts he led while serving as founder of RisingTide Systems, a high performance storage software company that just nicked the dawn of the “software defined everything” era.
On that note, he says that the term “software defined” now misses the real point of what’s happening. “It really should be a DevOps notion of continuous deployment, but applying that concept to storage. If you look at computing now, it’s heterogeneous but everyone is trying to make it more homogeneous (converged, containers, and so on) but the irony is, that is actually just causing more heterogeneity so now we need to increase speed and scale because the very heterogeneous world of storage is still stuck in the last century.”
There are a lot of companies in this boat, and when it comes to finding a way out, many see Ceph at the lower end and all-flash options at the top that are priced out of reach unless ultra-fast performance is absolutely critical. The price at the top is a clear issue, but Fleishmann says Ceph is also quite problematic for some potential customers they’ve talked to in terms of performance, scalability, and operations.
“If you look at performance on Ceph outside of the cases where people are using all-flash with a lot of tuning, but even then it’s getting between 6000 and 8000 IOPS per node. Again, we’re at 100,000 per node. A lot of our early customers are seeing that they can buy 3 nodes instead of 20.” Fleishmann says that their first customers, including a cloud service provider and the EDA company Cadence, were so surprised by the initial benchmarks that they ran them again to make sure the results were correct.
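The "3 nodes instead of 20" figure can be sanity-checked against the per-node numbers Fleishmann quotes. The sketch below is only back-of-the-envelope arithmetic: it assumes aggregate IOPS scale linearly with node count, which is a simplification and not something Datera claimed, and the function and variable names are ours.

```python
import math

# Per-node figures as quoted in the interview.
CEPH_IOPS_PER_NODE = (6_000, 8_000)   # low and high end of the quoted range
DATERA_IOPS_PER_NODE = 100_000

def nodes_needed(target_iops: int, iops_per_node: int) -> int:
    """Smallest node count that reaches target_iops.

    Assumes linear scaling with node count (our simplification).
    """
    return math.ceil(target_iops / iops_per_node)

# Aggregate IOPS of a 20-node Ceph cluster at each end of the quoted range.
ceph_nodes = 20
ceph_aggregate = tuple(ceph_nodes * per_node for per_node in CEPH_IOPS_PER_NODE)

# Datera nodes required to match each aggregate figure.
datera_nodes = tuple(nodes_needed(iops, DATERA_IOPS_PER_NODE) for iops in ceph_aggregate)

# Aggregate IOPS of the 3-node Datera cluster from the quote.
datera_three_node_aggregate = 3 * DATERA_IOPS_PER_NODE

print(f"20 Ceph nodes: {ceph_aggregate[0]:,} to {ceph_aggregate[1]:,} IOPS")
print(f"Datera nodes to match: {datera_nodes[0]} to {datera_nodes[1]}")
print(f"3 Datera nodes: {datera_three_node_aggregate:,} IOPS")
```

Under that linear assumption, 20 Ceph nodes land at 120,000 to 160,000 IOPS, which two Datera nodes would cover on paper. The three-node figure in the quote leaves headroom, plausibly because a small cluster needs spare capacity for redundancy, though that reasoning is ours and not Datera's.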
Either that is some magnificent hyperbole or Datera has managed to carve some interesting inroads in the block storage space. One customer, Packet, a cloud infrastructure startup that provides on-demand bare-metal servers, says it is now able to “deliver a high performance, consistent and profitable elastic block storage service to our customers,” according to CEO Zachary Smith. “What makes Datera so unique is its software DNA. With Datera, we can use a true API-driven storage platform that can keep pace with our dynamic workload requirements and demanding automation needs. Datera Elastic Data Fabric self-describes and self-optimizes so we can easily and economically scale our storage service.”
“We started with block storage, so we started with high performance in mind,” Fleishmann says. “Ceph started with object storage where you can put block on top, but that is slow. We started the other way around and can now build in the other protocols with performance at the base.”
It is the automatic tuning that is a key feature of Datera’s approach, and one that sets it apart from Ceph as well. With Ceph, there are dozens of parameters that can be tuned, but as workloads change, keeping optimizations in place can be tricky. One of the critical bits of IP Datera has developed is an auto-tuning feature that continues to recalibrate and rebalance based on policies, affinities, and other aspects. Further, with API hooks into all common platforms, from OpenStack to Amazon EBS to the many other management interfaces, some of the operational complexity is lessened when compared to Ceph, says Fleishmann.
As seen below in the appliance mockup, there is little complexity on the hardware side, with rather basic configurations and component choices. Fleishmann says the key is in the software they’ve built, which emphasizes web-scale automation with complete programmability and policy-based configuration rooted in their Datera Elastic Data Fabric for auto-optimization. The “flash-first” approach, also seen below, delivers low latency across distributed and diverse storage types, which is how they seem to be striking a balance between performance and density.
Datera Elastic Data Fabric natively integrates through iSCSI with OpenStack, CloudStack, VMware vSphere and container orchestration platforms such as Docker, Kubernetes and Mesos. The products have been available for some time but the company didn’t launch until it had a set of public use cases to point to.
Perhaps most surprising is the price Datera quoted to The Next Platform. When matched with the IOPS per node capabilities Fleishmann described, a price of between 60 and 80 cents per gigabyte is worth noticing, and that is for the full hardware and software package with all bells and whistles. Users can also use Datera as software-only, but he says he expects the full suite to have the highest appeal given the price-performance dance they’ve choreographed.
The 100,000 IOPS for cents per gigabyte is a big claim, but with good backing by smart industry folks, one can expect Datera will be a company to keep a close eye on over the next couple of years, à la Pure Storage. The trick will be seeing what happens when they layer object and file protocols on top of their stack, which leverages basic storage hardware (we were unable to get any details there from Datera other than that it’s mixed storage media), and what that means for the new class of enterprise and hyperscale folks who want the Ceph-like capability to use one storage platform to tap into multiple possibilities.