Bridging Object Storage And NAS In The Enterprise
December 7, 2017 Jeffrey Burt
Object storage may not have been born in the cloud, but the major public cloud providers like Amazon Web Services, Microsoft Azure, and Google Cloud Platform have been its biggest drivers.
The idea of object storage wasn’t new; it had been around for about two decades. But as the cloud service providers began building out their datacenters and platforms more than a decade ago, they were faced with the need to find a storage architecture that could scale to meet the demands brought on by the massive amounts of data being created, as well as the need to move the data more easily over the Internet.
The key to the latter issue was the development of the S3 protocol by AWS, which isn’t to be confused with the cloud company’s Simple Storage Service (S3) data storage platform, which was among the first services AWS rolled out when it officially launched in 2006. The S3 protocol that the S3 service relies upon was the solution that made moving data back and forth between the cloud and datacenter easier. Other companies have adopted the S3 protocol and it has become a standard interface to object storage, more so than the Swift alternative that came out of the OpenStack community through Rackspace Hosting.
There are a lot of reasons to like object storage. Such systems can scale easily and quickly to thousands of petabytes and can cost up to 20 percent less to manage. With object storage systems, data is managed as objects rather than file systems or blocks, with the objects including the data, metadata that make querying and analyzing the data easier, and a globally unique identifier, which helps drive the scalability of the technology.
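To make the object model described above concrete, here is a minimal Python sketch. It is an illustration only, not any vendor's implementation: the class names `StoredObject` and `FlatObjectStore` are hypothetical, and an in-memory dictionary stands in for the storage cluster. It shows the three parts of an object (data, metadata, unique identifier) and a flat namespace keyed by that identifier.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """Hypothetical object: payload, descriptive metadata, and a unique ID."""
    data: bytes
    metadata: dict = field(default_factory=dict)
    # uuid4 is used here as a stand-in for a globally unique identifier.
    object_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class FlatObjectStore:
    """Hypothetical store: one flat namespace, no directory hierarchy."""

    def __init__(self):
        self._objects = {}

    def put(self, data, **metadata):
        """Store a payload with its metadata and return the new object's ID."""
        obj = StoredObject(data=data, metadata=dict(metadata))
        self._objects[obj.object_id] = obj
        return obj.object_id

    def get(self, object_id):
        """Fetch an object directly by ID; no path traversal is involved."""
        return self._objects[object_id]

    def find(self, **criteria):
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            obj.object_id
            for obj in self._objects.values()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]
```

Because each object is addressed by its identifier rather than by its position in a directory tree, a lookup does not depend on where the object physically lives, which is one reason a flat namespace is easier to spread across many nodes.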
The first object storage system appeared from EMC more than a decade ago, but it was the S3 protocol that helped fuel adoption of the technology, according to Jon Toor, chief marketing officer at Cloudian, a datacenter storage vendor that has based its object storage solutions on the S3 standard.
The protocol “is one of the two key ingredients you have to have to make object storage take off,” Toor tells The Next Platform. “You have to have a compelling need – in this case, it was capacity – and you have to have a standard, which in this case was S3. Those two things came together and they’ve really driven the accelerated growth of object storage. We’re seeing unstructured data just grow beyond the limits of what conventional [storage] can handle. There are a lot of different use cases where people need more scalable, more cost-effective solutions. Object storage is ideal for that.”
Cloudian is one of a number of vendors large and small looking to drive enterprise adoption of object storage. Others include DataDirect Networks with its WOS portfolio, Dell EMC and its Elastic Cloud Storage (ECS), Scality, and Exablox. Cloudian has about 140 customers, with 40 percent in the service provider space, including Interoute and Schuberg Philis, while the other 60 percent are in such industries as media and entertainment, healthcare, and retail.
Many of these enterprises face the same challenges as the largest service providers – a rapid increase in the amount of structured and unstructured data that is being created, and the need to find cost-effective and scalable ways to store and manage it.
“We took the same scale-out technology that you would find in the cloud, whether it was Google, Amazon, or Microsoft, and deployed it as something people can use in their datacenter,” Toor says, noting that the company sells its object storage technology both in an appliance, HyperStore, and as software that can run on industry-standard servers. HyperStore packs as much as 840 TB of capacity in a 4U rack enclosure, and its modular design means that capacity can scale by adding more appliances into the environment. Such easy scalability is important at a time when data for mainstream companies can easily pass the petabyte level. “In enterprise storage, a petabyte seemed really big. Now a petabyte is something people can fill with 100 hours of high-definition content, so it’s not as big as it used to be.”
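As a rough illustration of what that density means, the following Python sketch estimates how many 4U enclosures a given raw capacity would take. Only the 840 TB per 4U figure comes from the article. The function names are hypothetical, and the sketch assumes decimal units (1 PB = 1,000 TB), fully populated enclosures, and no capacity set aside for replication or erasure coding.

```python
TB_PER_ENCLOSURE = 840      # maximum raw capacity per 4U enclosure (from the article)
RACK_UNITS_PER_ENCLOSURE = 4


def enclosures_needed(target_tb):
    """Smallest whole number of enclosures whose raw capacity covers target_tb."""
    if target_tb <= 0:
        return 0
    # Integer ceiling division avoids floating-point rounding.
    return -(-target_tb // TB_PER_ENCLOSURE)


def rack_units_needed(target_tb):
    """Rack space, in U, taken up by those enclosures."""
    return enclosures_needed(target_tb) * RACK_UNITS_PER_ENCLOSURE
```

Under these assumptions, one petabyte of raw capacity fits in two enclosures, or 8U of rack space. Usable capacity in a real deployment would be lower once data protection overhead is taken into account.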
One of the challenges of growing object storage use in enterprise datacenters is that there are relatively few applications that leverage the S3 storage protocol. There are some applications in such industries as healthcare and media, but they are not as common as those that work with NFS, Windows and Linux files, which Toor says is still the dominant way unstructured data is stored. Cloudian this week rolled out HyperFile, a system designed to more easily bring files into the cloud data storage world.
“Our system is a cluster of boxes that speak S3,” he says. “It’s a storage protocol like SCSI and Fibre Channel are. Like anything else, it has commands, then you send it blocks of information, a storage file system, just like anything else. If you look at a NAS box like from NetApp, they have a bunch of Fibre Channel drives on the drive shelves, and on top they have the NAS head; it’s a server that speaks to NFS and Windows and Unix file servers to the outside world. Within the storage environment, it speaks Fibre Channel to the drives. It’s acting as an interpreter and a file manager in between these two worlds.”
HyperFile speaks NAS like any file server does to clients, “but through the back-end storage, instead of using Fibre Channel drives, it uses S3-connected storage. It performs the exact same thing as a NetApp or [Dell EMC] Isilon head would do in that system. The only difference is, instead of talking Fibre Channel to the back-end storage, we talk S3 to the back-end storage. Other than that, the functionality is identical.”
It includes such NAS features as support for CIFS and NFS, non-disruptive failover, POSIX compliance, Active Directory integration and snapshot capabilities. It also includes a data migration engine for transferring files from NAS systems to HyperFile. It comes in two versions with different levels of capabilities and can save enterprises as much as 70 percent over traditional NAS systems, according to Cloudian. Pricing starts at a half-cent per gigabyte per month.
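The starting price quoted above is easy to scale up. The short Python sketch below does that arithmetic. It assumes decimal units (1 PB = 1,000,000 GB) and a flat rate with no volume discounts or other charges; the article states none of this beyond the half-cent figure. The function name is hypothetical.

```python
from decimal import Decimal

# Starting price from the article: half a cent per gigabyte per month.
PRICE_PER_GB_MONTH_USD = Decimal("0.005")


def monthly_cost_usd(gigabytes):
    """Monthly cost in dollars at the flat starting rate (an assumption)."""
    return PRICE_PER_GB_MONTH_USD * gigabytes


one_petabyte_gb = 1_000_000                       # decimal units, assumed
per_month = monthly_cost_usd(one_petabyte_gb)     # 5,000 dollars
per_year = per_month * 12                         # 60,000 dollars
```

At that rate, a petabyte works out to about $5,000 per month, or $60,000 per year, before any pricing tiers that the article does not describe.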
“It talks S3 to the storage, but when you store a file using HyperFile, that data can then be accessed by any S3 application as well,” says Toor. “You can talk to our platform using an S3 application, you can talk to our platform using a file-based application, and you can access the same data. It’s completely interchangeable between those two worlds. What that does for people is now they’ve got an S3 object that has file data in it that you can read from S3 applications or from file applications. We can then move that S3 object into the cloud if you want, and that data is still readable by any application in the cloud. You can store a file in your datacenter, it goes to our Cloudian environment, you can migrate that object to the cloud and then you can still read it from any cloud application. What this does is provide data portability – the ability to move data around between environments, which is the next wave we’re seeing from people. They got very excited about the cloud, they found out what the cloud is good for, what it’s not so good for. Everyone recognizes now that the world is going to exist in both places – on-prem and in the cloud.”
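The interchangeability Toor describes can be pictured with a small Python sketch. This is not Cloudian's code or API. `InMemoryBucket` and `FileGateway` are hypothetical names, the bucket is an in-memory stand-in for S3-compatible storage, and the mapping of one file path to one object key is an assumption made for illustration.

```python
import posixpath


class InMemoryBucket:
    """Stand-in for an S3-compatible bucket: a flat map from key to bytes."""

    def __init__(self):
        self._objects = {}

    def put_object(self, key, body):
        self._objects[key] = body

    def get_object(self, key):
        return self._objects[key]

    def list_keys(self, prefix=""):
        return sorted(k for k in self._objects if k.startswith(prefix))


class FileGateway:
    """Hypothetical NAS head: file paths in front, object keys behind."""

    def __init__(self, bucket):
        self._bucket = bucket

    @staticmethod
    def _key_for(path):
        # Assumed mapping: the normalized path without its leading slash.
        return posixpath.normpath(path).lstrip("/")

    def write_file(self, path, data):
        self._bucket.put_object(self._key_for(path), data)

    def read_file(self, path):
        return self._bucket.get_object(self._key_for(path))

    def list_dir(self, path):
        """Rebuild one directory level from the flat key namespace."""
        key = self._key_for(path)
        prefix = key + "/" if key else ""
        names = {
            k[len(prefix):].split("/", 1)[0]
            for k in self._bucket.list_keys(prefix)
        }
        return sorted(names)
```

In this sketch, a file written through the gateway can be read directly from the bucket by key, and an object put into the bucket by an object-based application shows up as a file in the gateway's directory listing. The directory tree exists only as a view over key prefixes.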
The object storage solution isn’t for every workload. Toor notes that object storage is highly scalable and cost-efficient, but isn’t as fast as traditional storage, so latency-sensitive applications like databases and transaction processing, and those using block storage, won’t work well.
“It’s the latency, and the time it takes to find a file is longer with object storage,” he says. “It can be longer by a factor of 100 milliseconds, or 200 milliseconds, so you could be looking at a half-second delay to find a file potentially. That right there says the application user is a guy who is moving large files and who is not so concerned about latency.”
Among the businesses adopting the HyperFile solution are Element Fleet, a GM fleet operations spinoff that manages about 40 million vehicles, and Satellite Applications Catapult, which manages satellite data.