Swiss Army Knife File System Cuts Through Petabytes
May 1, 2017 Jeffrey Burt
Petabytes are in the future of every company, and luckily, the IT ecosystem is always inventing new ways to handle them.
Those wrestling with tens to hundreds of petabytes of data today are constantly challenged to find the best ways to store, search and manage it all. Qumulo was founded in 2012 and came out of the chute two years ago with the idea of a software-based file system with built-in analytics that lets the system increase capacity as the amount of data grows. QSFS, now called Qumulo Core, also aims to do it all: it is fast with both big and small files and scales like object storage. According to Qumulo, it is also easier to manage and more in line with the increasingly software-defined and digital nature of modern organizations than legacy high-end, high-performance storage arrays like Dell EMC’s Isilon. QSFS gives companies an alternative at a time when the world is moving from analog to digital and is seeing the rise of such game-changing trends as software-as-a-service (SaaS), the Internet of Things (IoT), the cloud and big data.
“Customers are struggling at the petabyte-scale,” Jay Wampold, vice president of marketing at Qumulo, tells The Next Platform. “Companies only have a few options. They can either continue to squeeze the life out of their aging architectures, they can write their own software out of open-source options that are out there, and companies don’t want to do that, or they can rewrite their applications to object-oriented, and companies don’t want to do that either.”
The vendor initially aimed its Qumulo Core network-attached storage (NAS) software at HPC organizations and the largest enterprises, which were seeing the greatest need for help in managing their fast-growing distributed file systems. They included companies in such fields as media and entertainment, life sciences, technology, and oil and gas. Over the past several months, Qumulo has worked to expand its reach in the enterprise in such areas as telecommunications, retail, automotive and healthcare by adding new enterprise-level features, and it is now turning its attention to the cloud. In November 2016, the company unveiled Core 2.5, which added snapshot capabilities that enable users to recover more quickly from errors or intrusions by creating policies and taking billions of snapshots that are available through NFS and SMB. The release also made the file system layout available as a graphical representation, designed to give administrators an easy way to view such metrics as throughput, capacity, activity and IOPS heat and to anticipate and deal with problems before they occur.
In February, with version 2.6, the company introduced machine-intelligent quotas for storage management that are built directly into the file system and can help administrators find and manage rogue applications and users, which, given the scale of these file systems, is no easy task. Because the quotas are native (built into the file system), they are always in sync with it, reduce the time needed to manage storage, and make it easier than in legacy systems to move pre-existing data and directories between quota domains. Every quota is treated as a policy that runs real-time queries and can be enforced immediately, which gives administrators a real-time view of storage allocation.
Qumulo started out on appliances, but is now allowing its file system to run on third-party hardware, partnering with Hewlett Packard Enterprise to offer the Core software on HPE’s Apollo servers, the highly dense systems aimed at HPC and similar high-end workloads. Qumulo has offered its Core software on its own QC-Series scale-out hybrid storage appliances, which come in 1U and 4U form factors and can scale up to 1,000 nodes. They are powered by Intel Xeon E3 and E5 chips, use up to 40 Gigabit Ethernet connectivity and provide up to 256 GB of memory and 360 TB of storage capacity. In November, Qumulo officials announced that the Core software will also be available on the HPE 2U Apollo 4200, which also can scale up to 1,000 nodes and is powered by two Xeon E5-2620 v4 eight-core processors. Storage in the HPE system includes nine 480 GB solid-state drives and 18 10 TB hard-disk drives. The Apollo systems with Qumulo Core are designed to manage hundreds of petabytes and tens of billions of files and objects.
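As a rough back-of-the-envelope check on those Apollo 4200 figures, the short Python sketch below adds up the raw drive capacity per node and for a full 1,000-node cluster. It assumes decimal units (1 TB = 1,000 GB) and counts raw capacity only. The article does not say how much space data protection consumes, so usable capacity would be lower, and the variable names are purely illustrative.

```python
# Raw capacity arithmetic using only the drive counts and sizes quoted above.
# Assumption: decimal units (1 TB = 1,000 GB, 1 PB = 1,000,000 GB).
SSD_COUNT, SSD_GB = 9, 480        # nine 480 GB solid-state drives
HDD_COUNT, HDD_GB = 18, 10_000    # eighteen 10 TB hard-disk drives
MAX_NODES = 1_000                 # stated scale-out limit

GB_PER_TB = 1_000
GB_PER_PB = 1_000_000

# Work in whole gigabytes to avoid floating-point rounding.
node_raw_gb = SSD_COUNT * SSD_GB + HDD_COUNT * HDD_GB
cluster_raw_gb = node_raw_gb * MAX_NODES

print(f"Raw per node:    {node_raw_gb / GB_PER_TB:.2f} TB")
print(f"Raw per cluster: {cluster_raw_gb / GB_PER_PB:.2f} PB")
```

Under these assumptions a single node holds about 184 TB raw and a maximum-size cluster about 184 PB raw, before any protection overhead.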
It was through that partnership with HPE that Qumulo scored DreamWorks Animation – the massive animated film studio that is known for such hits as the Shrek and Kung Fu Panda movies – as a customer. Media and entertainment companies have been early adopters of the vendor’s technology, which has been used in the making of such films as Pirates of the Caribbean, London Has Fallen, and, more recently, La La Land. Qumulo’s Wampold says the animated films created by companies like DreamWorks can involve billions of files, which can impact performance and manageability. On top of that, the files usually are a mix of large and small, increasing the management issues. The built-in analytics are key to easing those management issues, he says. “If you have a billion of anything, you can’t have humans managing those anymore,” Wampold says.
For DreamWorks, the myriad challenges included scalability and write performance for the large numbers of small files – more than 500 million for a film – as well as visibility of data and the need for more APIs for custom integration. Core provides REST APIs that enable integration with existing workflows. The animation studio, which had used HPE and NetApp storage in the past, replaced its legacy environment with the Apollo 4200 systems armed with Qumulo Core.
Having its software run on other vendors’ hardware also will help Qumulo make its next step forward, this one into the cloud, according to Wampold. The company over the next few months will announce several moves designed to expand Qumulo Core’s reach in private clouds and into public clouds, such as Amazon Web Services (AWS) and Microsoft Azure. Wampold – who came to Qumulo in February after more than a year as director of product marketing for AWS and four years in the last decade with EMC on its Isilon product line – wouldn’t go into specifics, but says organizations with petabyte-level storage needs that want to move some of their workloads to the cloud face issues similar to those wanting to keep them in-house: a lack of viable alternatives.
“There’s not a lot of good options of scale-out file systems in the cloud,” he says, adding that AWS’ success is due in part to the ability of customers to run a wide range of workloads on the cloud infrastructure. However, organizations are reluctant to move many of the applications that involve a lot of file-based data to the cloud. Many of those applications were built a decade ago to run on legacy file systems, and organizations do not want to rewrite the code for the cloud. With Qumulo Core running on systems in public clouds, there won’t be a need to rewrite it.
The move to the cloud is a natural one for Qumulo. As the company told The Next Platform when it launched out of stealth mode two years ago, QSFS was designed from the start for such environments. It’s a Linux application designed to run not only on the vendor’s own appliances, but also on bare-metal or virtualized instances running in private or public clouds.