Australia’s NCI Adds Ceph Object Storage To Lustre File Systems

Object storage has been drawing an increasing level of interest from organizations over the past several years as a convenient way to store and manage the growing quantities of data they are accumulating, especially when that may be a mix of structured and unstructured data and a lot of machine-generated telemetry.

While Amazon, with its S3 cloud storage service, can take much of the credit for popularizing object stores and their advantages of huge scalability and lower cost, not all organizations want to store data in the cloud. A number of vendors have also offered on-premises object storage systems over the years, and several of these have taken the opportunity to build solutions around open source, software-based storage platforms such as Ceph.

One such commercializer of Ceph is SoftIron, which develops a range of scale-out HyperDrive storage appliances with claimed wire speed performance and a Storage Manager console that it says vastly simplifies the monitoring and management of all the software and storage hardware, particularly for a sprawling estate of appliances.

These claims about ease of use and the cost savings that can be made from object storage proved persuasive enough for Australia’s National Computational Infrastructure (NCI), a high performance computing and data services facility located at the Australian National University in Canberra, to adopt SoftIron’s storage appliances in order to meet some of its new storage requirements.

NCI is one of two Tier One national HPC facilities funded by the Australian federal government, with the other located in Perth, on the opposite side of the country. NCI differs from a lot of other HPC sites in the way it tightly integrates its HPC compute and storage with its cloud facilities, which provide edge-based services that don’t fit into a traditional HPC workload environment, according to Andrew Howard, NCI’s associate director of cloud services.

NCI houses Australia’s fastest research supercomputer, the 9.26 petaflops “Gadi” system. Gadi has 3,200 nodes built with Intel “Cascade Lake” Xeon SP processors and Nvidia V100 GPU accelerators, interconnected by a 200 Gb/sec HDR InfiniBand fabric using a dragonfly+ topology. NCI handles a broad range of science workloads, according to Howard, while the Pawsey supercomputing center in Perth is largely dedicated to the work that’s going on for the Square Kilometer Array radio observatory.

“We tend to deal with every other [kind] of science activity other than high energy physics,” Howard tells The Next Platform. “So we are probably one of the largest network data consumers on the planet and we consolidate a large number of international datasets so they’re available for our users, and the data can be directly computed on our HPC and our cloud systems,” he says.

The HPC compute nodes are served by a storage subsystem composed of NetApp enterprise-class storage arrays running a Lustre parallel file system, connected by 200 Gb/sec or 400 Gb/sec InfiniBand. However, Howard states that NCI has been looking at a number of additional and emergent use cases that are more suited to Ceph-style object storage.

“Typically, they’re the sort of use cases for making large, more static datasets available. So, that’s where we need read performance for data publication, and where we need to share data between the HPC and cloud facilities in a performant manner,” says Howard.

By moving those large static datasets to object storage, NCI can avoid tying up its HPC file system, and instead keep this free for those demanding high intensity applications. Another advantage is that an object-based interface allows the facility to support very long lived URLs pointing to the data itself, which are published in data catalogues for researchers to access.
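To illustrate why an object interface lends itself to long-lived catalogue links, here is a minimal sketch in Python. It assumes an S3-compatible gateway that serves publicly readable objects using path-style addressing (endpoint, then bucket, then key). The endpoint, bucket, and key names are hypothetical and are not NCI’s actual configuration.

```python
from urllib.parse import quote


def object_url(endpoint: str, bucket: str, key: str) -> str:
    """Build a path-style URL for an object behind an S3-compatible gateway.

    Assumes anonymous read access and path-style addressing; both are
    illustrative assumptions, not details confirmed by the article.
    """
    # Percent-encode the key but keep "/" so the prefix hierarchy stays readable.
    return f"{endpoint.rstrip('/')}/{quote(bucket, safe='')}/{quote(key, safe='/')}"


# Hypothetical endpoint, bucket, and key, for illustration only.
ENDPOINT = "https://objects.example.org"

catalogue_entry = {
    "title": "Example satellite scene",
    "url": object_url(ENDPOINT, "satellite-imagery", "sentinel-2/2021/03/scene 001.tif"),
}

print(catalogue_entry["url"])
```

Because the URL depends only on the endpoint, bucket, and key, and not on which appliance or disk holds the data, the link published in a catalogue can stay the same while the hardware underneath is replaced.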

“This is particularly important in the area of climate and weather. We store around 30 years of all satellite imagery, which is available to Australian climate researchers, the Bureau of Meteorology, to perform the weather forecasting, and ongoing climate research in areas of agriculture,” Howard says.

Overall, the storage at the NCI facility is divided into roughly four areas, with the Lustre HPC file system representing the top level. The next level down comprises systems like the SoftIron Ceph storage, which provide a different sort of performance, according to Howard.

“The biggest differentiator is that Lustre is only available over InfiniBand. All of our other storage services are available over Ethernet. So we have a 100 Gb/sec Ethernet backbone across the whole facility,” he says.

Another level is more traditional volume-based storage, sitting at the same level as NCI’s cloud-based storage. Underpinning everything is a hierarchical storage system, where up to 30 years of scientific data is preserved in the archives of some disciplines, and which also serves as an ongoing backup facility as users continue to generate more data and perform more computation on it.

“One of our typical workflows, and an area we plan on using the SoftIron equipment, is where we receive data from the ESA Sentinel satellites, perform quality analysis to ensure a coherent data capture using our NIRIN cloud, publish the data to a collection for longitudinal analysis as part of an HPC workflow then publish the analysis results back into the collection,” Howard says.

Using an S3 interface here, which almost all object storage offers these days, allows the NCI facility to separate the ongoing production data capture from its quarterly maintenance cycle for facility work (power, cooling, building work), he explains.

NCI also has similar workflows for data received from telescopes, genome sequencers, sensor clouds, and high resolution agriculture imagery for monitoring crop development. All of these share the need for a QA process on the data, followed by a processing stage for data augmentation (such as geo-rectification or cloud removal from satellite imagery), which allows the site to utilize the most energy efficient platform for the task, according to Howard.

When it comes to the reasons for choosing SoftIron, Howard says that it was not only the ease of use that SoftIron offers on top of Ceph, but also the modularity and the low maintenance effort that is required by the HyperDrive storage appliances.

“Typically, we have about half of one of the team members looking at SoftIron on an occasional basis, whereas the support overhead for Lustre is one to two team members that are needed on an ongoing basis. But our investment in Lustre is significantly larger, over 80 PB at the moment, so it’s kind of horses for courses,” he said.

The graphical user interface and the additional tools that SoftIron has added to its implementation of Ceph include a single pane of glass for monitoring the health of the service. When a disk fails, this indicates exactly which disk it is and what action needs to be taken.

This comes down to something as simple as a “Ceph button” on each drive caddy in a HyperDrive appliance, which when pressed tells it that maintenance is going to be performed on those drives. The team member can then remove the caddy, change the failed disk, replace the caddy, then press the Ceph button again to tell the appliance that the drives are available for use again.

“That really lets us assign less experienced staff members onto maintaining SoftIron, because they just get pointed in exactly the right direction in terms of what they need to do for hardware swap over and maintenance,” Howard explained.

“Anyone can put together their own Ceph cluster, and for the most part it works really, really well. But when things go wrong, just being able to find where the faults are and a really nice GUI interface to help with the maintenance, those just make life a whole lot easier in terms of being able to provide a really resilient high performance system for our users,” he added.

While Ceph provides an S3-compatible interface, objects can also be served via an NFS interface, which could be useful for further integration with more traditional HPC applications.

“We’ve been an HPC facility for around twenty years in various guises. So NFS has been an integral part of our service offerings. We’ve got workloads that access Lustre through NFS in our cloud context, we’ve got other NFS services, some of our legacy services are still using NFS, so it still has a place as one of the most widely used storage access protocols that exist between both HPC and cloud, and the fact that SoftIron can also provide an NFS interface as well was just another tick on the box that if we need that, it’s there, we just need to turn it on,” Howard says.

NCI is initially taking delivery of enough SoftIron HyperDrive appliances to provide 12.5 PB of object storage. These are currently being delivered and will be installed once enough space has been made available in the datacenter.

“Running a datacenter is kind of like trying to shuffle deck chairs. There’s never enough power, there’s never enough space, and we typically use all the space that we’ve got. So it’s a matter of retiring out some old equipment and clearing out five racks to allow this to be installed,” Howard says.
