HPC environments, by their very nature, tend to be large and are usually quite complex. Whether it’s pushing the boundaries in life and physical sciences or supporting reliable engineering, it takes many computers operating together to analyze or simulate the problems at hand. The quantity of data required, and the access performance to keep all those computers busy, can only be met by a true parallel file system, one that maximizes the efficiency of all storage media in a seamless, total-performance storage system.

The PanFS® parallel file system delivers the highest performance among competing high-performance computing (HPC) storage systems at any capacity and takes the complexity and unreliability of typical HPC storage systems off your hands, all on commodity hardware at competitive price points.

PanFS orchestrates multiple storage servers into a single entity that serves your data to your compute cluster. Through sophisticated software, multiple storage servers, each with HDDs and/or SSDs attached, work together to deliver hundreds of gigabytes per second (GB/s) of read and write throughput to your HPC applications. PanFS manages this orchestration without manual intervention, automatically recovering from failures and continuously balancing the load across those storage servers and scrubbing the stored data for the highest levels of data protection.
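To make the idea of many storage servers acting as one more concrete, here is a minimal, generic sketch of round-robin striping, the general technique parallel file systems use to spread a file across servers. It is an illustration only and not the PanFS layout algorithm: the function names, the stripe unit, and the per-server throughput figure are all hypothetical, and the bandwidth estimate assumes ideal linear scaling.

```python
def stripe_round_robin(file_size: int, stripe_unit: int, num_servers: int) -> dict[int, int]:
    """Spread a file over servers in fixed-size chunks, round-robin.

    Returns a mapping of (hypothetical) server index -> bytes stored there.
    Generic illustration only; not the actual PanFS placement logic.
    """
    if file_size < 0 or stripe_unit <= 0 or num_servers <= 0:
        raise ValueError("file_size must be >= 0; stripe_unit and num_servers must be > 0")

    bytes_per_server = {server: 0 for server in range(num_servers)}
    offset = 0
    chunk_index = 0
    while offset < file_size:
        # The final chunk may be shorter than the stripe unit.
        chunk = min(stripe_unit, file_size - offset)
        bytes_per_server[chunk_index % num_servers] += chunk
        offset += chunk
        chunk_index += 1
    return bytes_per_server


def aggregate_bandwidth_gbs(num_servers: int, per_server_gbs: float) -> float:
    """Idealized aggregate throughput, assuming perfectly linear scaling."""
    return num_servers * per_server_gbs


if __name__ == "__main__":
    # A 1 GiB file, 1 MiB stripe unit, 8 hypothetical servers.
    layout = stripe_round_robin(1 << 30, 1 << 20, 8)
    for server, stored in layout.items():
        print(f"server {server}: {stored} bytes")
    # Assumed figure: 2.5 GB/s per server, chosen only for illustration.
    print(aggregate_bandwidth_gbs(100, 2.5), "GB/s (idealized)")
```

Because every server holds a near-equal share of each large file, clients can read or write all of the shares at once, which is why aggregate throughput grows with the number of servers. Real systems fall short of this linear ideal because of network, metadata, and rebalancing overheads.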

PanFS was the first storage system designed with the parallel file system architecture that is the de facto dominant storage architecture in HPC systems to this day. While the foundation for PanFS was laid over 20 years ago, the file system continues to adopt the latest technology advancements to provide the exceptionally high performance, reliability and low-touch administration our customers have come to expect and rely upon.

In this document, we’re going to take a “breadth-first” tour of the architecture of PanFS, looking at its key components and then diving deep into the main benefits.