The recent announcement by the CentOS project to discontinue mirroring Red Hat releases, which The Next Platform has already reported on, has hit some sectors hard. Those that depend on the freely available version of Red Hat Enterprise Linux are now cursing the CentOS project and looking for alternative solutions. In reality, there is no need to panic, particularly in the HPC community.
Looking forward, CentOS 8 will not matter all that much to HPC, and possibly to other sectors as well. Distributions will become second-class citizens to containers. Ultimately, all that is needed is a base operating system to run the container. Consider projects like Singularity, where everything needed to run an application is encapsulated in a secure container file that runs at pretty much bare-metal speed.
Years ago, in the early days of the Warewulf Cluster Toolkit, Greg Kurtzer – the creator of CentOS, Warewulf, and Singularity – and I talked about the idea of bundling a minimal operating system and the essential libraries with applications in custom Warewulf Virtual Node File System (VNFS) images. The scheduler would then boot the application VNFS image on the assigned node(s) and everything would just work, with no missing or outdated libraries or tools. Indeed, the Limulus Appliance clusters I have developed use open source, RPM-based Warewulf VNFS images and kernel bootstraps. Users can load or swap a new VNFS image using Yum and some basic Warewulf provisioning commands. Containers for HPC are basically the same idea, only better.
You no doubt noticed CentOS is one of the projects listed after Kurtzer’s name; he started the Community Enterprise Operating System and has since moved on to other projects. Now that Red Hat has moved CentOS Stream upstream of RHEL and is ending the downstream CentOS rebuild, Kurtzer has started the Rocky Linux project, which, like CentOS, is a Red Hat rebuild. In the HPC realm, Kurtzer has also been moving forward with Singularity containers and is now working on HPC next generation (HPCng for short). As part of the HPCng effort, a new version (V4) of the open source Warewulf Toolkit is under development. The new version provides container management capabilities.
Consider an open source HPC application that is released as a Singularity (or other) container. The entire application and all needed files are “contained” in a single file. The open Singularity base layer is available for all major operating systems and insulates the application containers from the underlying operating system. Using Singularity, you can also check the signature of the container to assure the provenance of the executable code. When the scheduler runs containers, the underlying Linux OS on the nodes does not matter all that much.
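To make the verify-then-run idea concrete, here is a minimal Python sketch of what a job wrapper might do. It assumes the `singularity` command line tool is installed on the node and uses its `verify` and `exec` subcommands; the image name `app.sif`, the `./solver` program, and the helper function names are hypothetical and exist only for illustration. By default it only prints the commands rather than running them.

```python
import shlex
import subprocess
from typing import List, Sequence


def verify_command(image: str) -> List[str]:
    """Build the command that checks the signature(s) stored in a SIF image."""
    return ["singularity", "verify", image]


def exec_command(image: str, app_args: Sequence[str]) -> List[str]:
    """Build the command that runs a program inside the container image."""
    return ["singularity", "exec", image, *app_args]


def run_verified(image: str, app_args: Sequence[str],
                 dry_run: bool = True) -> List[List[str]]:
    """Verify the image first, then run the application.

    With dry_run=True the commands are only printed. Otherwise each step is
    executed, and check=True raises an error if verification fails, so the
    application never starts from an image that did not verify.
    """
    steps = [verify_command(image), exec_command(image, app_args)]
    for step in steps:
        if dry_run:
            print(shlex.join(step))
        else:
            subprocess.run(step, check=True)
    return steps


if __name__ == "__main__":
    # Hypothetical image and program names, shown as a dry run.
    run_verified("app.sif", ["./solver", "--input", "case.dat"])
```

In a real cluster these two commands would more likely sit in the scheduler's job script; the point is only that nothing in them depends on which distribution the node happens to run.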
Software vendors will gladly do the same. Rather than supporting multiple distributions or picking favorites, vendors can focus on their application. Vendor HPC applications can be shipped in tested and protected containers. Again, the scheduler runs containers, not library-dependent binaries. Applications just work and there are fewer support issues for the vendor.
In both cases, the need to maintain library version trees and software Modules goes away. Of course, if you are an HPC developer writing your own application, you often need specific libraries, but not system wide. Build the application in your working directory, include any specific libraries you need in the local source tree, and fold it all into a container.
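As a rough illustration of "folding it all into a container," the Python sketch below generates a small Singularity definition file that copies a locally built application tree, including its private libraries, into an image. The `Bootstrap`, `From`, `%files`, `%environment`, and `%runscript` entries are standard definition file elements; the base image `centos:7`, the `/opt/myapp` paths, and the function name are placeholders chosen for this example, not anything prescribed by Singularity.

```python
from typing import Dict


def render_definition(base_image: str, files: Dict[str, str],
                      lib_dir: str, entrypoint: str) -> str:
    """Return the text of a Singularity definition file.

    base_image: Docker image used as the minimal OS layer (placeholder).
    files:      mapping of local source paths to paths inside the container.
    lib_dir:    directory inside the container holding the bundled libraries.
    entrypoint: program started when the container is run.
    """
    lines = ["Bootstrap: docker", f"From: {base_image}", "", "%files"]
    for src, dest in files.items():
        lines.append(f"    {src} {dest}")
    lines += [
        "",
        "%environment",
        # Make the bundled libraries visible to the application at run time.
        f"    export LD_LIBRARY_PATH={lib_dir}:$LD_LIBRARY_PATH",
        "",
        "%runscript",
        f'    exec {entrypoint} "$@"',
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical layout: ./build holds bin/ and lib/ from the local source tree.
    print(render_definition("centos:7", {"build/": "/opt/myapp"},
                            "/opt/myapp/lib", "/opt/myapp/bin/myapp"))
```

Saved as, say, `myapp.def`, the output could then be turned into an image with something like `singularity build myapp.sif myapp.def`, which typically needs root or fakeroot privileges. The libraries travel with the application, and nothing is installed system wide on the nodes.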
As an example, building Google TensorFlow can be challenging, and many users give up. TensorFlow is, however, available as a Singularity container. There is no need to develop a changing OS/library/tool pedigree that matches what the TensorFlow authors used for the latest version (which, by the way, often does not match the versions supplied by the Linux distribution).
By the way, Joe Landman, a familiar face in the HPC community, also had comments on this topic in his blog that are worth reading.
The bottom line is that containers provide a “fluidity” to HPC that de-emphasizes the underlying operating system. As far as the future of CentOS 8 in HPC goes, it is all good, thanks. We now have Rocky, Warewulf V4, and containers, and we will be moving on.
Douglas Eadline began his career as a practitioner and a chronicler of the Linux cluster HPC revolution, and his work has since grown to include big data analytics. Starting with the first Beowulf How-To document, Doug has written hundreds of articles, white papers, and instructional documents covering virtually all aspects of HPC cluster computing. Prior to starting and editing the popular ClusterMonkey.net website in 2005, he served as editor-in-chief for ClusterWorld magazine and was senior HPC editor for Linux Magazine. Currently, he is a writer and consultant to the HPC industry.