Looking Through the Windows at HPC OS Trends
January 30, 2017 Ben Cotton
High performance computing (HPC) is traditionally considered the domain of large, purpose-built machines running some *nix operating system (predominantly Linux in recent years). Windows is given little, if any, consideration. Indeed, it has never accounted for even a full percent of the Top500 list. Some of this may be due to technical considerations: Linux can be custom built for optimum performance, including recompiling the kernel. It is also historically more amenable to headless administration, which is a critical factor when maintaining thousands of nodes.
But at some point does the “Windows isn’t for high-performance computing” narrative become self-fulfilling? Sites that place a machine in the Top500 are likely to continue using the same operating system on subsequent systems in order to take advantage of acquired skills and supporting infrastructure. The Top500 list certainly provides insight into the current state and trends in the HPC field, but it does not tell the whole story. Unless the field of HPC is defined only in terms of the current Top500 list, there must be more out there that we don’t normally see.
Microsoft has offered HPC solutions for years. Windows Compute Cluster Server (CCS) 2003 (so named because it is a derivative of Windows Server 2003, despite being released in 2006) brought a dedicated HPC offering to Microsoft’s lineup. It included MS-MPI – an implementation of version 2 of the Message Passing Interface (MPI) standard – which offered support for the high-performance interconnects of the time: Gigabit Ethernet, InfiniBand, and Myrinet.
After CCS 2003 came Windows HPC Server 2008 R2. Like CCS 2003, this was a derivative of the core Windows Server product. The “Magic Cube” system from the Shanghai Supercomputing Center, running Windows HPC Server 2008, made its Top500 debut at #11 in November 2008 and remained on the list through June 2015. So it’s clear that Microsoft had some success, even though we could not find specific adoption numbers for CCS 2003 or HPC Server 2008.
HPC Server gave way to HPC Pack, an add-on that provides HPC functionality to Windows Server. This coincides with a shift toward a focus on “big compute”, which encompasses both traditional HPC and batch and desktop workloads, such as the large spreadsheets that are pervasive in business. While Linux rules in fields like computational fluid dynamics and genomics, Windows applications are still a significant part of operations in financial services such as insurance and hedge funds. These workloads may not be tightly coupled traditional HPC applications, but they still represent computing on a large scale. Given this, it may not be a surprise that Azure Batch arrived a year and a half before a similar offering from Amazon Web Services.
In the end, the problem is one of applications. Many of the applications used on traditional HPC systems began as Unix programs. Drawing those users to Windows will take more than an “if you build it, they will come” approach. A presentation by Purdue University staff showed that while Windows represented nearly 20% of cores in the campus HTCondor pools, Windows jobs accounted for only half of a percent.
Without a compelling reason to move, Windows may have difficulty finding a toehold on the Top500 list. Nonetheless, computation still happens on the platform. Microsoft’s recent efforts to embrace the Docker ecosystem may shift the landscape as well. Docker may not be a good fit for MPI workloads, but with the ability to run either Linux or Windows containers, the flexibility of Windows may be very appealing to sites with heterogeneous batch computing needs. With Microsoft investing in InfiniBand interconnects on Azure and its Azure Batch service, it’s clear that Redmond still sees a place for its flagship operating system in the HPC world.