
The Skills Gap For Fortran Looms Large In HPC

Back in the dawn of time, which is four decades ago in computer science and which was before technical computing went mainstream with the advent of Unix workstations and their beefy server cousins, the computer science students we knew at college had taught themselves BASIC on TRS-80s or Commodore VICs and went off to school to learn something useful like COBOL, maybe with a smattering of C and Pascal, or occasionally even RPG, for variety. And more often than not, they learned these “real” programming languages on an IBM mainframe or minicomputer, and sometimes on a DEC VAX.

The nerds all learned to program in Fortran, which came on the scene in 1957, two years before COBOL’s 1959 debut, and which was used to digitize the complex formulas underpinning scientific workloads. Again, usually on either an IBM mainframe or a DEC VAX. By the time we learned a little Fortran at Penn State in the mid-1980s as part of an aerospace engineering degree on an IBM 3081, the jobs were all interactive, but the hallways were still lined with walls of punched card boxes as legacy program storage. They represented a very weird past that bore more resemblance to a loom than a computer, as far as we were concerned. At that very moment, the university was installing Sun Microsystems workstations in its first technical computing lab, and in came Unix and C alongside Fortran. And everything began to change. Penn State never became a supercomputing powerhouse like some American universities did, but the computing that was done there was typical of the time.

Fast forward to today: if you look at the course catalog for aerospace engineering at Penn State, the only Fortran available for students to play with is one that is somehow tied to oneAPI, which makes no sense to us because oneAPI is supposed to be data parallel C++. The programming courses in the aerospace engineering degree are dominated by C++, with C and MATLAB also in use. What happened to Fortran?

A better question might be: What is going to happen to Fortran? That is precisely the question posed in a report put together by two researchers at Los Alamos National Laboratory, which has quite a few Fortran applications that are used as part of the US Department of Energy’s stewardship of the nuclear weapons stockpile for the United States. (We covered the hardware issues relating to managing that stockpile a few weeks ago, and now we are coincidentally talking about separate but related software issues.) The researchers who have formalized and quantified the growing concerns that many in the HPC community have talked about privately concerning Fortran are Galen Shipman, a computer scientist, and Timothy Randles, the computational systems and software environment program manager for the Advanced Simulation and Computing (ASC) program of the DOE, which funds the big supercomputer projects at the major nuke labs, which also include Sandia National Laboratories and Lawrence Livermore National Laboratory.

The report they put together, called An Evaluation Of Risks Associated With Relying On Fortran For Mission Critical Codes For The Next 15 Years, can be downloaded here. It is an interesting report, particularly in that Shipman and Randles included comments from reviewers that offered views contrary to their own, just to give a sense that this assessment of Fortran is not necessarily universal. But from our reading, it sure looks like everyone in the HPC community who has Fortran codes has at least some concerns.

Employment Is Job One

Just for fun, we broadened the set of search queries on Indeed.com that they ran at the beginning of the report, to provide some context for the skills shortage facing the DOE labs that still have legacy Fortran applications, four decades after Fortran was a core part of an engineering education and was not unfamiliar to computer science graduates, either.

If you go to Indeed.com and scan for Fortran jobs today, you will find 1,050 openings in the United States. Even COBOL has 1,228 jobs. This is in stark contrast to C++, which has 32,896 job openings, with C/C++ being mentioned in 15,192 openings. Java has 54,838 job openings, and Python has 83,591 openings. Only RPG, which is the analog to COBOL on the Power Systems platform running the IBM i operating system, has fewer jobs than Fortran, at 659 openings. (And considering that there are still 120,000 companies worldwide using that IBM i platform, that says more about the constancy of the RPG programmers and their companies than it does about the strength of the market.)

Perhaps this is the case with the Fortran programmers of the world, too. But we can assure you that the RPG shops of the world are having their own skills crisis as experienced programmers start retiring – or getting sick or dying in some cases – and will do so at a growing pace in the coming years. Fortran is having no easier time, and neither is COBOL.

The good news for some HPC simulations and models, both inside of the National Nuclear Security Administration program at the DOE and in the HPC community at large, is that many large-scale physics codes have been rewritten or coded from scratch in C++, and moreover Python has become the dominant language for analysis applications – just like in the world at large. There is still a large pool of Java programmers who work on system and application programs written in that language, but any time you need performance, you usually don’t choose Java. If some Java application or framework is important enough, then it is often ported to C++ for performance reasons. (Java is too far away from the iron, even if it is in theory easier to program.)

The skills issue with Fortran is apparently not just about learning Fortran, but more about being associated with Fortran and all of the legacy baggage that comes with its vintage, and with its low marketability going forward.

“It should be noted that training staff in the use of Fortran is not a major challenge if the staff member has sufficient experience in another programming language,” the Los Alamos researchers write. “Attracting (and retaining) staff in these large Fortran projects may prove more difficult. It is also possible that as the pool of Fortran developers continues to decrease, the demand for this skill set on legacy code bases across the industry will remain flat for quite some time, meaning increased competition for the relatively few developers with deep Fortran expertise. This has the potential to further erode retention and our ability to compete on salary.”

This is a different problem from the technical development of Fortran itself, which was explained well in the State of Fortran 2022 edition published by the IEEE last March, which you can see here. But even this report admits Fortran has its issues, and outlines them thus:

“First, the lack of a standard library, a common resource in modern programming languages, makes mundane general-purpose programming tasks difficult. Second, building and distributing Fortran software has been relatively difficult, especially for newcomers to the language. Third, Fortran does not have a community maintained compiler like Python, Rust or Julia has, that can be used for prototyping new features and is used by the community as a basis for writing tools related to the language. Finally, Fortran has not had a prominent dedicated website – an essential element for new users to discover Fortran, learn about it, and get help from other Fortran programmers. In the same way, Fortran is no longer widely taught to university students or valued as a useful skill by industry. As a consequence, adoption of new users has been stagnating, large scientific Fortran projects have been migrating to other languages, and the communities of Fortran programmers remained scattered and isolated.”
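That first point, the lack of a standard library, is easy to underestimate. As a minimal sketch of our own – not code from either report – here is the kind of hand-rolled helper a Fortran programmer ends up writing for a chore that is a one-liner in Python, such as line.split(): counting the whitespace-separated tokens in a string using only standard intrinsics.

```fortran
! A hand-rolled version of a chore that Python's line.split() does in
! one call. This program and its helper are our own illustration.
program split_demo
  implicit none
  character(len=*), parameter :: line = "  density  pressure  temperature "

  print '(a,i0,a)', "found ", count_tokens(line), " tokens"

contains

  ! Count whitespace-separated tokens by watching for blank-to-nonblank
  ! transitions; standard Fortran offers no split() to do this for us.
  integer function count_tokens(s) result(n)
    character(len=*), intent(in) :: s
    integer :: i
    logical :: in_token

    n = 0
    in_token = .false.
    do i = 1, len_trim(s)
      if (s(i:i) /= ' ') then
        if (.not. in_token) n = n + 1
        in_token = .true.
      else
        in_token = .false.
      end if
    end do
  end function count_tokens

end program split_demo
```

The community-run fortran-lang stdlib project has been filling exactly this gap, but it is an add-on that you have to go find, not something the compilers ship by default.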

And as a consequence, as the report from the Los Alamos researchers points out, in many cases the DOE’s HPC centers may be the only customers pushing for Fortran support on future dataflow engines or other novel architectures that might come along. This is the real crusher. The sense we get from both reports is that the lack of a standard library is not so much of an issue when it comes to CPU-only parallel processing, where Fortran works well. Support for GPU accelerators is weaker and more fragmented, and the researchers called out the current “Frontier” exascale system at Oak Ridge National Laboratory and the impending “El Capitan” exascale system that is set to go into Lawrence Livermore National Laboratory, whose software ecosystems support C and C++ applications a whole lot better than they do Fortran applications. “Multiple competing standards for Fortran-based GPU programming with varied levels of robustness and support exist today (Fortran OpenMP Target offload and OpenACC),” the researchers write. “Neither of these technologies is robustly supported on the AMD GPU (MI250) today.”
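To make that fragmentation concrete, here is a minimal sketch of our own, not taken from the report, showing the same axpy loop offloaded to a GPU with each of the two competing directive standards. The physics is identical; the annotations, the data-movement clauses, and crucially the compiler support behind them are not.

```fortran
! The same loop offloaded two different ways. Our own illustration,
! not code from the Los Alamos report.

! Version 1: OpenMP target offload.
subroutine axpy_omp(n, a, x, y)
  implicit none
  integer, intent(in) :: n
  real, intent(in) :: a, x(n)
  real, intent(inout) :: y(n)
  integer :: i
  ! Send x to the device, copy y both ways, and spread the loop
  ! across GPU teams and threads.
  !$omp target teams distribute parallel do map(to: x) map(tofrom: y)
  do i = 1, n
    y(i) = a * x(i) + y(i)
  end do
end subroutine axpy_omp

! Version 2: OpenACC.
subroutine axpy_acc(n, a, x, y)
  implicit none
  integer, intent(in) :: n
  real, intent(in) :: a, x(n)
  real, intent(inout) :: y(n)
  integer :: i
  ! Same intent, different spelling: parallelize the loop on the
  ! device with OpenACC data clauses.
  !$acc parallel loop copyin(x) copy(y)
  do i = 1, n
    y(i) = a * x(i) + y(i)
  end do
end subroutine axpy_acc
```

Which version runs, and runs well, depends on the compiler: gfortran understands both directive sets, Flang is still gaining OpenMP offload, and the vendor compilers as a rule favor one or the other. That is precisely the robustness problem the researchers are flagging on the MI250.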

To be fair, that is why Oak Ridge and Lawrence Livermore got access to cheap flops – so they can help fix the software issues. Hence, the Exascale Computing Project in the United States is working on Flang, an open source Fortran compiler based on the LLVM compiler toolchain, which is a kind of middleware that sits between a compiler front end and a compute engine’s instruction set at the back end. Sourcery, now a part of Siemens Software – yes, a division of that German industrial giant – is working on OpenMP target offload and OpenACC backends for the gfortran compiler in the open source GCC compiler stack that is often affiliated with Linux. “Both efforts are largely reactionary, due to poor community and technology provider support for Fortran on advanced technologies,” Shipman and Randles say in the report.

Things often are, would be our observation. The question is whether the reaction will be strong enough to overcome inertia. . . .

In the report, Shipman and Randles enumerate seven risks that Fortran faces in HPC, and their overall assessment is blunt:

“Our assessments lead us to the view that continued use of Fortran in our mission critical codes poses unique challenges for LANL,” they say, and indeed for any other HPC center that relies on Fortran. “While Fortran will continue to be supported at some level, particularly on CPU-based systems, the outlook for advanced technology systems is dim. The ability to leverage broader and more modern open source technologies/frameworks is unlikely, increasing the cost of new physics and feature development.”

While not bleak, that assessment is also not encouraging. But, maybe it will encourage the powers that be, who control the purse strings, to invest more in Fortran tools and Fortran programmers. This will be a heck of a lot cheaper than porting all of those applications. But, in the longest of runs, maybe that will be necessary anyway as Fortran programmers age out and the pool of programmers who can – and will – take on NNSA codes and not only master them, but enhance them, shrinks.

Our observation would be this. Back in the early 1990s, it was not precisely cool to be a nuclear weapons designer. But the Comprehensive Test Ban Treaty of 1996 and the massive investment in simulation and modeling, coupled with the test data from over 1,000 nuclear bomb explosions, gave a different set of people new tools to tackle a bigger problem, and it attracted some of the smartest minds on the planet to design systems and software to take on the simulation task. Fortran needs something to make it cool again, something more than nuclear weapons, which for a lot of the youth these days is not a plus, but a minus. It certainly was for us, which is how we ended up writing The Next Platform, among other things, and coming to a kind of détente of our own that these weapons do exist and they do perhaps provide some kind of deterrent to World War III. If you want to sell it to this generation, that is how it might be done. As crazy as that might sound.
