The husband-and-wife team of Abdurrahman and Tülay Ateşin are experimental scientists who unexpectedly became involved with supercomputers when they moved to Texas in 2013.
The duo are chemists, collaborators and professors at the University of Texas Rio Grande Valley (UTRGV). As experimental chemists they had not explored the extensive use of high performance computing until a colleague informed them that they had free access to some of the most powerful supercomputers in the world at the nearby Texas Advanced Computing Center (TACC). That changed everything.
One approach common in chemical research relies on serendipity. Essentially you try different combinations of approaches and materials until you find one that optimizes the reaction. You don’t have to understand how these components work – once you have the desired result, you’re done. The problem is that it is difficult to build on that research to make a better product or to gain a greater fundamental understanding of the kinds of reactions and structures involved. For example, photosynthesis is not well understood, making it hard to replicate in the lab a system that works so efficiently in nature. Researchers can either fall back on serendipity or – with increasing frequency – enlist the help of supercomputers to run simulations of the components and see how they react.
Even with a supercomputer, the computations can be so numerous and complex that it is impossible to investigate them all. This, however, does not deter TACC, which continues to add new capabilities to its HPC systems. The upgrades include software as well as hardware.
In addition to new, highly parallel software that takes advantage of today’s multicore high performance computers, some very useful computational chemistry programs that have been available for decades have been continuously updated and are still in use today. For example, Gaussian is a general-purpose computational chemistry software package that has been going strong since its release in 1970. Recent versions incorporate cutting-edge research in quantum chemistry. The TACC software library contains numerous other programs that are available to researchers like the Ateşins.
TACC hardware systems are keeping pace and the chemists have been able to produce simulated results of reactions that can then be compared to experimental data derived in the lab. One of the TACC flagship systems available to the chemists is Stampede2. The system entered full production in 2017 providing HPC capabilities to thousands of researchers across the U.S. This is an 18 petaflops system that builds on the successes of the original Stampede cluster it replaced. It features 4,200 servers based on “Knights Landing” Xeon Phi processors and 1,736 servers based on “Skylake” Xeon SP processors, both from Intel. The system uses a 100 Gb/sec Intel Omni-Path network fabric to link the nodes together.
Tülay Ateşin underscores the importance of HPC to their work using the example of a small job they just finished running on Stampede2. The project was part of their investigations designed to develop a scientific method for the efficient synthesis of complex organic molecules. Usually these bioactive molecules are found in nature, but in very small amounts – not enough to conduct any meaningful clinical studies. Synthesizing these complex molecules in the laboratory can require routes that are simply too expensive to allow synthesis on any practical scale. The team ran a hundred separate calculations per day, with each one lasting about a day. Tülay Ateşin notes that without the power of Stampede2, the process would have taken more than three months and would have been prohibitively expensive.
Part of the team’s work with the HPC cluster involves investigating cascade reactions, also known as domino or tandem reactions. In chemistry, a cascade reaction consists of at least two consecutive reactions such that each subsequent reaction occurs only in virtue of the chemical functionality formed in the previous step. (The term should not be confused with the cascades of chemical engineering, which are series of stages used in isotope separation, distillation, flotation and other separation or purification processes.) Starting with two or three simple ingredients, the researchers can build highly complex structures. Modeling these reactions requires the power of an HPC system – in fact, in one recent project the estimate to complete the optimization of a single reaction was ten to thirty years of processing time. Instead the chemists had to try to understand – one step at a time – what was happening in the complex reaction sequence. They also learned what was missing in the experiment, or what changed in the reaction if a component was added or removed.
In the not too distant past, researchers had to focus on achieving an end goal without having a fundamental understanding of the intricate reactions involved. One of the major hurdles to a more complete investigation into these reactions was the state of HPC at the time – powerful as they were, the supercomputers just weren’t powerful enough. Researchers relied on serendipity to locate materials that worked, without really knowing why they worked.
According to the Ateşins, the situation today is changing – there is more emphasis on learning how these chemical reactions work and how to make them work more effectively using computational methods in addition to the experimental methods. But, despite being able to tap into the awesome computational power of Stampede2 with its extensive software and support from TACC computer experts, the chemists admit, “We’re not there yet.”
However, recent developments at TACC may well prove invaluable to the researchers in their quest to unravel the complexities of the massive numbers of cascade reactions generated in their experiments. In August 2018, the National Science Foundation announced that it had awarded $60 million to TACC for the acquisition and deployment of a next generation supercomputer.
Known as Frontera (Spanish for frontier), the petascale supercomputer will be the fastest at any U.S. university and among the most powerful in the world. It will be about twice as powerful as Stampede2.
According to the TACC web site, the primary computing system will be provided by Dell/EMC and powered by Intel’s next generation Xeon processors code-named “Cascade Lake.” The initial configuration of the system will have 8,008 available compute nodes. Up to 80 percent of the available hours on Frontera – more than 55 million node hours each year – will be made available through the NSF Petascale Computing Resource Allocation program. Early user access is expected to begin in the late spring of 2019, with full system production anticipated by mid to late summer of 2019. The Ateşins plan to be among those making good use of the system.
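For readers who want to check the allocation figure, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope calculation, not TACC's own accounting: it assumes all 8,008 nodes are available around the clock for a 365-day year with no downtime, and the variable names are purely illustrative.

```python
# Back-of-the-envelope check of the Frontera figures quoted above.
# Assumptions (not from TACC): every node runs 24 hours a day for a
# 365-day year, with no maintenance windows or outages.
NODES = 8_008                  # compute nodes in the initial configuration
HOURS_PER_YEAR = 24 * 365      # 8,760 hours in a non-leap year
ALLOCATED_PERCENT = 80         # "up to 80 percent" of available hours

# Total node hours the system could deliver in a year under these assumptions.
total_node_hours = NODES * HOURS_PER_YEAR

# Upper bound on the share offered through the NSF allocation program.
allocated_node_hours = total_node_hours * ALLOCATED_PERCENT // 100

print(f"Total node hours per year:  {total_node_hours:,}")
print(f"80 percent of that total:   {allocated_node_hours:,}")
```

Under these idealized assumptions the 80 percent share works out to roughly 56 million node hours, which is consistent with the "more than 55 million" figure cited; any real downtime would lower the total somewhat.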
“What I like about TACC in general,” says Tülay Ateşin, “is that they want to hear what our needs are as they plan for the next generations of computer hardware and software. We know that supercomputers like Frontera will play an increasingly important role in our work as we explore new complex chemical reactions that were once beyond our reach and even our imagination.”
John Kirkley has been an editor and writer in the high tech world for more than 40 years, including serving as editor of Datamation magazine for over a decade. He created Computer magazine for the IEEE Computer Society and was its first editor. He was also the senior editor of The Exascale Report and the founding editor of the Digital Manufacturing Report.