NERSC Preps for Next Generation “Cori” Supercomputer
June 13, 2016 John Kirkley
The powerful Cori supercomputer, now being readied for deployment at NERSC (the National Energy Research Scientific Computing Center), has been named in honor of Gerty Cori. Cori was a Czech-American biochemist (August 15, 1896 – October 26, 1957) who became the first American woman to be awarded a Nobel Prize in science.
Cori (a.k.a. NERSC-8) is the Center’s newest supercomputer. Phase 1 of the system is currently installed, with Phase 2 slated to be up and running this year. Phase 1 is a Cray XC40 supercomputer based on the Intel Haswell multi-core processor with a theoretical peak performance of 1.92 petaflops. It has a number of new features that are specifically targeted at optimizing data-intensive science.
With Phase 2 now being deployed, the scientists and engineers at NERSC are transitioning the Center’s supercomputer workloads to a more efficient architecture. At the heart of the system is a Cray XC with more than 9,300 next-generation Intel Xeon Phi (Knights Landing [KNL]) compute nodes; these are manycore processors with on-package high-bandwidth memory. The system will have a sustained performance that is at least ten times that of the NERSC-6 “Hopper” supercomputer.
“Now that we’re moving into Phase 2 of the Cori implementation, we face a number of opportunities and challenges,” says NERSC’s Jack Deslippe, who is leading the team that is helping prepare scientific applications to run well on Cori Phase 2. “The biggest challenge facing the NERSC staff is the same as the one facing our users – how to get user code optimized for the new system.
“If you compare the Cori Phase 2 system with our current system, Edison, you’ll see there are some pretty striking differences,” he continues. “For example, Edison with its Intel Ivy Bridge processors has 12 cores per CPU and 24 virtual cores (from hyperthreading) per CPU. Cori, on the other hand, has several times as many physical and virtual cores per CPU.
“Although Cori has a much slower processor clock speed, every core on the system can do 32 double-precision operations per cycle across two vector units, as compared to Edison’s 8 double-precision operations per cycle. Cori has approximately 1.4 GB of traditional DRAM available per core, but also has up to 16 GB of faster on-package memory on the CPU, a major advantage when it comes to handling certain scientific workloads.” The on-package memory provides up to 5x the bandwidth of the DRAM, allowing codes or routines that are limited primarily by memory bandwidth to be accelerated.
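The per-cycle figures above can be reproduced with simple arithmetic. The sketch below is illustrative only. The breakdown into vector units, lanes and operations per lane is an assumption about how the 32 and 8 figures arise (512-bit vectors holding eight doubles with fused multiply-add on KNL; 256-bit vectors holding four doubles with separate add and multiply pipes on Ivy Bridge). The function names, the normalized bandwidth units and the data volume are hypothetical.

```python
def flops_per_cycle(vector_units, lanes, flops_per_lane):
    """Peak double-precision operations one core can issue per cycle."""
    return vector_units * lanes * flops_per_lane


# Assumed KNL core: two vector units, 8 doubles per vector,
# fused multiply-add counted as 2 operations per lane.
knl_core = flops_per_cycle(vector_units=2, lanes=8, flops_per_lane=2)

# Assumed Ivy Bridge core: one add pipe plus one multiply pipe,
# 4 doubles per vector, 1 operation per lane per pipe.
ivy_bridge_core = flops_per_cycle(vector_units=2, lanes=4, flops_per_lane=1)


def streaming_time(data_volume, bandwidth):
    """Time for a purely bandwidth-bound routine to stream its data once."""
    return data_volume / bandwidth


# Normalized units: DRAM bandwidth is 1.0, and the on-package memory
# is taken at the article's upper bound of 5x that value.
DRAM_BANDWIDTH = 1.0
ON_PACKAGE_BANDWIDTH = 5.0 * DRAM_BANDWIDTH

dram_time = streaming_time(100.0, DRAM_BANDWIDTH)
fast_time = streaming_time(100.0, ON_PACKAGE_BANDWIDTH)

print("KNL operations per cycle:", knl_core)
print("Ivy Bridge operations per cycle:", ivy_bridge_core)
print("Best-case bandwidth-bound speedup:", dram_time / fast_time)
```

The 5x figure is a ceiling: it applies only to routines whose run time is set entirely by memory traffic and whose working set fits in the 16 GB of on-package memory.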
Deslippe explains that although codes run in parallel across the cores and nodes of Hopper and Edison, many do not fully leverage the vector processing units on these machines because they have not been written to target vector parallelism at the level available on KNL.
“Historically, due to relatively small vector units and ability to gain performance via other channels, exploiting vector parallelism has not been critical for applications seeking to obtain reduced walltimes when moving to new NERSC systems,” Deslippe says. “However, by the time you get to Cori and KNL, you are looking at a factor of 8 advantage with vector processing, which catches our users’ attention. This fact is motivating developers to look at their code at a much deeper level to make sure they are exploiting the many levels of parallelism available in the system.”
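To see why a factor of 8 gets developers' attention, and why it only pays off when most of the code vectorizes, consider a rough Amdahl-style estimate. This is a sketch under stated assumptions, not a performance model of KNL: it assumes the vectorized part of a code runs exactly 8 times faster and everything else is unchanged. The function name and the sample fractions are hypothetical.

```python
def vector_speedup(vectorized_fraction, vector_width=8):
    """Amdahl-style speedup when only part of the run time vectorizes.

    vectorized_fraction: share of scalar run time spent in loops that
    the compiler or developer can vectorize (between 0.0 and 1.0).
    vector_width: assumed ideal gain for the vectorized share.
    """
    if not 0.0 <= vectorized_fraction <= 1.0:
        raise ValueError("vectorized_fraction must be between 0 and 1")
    scalar_part = 1.0 - vectorized_fraction
    vector_part = vectorized_fraction / vector_width
    return 1.0 / (scalar_part + vector_part)


# Hypothetical fractions, chosen only to show the shape of the curve.
for fraction in (0.0, 0.5, 0.9, 1.0):
    print(f"{fraction:.0%} vectorized: {vector_speedup(fraction):.2f}x")
```

Under this simple model, a code that is only half vectorized gains less than 2x, which is one way to read the push to examine code "at a much deeper level."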
NESAP and Code Modernization
In order to prepare its users to make the most of Cori’s capabilities, in the fall of 2014 NERSC launched the NERSC Exascale Science Applications Program (NESAP). In this collaborative effort, NERSC partners with code teams as well as library and tools developers to prepare for Cori’s manycore architecture.
NESAP selected 20 projects to collaborate with NERSC, Cray and Intel. The program provides these teams with access to early hardware, special training, and preparation sessions with Intel and Cray staff. In addition, eight of the 20 projects will be partnered with a postdoctoral researcher who will explore the computational science issues posed by manycore systems.
The project teams, guided by NERSC, Cray and Intel, are already conducting the extensive efforts required to adapt their software to take full advantage of Cori’s KNL processors. Deslippe said that among the challenges facing the NESAP teams is the need to experiment with KNL – find out what’s on the node; learn how to handle more cores and bigger vectors; and deal with the complexities associated with the new on-package high bandwidth memory.
Regarding the latter, he commented, “In the past there was one type of memory technology accessible to the CPU cores on each processor/socket. But with KNL you are dealing with memory domains with different characteristics. Additionally, the on-package memory can be configured either as user-allocatable memory or as a transparent cache. The challenge is to learn how to manage these multiple options and use them efficiently. Engineers at Intel and Cray have been helping us with this problem.”
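When the on-package memory is configured as user-allocatable memory, someone has to decide which data goes there; in the transparent cache configuration the hardware makes that decision. The sketch below shows one simple way to think about the first case. It is a hypothetical greedy heuristic, not a NERSC, Intel or Cray tool, and the array names, sizes and traffic figures are invented for illustration. Only the 16 GB capacity comes from the article.

```python
from dataclasses import dataclass

FAST_MEMORY_GB = 16.0  # on-package capacity quoted in the article


@dataclass
class Array:
    name: str
    size_gb: float     # footprint of the array
    traffic_gb: float  # hypothetical data moved per time step


def place_in_fast_memory(arrays, capacity_gb=FAST_MEMORY_GB):
    """Pick arrays for fast memory, highest traffic per GB first.

    Greedy and not guaranteed optimal; arrays that do not fit in the
    remaining space are skipped and stay in ordinary DRAM.
    """
    chosen = []
    remaining = capacity_gb
    ranked = sorted(arrays, key=lambda a: a.traffic_gb / a.size_gb, reverse=True)
    for array in ranked:
        if array.size_gb <= remaining:
            chosen.append(array.name)
            remaining -= array.size_gb
    return chosen


# Invented workload, for illustration only.
workload = [
    Array("wavefunctions", size_gb=10.0, traffic_gb=400.0),
    Array("lookup_table", size_gb=4.0, traffic_gb=200.0),
    Array("checkpoint_buffer", size_gb=12.0, traffic_gb=12.0),
    Array("halo_buffers", size_gb=3.0, traffic_gb=90.0),
]

print(place_in_fast_memory(workload))
```

In practice the traffic figures would come from profiling, which is part of what the code teams do with help from NERSC, Cray and Intel staff.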
The modernized code will be used to produce groundbreaking science that explores avenues that were never accessible before. Both the deployment of Cori and the advanced software under development represent major steps toward exascale supercomputing.
Deslippe points out that the NESAP code teams are not operating in a vacuum – NERSC is providing a wide range of resources. They include:
- A partner from NERSC’s Application Readiness team who assists with code profiling and optimization
- Access to Cray and Intel resources to help with code optimization
- Code testing, optimization, scaling and debugging on Edison
- Access to prototype Knights Landing processor hardware
- Early access and significant hours on the full Cori system
In addition, a subset of the NERSC and NESAP teams have access to KNL white boxes at Intel, allowing them to run code on preproduction models of the actual processors they will be working with on Cori.
The 20 projects cover a wide range of scientific and engineering research – some of them falling into the category of “Grand Challenges.” Primary project categories include:
- Advanced Scientific Computing Research
- Biological and Environmental Research
- Basic Energy Sciences
- Fusion Energy Sciences
- Nuclear Physics
For example, some of the projects cover global climate modeling and multi-scale ocean simulation. Others will explore large-scale chemical simulations and 3-D geophysical modeling of the earth. For a complete list of the projects that fall into the various categories, see http://www.nersc.gov/users/computational-systems/cori/nesap/nesap-projects/.
In addition, several dozen NESAP application teams will have access to NERSC training and early hardware.
Other related efforts are underway. A good example is Lawrence Berkeley National Lab’s collaboration with NERSC on optimizing the NWChem code for the Cori supercomputer. You can read more about these extensive code modification efforts in Linda Barney’s article on LBNL, PNNL and quantum chemistry research.
Deslippe says that NERSC expects Cori to be available later this year. In the meantime, his team is developing and testing code on early Knights Landing test boxes. The team found that, although the transition to an advanced machine like Cori can be disruptive in terms of programming, one positive outcome was that modified codes also performed better on traditional architectures.