Supercomputing Strategy Shifts in a World Without BlueGene
April 14, 2015 Nicole Hemsoth
Last week when we broke the news about the upcoming 180-plus petaflops Aurora supercomputer set to be installed at Argonne National Lab in 2018, we spent our time scrambling for whatever feeds and speeds we could muster.
In the process, a sidebar news item emerged about another system at the same site: an 8 petaflops future-generation Cray XC machine (we assume not an XC40, but rather “Shasta”) outfitted with next-generation Knights Landing chips and scheduled for 2016 delivery. This will serve as a large testbed cluster in advance of Aurora, in part to help the Argonne teams make an architectural leap out of their comfort zone, a leap they are making by necessity. Now that IBM has formally, but quietly, moved on from its massively parallel BlueGene system, IBM-centric labs like Argonne are left in the cold, and after that sort of abrupt vendor exit, it stands to reason that they would look beyond Big Blue to support their new supercomputing ambitions.
Argonne has been a well-known consumer and supporter of IBM BlueGene systems since the very beginning, starting with the formation of the BlueGene Consortium with IBM in 2004 and continuing through successive generations of the system up to Mira, a leading-class BlueGene/Q that went into production in 2013. And as one might imagine, it’s not easy for the teams at Argonne who have worked on BlueGene machines for a large part of their careers to watch the line fade into darkness in favor of the push for OpenPower.
To emphasize, the one center that was historically IBM-focused made the distinct decision to pull in a new direction via the newly announced Intel and Cray deal, while the other two centers that were part of the CORAL procurements went with machines sporting the OpenPower blend of Nvidia GPUs and upcoming Power9 processors. To be fair, that combination, linked by InfiniBand networks, is the logical successor to BlueGene in IBM’s view. But more than a decade of nit-picking and fine-tooth-combing went into tuning for the BlueGene architecture at Argonne.
“We were very sad, for sure,” says Bill Allcock, director of operations at Argonne Leadership Computing Facility. “And so were a lot of people, including most of the BlueGene design team, who if you’ll notice, left IBM shortly after and went to a competitor. We just loved the design of the BlueGene. I still can’t know what went on inside of IBM and I’ve worked with them for a very long time, but this smacks to me of a political decision, this move to everything as Power, there can be no other explanation.”
It’s fair to assume that competitor is Intel, which is now home to former BlueGene innovators, including Alan Gara, architect of three of the BlueGene systems. Ironically, Gara was one of our primary sources for an inside look at the future Aurora system at Argonne from an Intel point of view.
With BlueGene buried, Argonne is looking to new partnerships with Cray and Intel to power its next generation of applications and its internal work on tuning large systems for energy efficiency and reliability. The early work begins with Theta, the 8 petaflops Cray testbed, before hitting full production swing with Aurora. This means a steep learning curve for groups at Argonne that have tuned for BlueGene over the years, and that still plan on using some of the key operational tools that were optimized for IBM systems.
“The L to P to Q on the Blue Gene were evolutionary changes, they were changes, but not significant. Jumping now to the Knights series of Intel processors for Theta means an early chance for people to start porting their codes to that architecture, which we hope will make moving to Aurora much easier.”
Unlike other centers that are investing in new Knights architectures from existing X86 infrastructure, however, Argonne has more legwork to do at all stages, which we will get to in a moment. But more broadly speaking, it is in for the same scary stages of pre-exascale computing that the other centers are facing. As Allcock puts it, “as we start getting tens to hundreds and then thousands eventually of cores on a single processor, this is when time is really going to tell. Now, this isn’t my problem so much as a systems software developer as much as it is an applications problem, but how will we take advantage of all these cores? That’s what we are asking.”
Allcock says that there will be a learning curve ahead from a system software perspective as they look at, for example, 72 cores, which will need to be carved up to run codes and computations as well as work to move data. “Almost certainly, the future holds NVRAM for us. So there will be cores that are doing nothing but staging in and out of NVRAM, so power is going to be a huge concern. It’s not a technical thing necessarily, but when you start thinking these machines are consuming in the teens of megawatts, you can’t really afford to run a machine much more than that. We’re going to have to start doing some crazy things on the hardware side, which is up to the vendors there with our help, but really, we need more control over what we can turn on and off. There’s a lot of work that needs to happen there.”
And this hits on a specific problem for Argonne, which, again, has built much of its system software stack around BlueGene. The tools Argonne uses for an important function, resource management and scheduling, are built with hooks that let Argonne easily tune BlueGene and power it down and up according to need. This was a key to Argonne’s efficiency, but it will now require extensive retuning to support a new architecture.
The middleware in question is called Cobalt, and it has been alive and kicking on BlueGene systems since the early 2000s, picking up refinements along the way to become a very customizable, accessible framework for handling everything from tuning to reporting, monitoring, and overall cluster management, one that Allcock says they’re hard pressed to find in a commercial offering.
His teams are going to use the experimental Theta cluster (which will be a useful production machine, not just an 8 petaflops science project) to work with Cray to get the same functionality they’re used to on a new architecture. Cray already has some job launching and management tools built into its stack in the ALPS software, but it does not provide the flexibility in launching and tuning jobs based on specific energy efficiency and performance requirements that Argonne has grown accustomed to. Argonne’s work with Cray during the rollout of Theta in 2016 will include a great deal of collaboration to ensure the lab has the knobs and buttons it needs, and that these are all ready by the time Aurora comes online a few years later.
The deal was just announced, and although the ink has yet to dry, Allcock and his team are cooking up ways to bring the best of the BlueGene world to the future Cray and Intel systems that will play a big role at Argonne going forward. The architectural leap creates new work across the stack, from system administrators who were weaned on BlueGene to application developers who are going to have to reoptimize and, in some cases, rethink how their codes can run efficiently. After all, for a lab used to running its machines at an 80 percent utilization rate, the bar is set high.
“Even though it sounds like a lot here, we don’t have to start from scratch. We will take what works and modify it as we go. Cobalt, and really the new machines, have many strong points,” Allcock concluded.