BSC Builds 21st Century HPC In A 19th Century Cathedral
October 23, 2017 Rob Johnson
This summer, the Partnership for Advanced Computing in Europe (PRACE) added to its roster another of the world’s most powerful high performance computing systems. The Barcelona Supercomputing Center’s (BSC) new MareNostrum 4, delivered by IBM with the help of partners Lenovo and Fujitsu, and fueled by HPC technologies from Intel, will facilitate extensive engineering and scientific research in fields like astrophysics, weather forecasting, and genome research. Nestled within a unique building – the Torre Girona chapel, which fell out of use – the fourth-generation MareNostrum system relies on a general purpose cluster working with three specialized clusters to achieve its overall performance capacity of 13.7 petaflops.
In addition to joining the other PRACE HPC systems located in Italy, France, Germany and Switzerland, the MareNostrum 4 supports the Spanish Unique Scientific and Technical Infrastructure network as part of the Spanish Supercomputing Network.
As a government-funded project, the MareNostrum 4 offers HPC access for innovative researchers, government agencies, and others across Spain who require the greatest supercomputing power for demanding scientific endeavors. Demonstrating its commitment to PRACE, to enabling researchers, and to a vision for the future, the Spanish Council of Ministers in 2015 authorized a budget exceeding €34 million for the planning and implementation of the MareNostrum 4. The budget accommodates not only the latest technology inside the racks of computing systems, but also the particular electrical and cooling requirements, and the parallel disk system necessary to support such an expansive HPC implementation. Wherever possible, the MareNostrum 3 components will be re-purposed into the new configuration for budget efficiency. Since July 2017, the MareNostrum 4 has been available for priority science and research projects, free of charge. The system will help promote science across Europe, and each project is subject to peer review, encouraging both collaboration and the best possible results.
Today, the Torre Girona chapel’s classic architecture, including the original pillars and arched ceiling designs, remains intact. However, its infrastructure has been reconfigured to accommodate the requirements of the world-class HPC system under its roof. Asked about the challenges of housing one of the world’s most powerful HPC systems in a historic building, Sergi Girona, BSC’s operations director, described the endeavor. “The MareNostrum supercomputing infrastructure requires 160 square meters, with racks of computing nodes stacked two meters high. The beautiful chapel housing it was originally designed to help move people in and out of the building, not to contain an enormous supercomputing system. We enjoyed the creative challenge of technical retrofitting to meet power, cooling, and space requirements while preserving the historic structure.”
Because the BSC team and the MareNostrum 4 system will host varying engineering and science-based endeavors, the technical infrastructure must be generalized and adaptable for diverse project types. BSC relies on the latest technologies to meet that goal and plans future versions of MareNostrum. “It is critical that our vendors understand the importance of our work, our requirements, and the urgency under which we operate. Speed is everything in our business,” notes Girona. “It is also important for them to provide us briefings about upcoming technologies so we can better prepare and plan for implementation in the next-generation MareNostrum system.”
By using the newest technologies, such as the Xeon SP processors and Omni-Path interconnect, BSC can offer its clients the flexibility for unique assignments. To achieve an overall performance capacity of 13.7 petaflops, more than twelve times that of the MareNostrum 3, the new system relies on a general purpose cluster working in tandem with three specialized groupings.
Delivered by Lenovo, the general purpose element delivers 11 petaflops through 48 racks holding 3,456 nodes with Intel Xeon SPs. Meeting BSC’s specific requirements called for developing a technology combination never before used in a supercomputing system. The configuration combines 48-port Omni-Path edge switches, a dual hot-swap power supply unit, and storage solutions from IBM. With these elements in place, the general purpose cluster of the MareNostrum system can take full advantage of the Omni-Path fabric to maximize throughput. Each of the 3,456 nodes uses several Intel technologies, including a pair of Xeon SP processors (Platinum 8160s, to be precise), 96 GB of memory, a 240 GB S3520 SSD flash drive, and a 100 Gb/s Omni-Path host fabric interface. The interconnect topology comprises six Omni-Path director class switches and 144 Omni-Path edge switches.
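As a rough sanity check on those figures, the short Python sketch below works through the arithmetic for the general purpose cluster. The node, rack, socket, and switch counts come from the description above. The per-processor numbers (24 cores, a 2.1 GHz base clock, and 32 double-precision operations per cycle from two AVX-512 FMA units) are assumptions based on the published Platinum 8160 specifications, not figures from BSC, and the function name is purely illustrative.

```python
# Figures stated in the article for the general purpose cluster.
NODES = 3456
RACKS = 48
SOCKETS_PER_NODE = 2
EDGE_SWITCHES = 144
EDGE_SWITCH_PORTS = 48
TOTAL_SYSTEM_PFLOPS = 13.7

# ASSUMPTIONS (not stated in the article): Xeon Platinum 8160 characteristics.
CORES_PER_SOCKET = 24        # assumed core count per processor
BASE_CLOCK_GHZ = 2.1         # assumed base frequency
DP_FLOPS_PER_CYCLE = 32      # assumed: 2 FMA units * 8 doubles * 2 ops


def general_purpose_summary():
    """Derive density, core count, and nominal peak from the figures above."""
    cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
    # Nominal peak = cores * cycles per second * operations per cycle.
    peak_pflops = cores * BASE_CLOCK_GHZ * 1e9 * DP_FLOPS_PER_CYCLE / 1e15
    return {
        "nodes_per_rack": NODES // RACKS,
        "cores": cores,
        "peak_pflops": peak_pflops,
        "share_of_system": peak_pflops / TOTAL_SYSTEM_PFLOPS,
        # Only meaningful if nodes are spread evenly across edge switches.
        "nodes_per_edge_switch": NODES // EDGE_SWITCHES,
        "spare_ports_per_edge_switch": EDGE_SWITCH_PORTS - NODES // EDGE_SWITCHES,
    }


if __name__ == "__main__":
    for name, value in general_purpose_summary().items():
        print(f"{name}: {value}")
```

Under these assumptions the nominal peak comes out close to the 11 petaflops cited for the general purpose cluster, which suggests the quoted figure is a theoretical peak rather than a measured result. If the nodes are distributed evenly, each 48-port edge switch would serve 24 nodes, leaving 24 ports available for uplinks to the director class switches. The article does not describe the actual cabling.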
The three specialized clusters deliver the remaining throughput, requiring integration of many components supplied by key partners. Each of these cluster configurations has a proven track record of success at other locations meeting the requirements of high-demand supercomputing scenarios:
- Like those components proposed for the Summit and Sierra supercomputers at Oak Ridge National Laboratory and Lawrence Livermore National Laboratory, the first cluster relies on IBM Power9 processors, Nvidia Tesla GPU accelerators, and Mellanox InfiniBand fabric.
- A second cluster, delivered by Lenovo, will depend on current and future generations of Intel’s Xeon Phi processors, like those specified for the Theta and the Aurora supercomputers at Argonne National Laboratory.
- The third cluster, incorporating technologies from Japan’s Post-K supercomputer, depends on ARMv8 processors supplied by Fujitsu.
Girona describes the hurdles faced by his team: “At all times we must maintain the latest and greatest platform for our clients who depend on the MareNostrum’s capabilities. We support a different compute-intense scientific project every four to six months, and with increasing HPC system speed, we can help our clients move more quickly from project planning to gathering critical data, to peer review. Since there are so many important scientific projects needing supercomputing support, we want to offer HPC access to as many innovators as possible. A faster system means a shorter wait for the next project in line for the MareNostrum. We are proud that our team has an important role enabling advanced scientific knowledge in Europe.”
The BSC team takes seriously not only the technical prowess of the MareNostrum 4 system, but also its aesthetics. As with previous versions of the MareNostrum, BSC will continue its commitment to an open-door policy, offering public access to anyone interested in seeing and learning about the new system. A specially designed glass enclosure surrounds the MareNostrum, allowing visitors an up-close look at one of Europe’s most powerful systems. As noted by Girona, “Our team paid careful attention to every detail of the MareNostrum’s implementation. The computing system’s architecture and optimization are critical for ideal performance. However, our team also takes the little things seriously – like the blue Ethernet cables which match the chapel’s interior. BSC receives over 10,000 guests each year. That number includes not only scientists, engineers, and government officials, but also students and families who are simply curious about supercomputing.” With a chuckle, he elaborates, “Everyone is welcome here, and who does not want his or her house to look perfect for guests?”
The November 2016 Top 500 supercomputer ranking placed the MareNostrum 3 system at number 129 with nearly 49,000 computing cores delivering over a petaflop at peak performance. Now in production, BSC’s next generation MareNostrum 4 system moved up to number 13 on the Top 500’s June 2017 list. Asked about performance expectations for the MareNostrum 4, Girona grins. “We are pleased with the performance today, and we are always eager to explore additional ways we can optimize the system to eke every bit of performance from it.”
“The BSC team is committed to maximizing the MareNostrum in any way we can,” says Girona. “But the MareNostrum is not about us. Our purpose at BSC is helping others. BSC is successful when the scientists and engineers borrowing the MareNostrum system’s computing power get all the data they need to further their discoveries. We want them to keep coming back to us with other new projects. We are always enthusiastic about BSC’s role assisting breakthrough endeavors, and it is always rewarding to know we help others to further cutting-edge scientific exploration.”
Rob Johnson spent much of his professional career consulting for a Fortune 25 technology company. Currently, Rob owns Fine Tuning, LLC, a strategic marketing and communications consulting company based in Portland, Oregon. As a technology, audio, and gadget enthusiast his entire life, Rob also writes for TONEAudio Magazine, reviewing high-end home audio equipment.