Good news is continuing to gather around AMD’s second-generation “Rome” Epyc processors. The company’s newly minted Epyc 7H12 chip has established new high-water marks in four SPECrate benchmarks, as well as in High Performance Linpack (HPL). The results were reported by Atos, which ran the benchmarks on a dual-socket server of its BullSequana XH2000 supercomputing platform.
That officially makes the 64-core 7H12 AMD’s new top-of-the-line Epyc CPU. It also explains why the chipmaker is aiming the silicon squarely at HPC customers, especially those who want flat-out performance and are willing to pay the electric bills to get it. As we reported last week, the clock on the Epyc 7H12 has been cranked up to 2.6 GHz, a 15.6 percent increase over the 2.25 GHz Epyc 7742 processor. At 280 watts, it has the thermal profile of a high-end datacenter GPU, and unlike those GPUs, it requires liquid cooling.
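For readers who want to check the arithmetic, here is a minimal sketch of the clock-uplift figure quoted above, using only the numbers from this paragraph:

```python
# Quick check of the clock uplift quoted above (figures from this article).
base_7742 = 2.25   # GHz, Epyc 7742 base clock
base_7h12 = 2.60   # GHz, Epyc 7H12 base clock

uplift = (base_7h12 - base_7742) / base_7742
print(f"Base clock uplift: {uplift:.1%}")   # -> 15.6%
```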
The benchmark results for the dual-processor XH2000 are as follows:
- SPECrate2017_int_base: 692
- SPECrate2017_int_peak: 754
- SPECrate2017_fp_base: 528
- SPECrate2017_fp_peak: 586
- HPL: 4.296 teraflops
The SPECrate results are a few percent higher than what AMD’s second-generation Epyc processor-based systems achieved in August. However, the HPL result was about 11 percent higher than that of the 2.25 GHz Epyc 7742, AMD’s previous top-of-the-line processor. Intel reports that a dual-processor setup using its top-end Xeon 8180 (Skylake) processor delivers 3.238 HPL teraflops, which means the 7H12 delivered about 33 percent more flops. The newer Xeon 8280 (Cascade Lake) might be somewhat speedier, but only marginally so.
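As a rough sanity check on the Linpack number, here is a back-of-the-envelope sketch of the dual-socket 7H12’s theoretical double-precision peak and the implied HPL efficiency. The per-cycle FLOP count and the assumption that the run holds roughly the 2.6 GHz base clock are ours, not Atos’s or AMD’s, so treat the efficiency figure as an estimate:

```python
# Back-of-the-envelope check on the dual-socket 7H12 Linpack result.
# Assumptions (ours, not from the benchmark report): each Zen 2 core retires
# 16 double-precision FLOPs per cycle (two 256-bit FMA pipes), and the run
# sustains roughly the 2.6 GHz base clock.
cores = 2 * 64            # two 64-core sockets
clock_ghz = 2.6           # Epyc 7H12 base clock
flops_per_cycle = 16      # assumed DP FLOPs/cycle/core for Zen 2

rpeak_tf = cores * clock_ghz * flops_per_cycle / 1000   # theoretical peak, TF
rmax_tf = 4.296                                         # measured HPL, TF (Atos)
xeon_8180_tf = 3.238                                    # Intel's dual-socket 8180 figure

print(f"Theoretical peak: {rpeak_tf:.2f} TF")                 # ~5.32 TF
print(f"HPL efficiency:   {rmax_tf / rpeak_tf:.0%}")          # ~81%
print(f"7H12 vs. 8180:    {rmax_tf / xeon_8180_tf - 1:.0%} more HPL throughput")  # ~33%
```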
Of course, SPECrate and HPL are artificial benchmarks. It’s too early to tell how well the 7H12 will perform on real applications in the field on systems that scale to hundreds or even thousands of nodes. However, according to Scott Hamilton, solutions architect for quantum and high performance computing at Atos North America, the company has run WRF weather simulation models and some of the top computational fluid dynamics packages on XH2000 systems using the Epyc 7H12 and is seeing significant performance advantages over Intel processors on those applications as well. At least some of that, no doubt, is the result of the superior memory bandwidth of the Epyc architecture, but now that the chip also appears to hold a decided advantage in raw number-crunching ability, we’re apt to see a much wider range of HPC codes where AMD parts excel.
As far as the benchmark tests go, Hamilton told us that the Epyc 7H12’s record-breaking performance was made possible by the BullSequana XH2000’s direct liquid cooling system, which was able to keep the 280-watt CPU cool and collected while running the integer and floating point hardware at full tilt. Atos offers air cooling on the XH2000 as well, but in that case, Hamilton estimates the 7H12 would sacrifice about 25 percent of its performance, since the chip would automatically throttle down its clock when it got too hot.
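Taking Hamilton’s 25 percent estimate at face value, a trivial extrapolation (ours, not a measured result) of what an air-cooled 7H12 pair might have scored on the same Linpack run:

```python
# Naive extrapolation from Hamilton's ~25% air-cooling penalty estimate;
# not a measured figure, just the article's numbers scaled.
liquid_cooled_hpl_tf = 4.296   # measured on the liquid-cooled XH2000
air_cooling_penalty = 0.25     # Hamilton's rough estimate

print(f"Estimated air-cooled HPL: {liquid_cooled_hpl_tf * (1 - air_cooling_penalty):.2f} TF")  # ~3.22 TF
```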
The combination of the chip’s fast clock and the Atos system’s efficient liquid cooling enabled the 7H12 to, as Hamilton put it, “outperform everything on the market today.” The possible exception to this would be Intel’s Cascade Lake-AP (Advanced Performance) processor, aka the Xeon 9200 series, which effectively glues two Cascade Lake-SP processors into a single package. For what it’s worth, Intel reports a dual-socket Xeon 9282 delivers 6.411 teraflops on HPL. So, there’s that.
However, as we explained back in April when the Xeon 9200 product line was launched, Intel is only selling the Xeon AP chips shrink-wrapped on its own motherboards. In addition, since these chips can dissipate as much as 400 watts and presumably have sky-high pricing to match, they really are in a different league than standard Xeon and Epyc chips.
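For what it’s worth, using only the HPL results and CPU power figures cited in this article (and ignoring memory, fans, and the rest of the system), a crude Linpack-per-CPU-watt sketch looks like this; it is illustrative arithmetic, not a measurement:

```python
# Crude HPL-per-CPU-watt comparison using only the figures cited in this article.
# System-level power (memory, VRMs, fans, cooling) is deliberately ignored.
systems = {
    "2x Epyc 7H12": {"hpl_tf": 4.296, "cpu_watts": 2 * 280},
    "2x Xeon 9282": {"hpl_tf": 6.411, "cpu_watts": 2 * 400},
}

for name, s in systems.items():
    gf_per_watt = s["hpl_tf"] * 1000 / s["cpu_watts"]
    print(f"{name}: {gf_per_watt:.1f} GF per CPU watt")
```

On that crude metric the two land within a few percent of each other; the practical differences remain the packaging, cooling, and pricing issues described above.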
That said, the Epyc 7H12 processor burns 30 more watts than the PCIe version (non-NVLink) of Nvidia’s V100 Tesla GPU, so it requires attentive system engineering as well if it’s to be deployed in servers destined for conventional datacenters. That is probably why most, if not all, of them will end up in liquid-cooled systems like the XH2000.
In fact, most 7H12 processors will probably be deployed at supercomputing centers of one sort or another. Atos, for example, is currently installing them in three large XH2000 systems: a 9.4 petaflops supercomputer at the French national high-performance computing organization GENCI, a 7.5 petaflops* machine at the IT Center for Science (CSC) in Finland, and a 5.9 petaflops system at the national Norwegian e-infrastructure provider Uninett Sigma2.
Those are likely to be a preview of what’s to come. Hamilton told us that a year ago, bidding AMD CPUs on a supercomputer deal was risky, since the first-generation Epyc chips were still viewed with some skepticism. As a result, most of these RFPs only referred to Intel Xeon parts. “Today what I’m seeing is that customers are open to AMD being included in the bid process and are actually calling out the Epyc processor as an option,” he said. “I would say just from that alone, AMD is gaining a foothold in the market.”
And although pricing has not been made public for the new Epyc 7H12 processors, AMD has consistently undercut Intel with regard to price/performance in its Epyc line. “For the most part, I would say we can get more performance per dollar out of the AMD processors compared to the Intel processors,” said Hamilton. “And that’s pretty much across the board right now.”
[*Editor’s note: The original 6.4 petaflops peak performance figure we cited for the Finnish system has since been updated to 7.5 petaflops.]
“The company’s newly minted Epyc 7H12 chip has established new high-water marks in four SPECrate benchmarks, as well as in High Performance Linpack (HPL).”
It sure is, but at the cost of water cooling instead of passive air cooling. And that H, as opposed to the usual all-numeric naming, must be some sort of shorthand for high performance computing. Water cooling is nice for single graphics workstation usage as well, if only to keep the fan noise to a minimum.
I’m still waiting for Apple to switch that new Mac Pro 2019 edition from Xeon to Epyc (maybe by Zen 3 Epyc/Milan), and for there to be that Frontier-like Zen 3 based Epyc/Milan (custom variant) with Infinity Fabric interfacing between the CPU and GPUs (four per processor) instead of only PCIe 3/4, with that option available to a wider market than just supercomputers. That custom Epyc/Milan variant on Frontier would be competition against Power9/Power10 and Nvidia’s NVLink-interfaced GPU accelerator products. But those sorts of government-sponsored contracts do see that IP getting out and used in the wider market as well.
And that’s a higher base clock than the 7742, but at a lower achievable boost clock. It’s the achievable all-core clock rate that really gets the work done for HPC workloads, not some abstract limited-core boost clock numbers that have more meaning in the consumer space than in HPC.
Maybe they can run some HWRF models as well, since there’s been more hurricane activity as we move past the peak (September 10th) of the Atlantic hurricane season this year, and some crazy bad experiences for some islands over the past three or four hurricane seasons.
The size of the system for GENCI is 12.2 PF; 9.4 PF is the peak performance of the first partition of this system, which is based on Intel SKL and KNL processors.
Thanks
Stéphane