The Impending AMD Milan Versus Intel Ice Lake Server Showdown

What a strange server CPU world we live in. The dozen or so biggest customers in the world command something on the order of 45 percent of server CPU shipments, but a significantly lower share of the revenue because of the volume discounts they can command, and they not only shape the product rollouts, but their opinions can also kill off processor SKUs long before we even know about them on announcement day.

In fact, whatever epic battle we might see between AMD’s “Milan” Epycs and Intel’s “Ice Lake” Xeon SPs was already fought months ago, and we are still several months away from the formal launches of these two processor lines. Each is in the third generation of its family, and each is being launched in the shadow of a global pandemic and a much-diminished Moore’s Law that makes advances in process technology, and therefore design, more challenging than we have ever seen before.

It is also strange, but increasingly common, to see AMD and Intel making statements about their server processors at the Consumer Electronics Show. But CES is the first big IT event of the year, and it gives both companies a chance to raise the curtain a little bit more on what they are up to on the server front. We take the data, such as it is, where we can get it.

We have covered the delays in Intel’s 10 nanometer processes sufficiently in the past, and the consequent rejiggering of the Xeon SP processor roadmap, and are not going to drag that all out today. Intel has a new core architecture called “Sunny Cove” that is finally being brought to market, and as we reported back in August, it will have at least 28 cores. As we said back then, we believe that Ice Lake was always going to top out at 28 cores and that there is an extra UltraPath Interconnect (UPI) port on the die to allow for two processors, and therefore 56 cores, to be crammed into one socket, and an extra PCI-Express controller as well to balance out the compute and I/O better.

In that sense, the “Cascade Lake-AP” Xeon SP 9200 series processors that were announced in late 2019 were a prototype for what we suspect will be an Ice Lake-AP. And now that we think on it, we think that there is a good chance that Intel will also create 10-core LCC variants and 18-core HCC variants of Ice Lake to complement the 28-core XCC variants and allow dual-chip modules (DCMs) to be created from these. And if not, the question we will have right off the bat is: Why not? Aside from the performance improvements that will come with the new Sunny Cove core and the shrink to 10 nanometers (allowing for higher clock speeds if the core counts are the same as with Skylake, Cascade Lake, and Cooper Lake Xeon SPs), Intel is also rolling out new security extensions in the processors, which we talked about in detail last October.

Intel didn’t say much about Ice Lake Xeons during its keynote at CES 2021, mind you. In a keynote address spanning all kinds of processor announcements, Gregory Bryant, who is executive vice president and general manager of the Client Computing Group at Intel, said that volume production of the Ice Lake Xeon SPs had begun and was ramping throughout the first quarter of this year. Bryant said that Ice Lake “delivers significant increases in core count, performance, integrated AI, and security features across a wide variety of workloads,” and we think that extra performance is going to come in the way we have outlined above: some from the Sunny Cove core and some from DCMs. Bryant added that Intel would be talking about Ice Lake Xeon SPs “in the coming months.”

That sounds to us like Ice Lake will be launched in March, when many Xeons have been launched in the past. But the initial Cascade Lake Xeon SPs came in April 2019, and the tweaked Cascade Lake Xeon SPs came in February 2020 as AMD was getting traction with Rome Epycs. Intel’s Ice Lake launch timing will depend on when it believes AMD will launch the 64-core Milan Epyc server chips. And vice versa.

Lisa Su, president and chief executive officer at AMD, gave a keynote at CES 2021 as well, and spent most of the time talking about client devices, as you might expect. But at the end, Su tossed in a few details on the upcoming Milan Epyc 7003 server CPUs, which are based on the Zen 3 cores already deployed in the Ryzen client processors and which pack up to 64 cores in a socket like the current “Rome” Epyc 7002s based on the Zen 2 cores, saying they would “reset the bar for datacenter computing” and “extend AMD’s leadership in performance, total cost of ownership, and security.”

To give a sense of how the Milan chips will stack up, Su pulled up some benchmark test results pitting the Cascade Lake Xeon SPs against the Milan chips running the Weather Research and Forecasting (WRF) climate and weather simulation code created in the late 1990s by the National Center for Atmospheric Research, the National Oceanic and Atmospheric Administration, the Air Force Weather Agency, a number of other government agencies, and the University of Oklahoma – a state with a lot of tornadoes. The model is used in over 185 countries around the globe for weather prediction, and it is a heavy workload indeed.

The test that AMD ran used a single two-socket server, which is not really how WRF is run in production, and it was specifically doing a weather forecast for the continental United States. A “huge set of weather data,” as Su put it, is pulled into these machines to produce a six-hour weather forecast. If you look carefully at Su’s presentation, the model is run at 2.5 kilometer resolution, which is pretty fine-grained; only a few years ago, production simulations were running at between 9 kilometers and 13 kilometers.

AMD pitted its Milan machine against a box with a pair of 28-core Xeon SP-6258R Gold processors, which have clocks running at 2.7 GHz and which cost $3,950 a pop when bought in 1,000-unit trays from Intel. This is a 205 watt part, and it is the top bin of the revised Cascade Lake lineup from last February.

AMD did not specify the clock speed or thermals of the pair of Milan Epyc 7003 series chips used in the WRF test, but it did use 32-core variants to make a more apples-to-apples comparison. What we do know is that the AMD machine did the six-hour forecast in an unspecified amount of less time, which Su said equates to 68 percent higher performance. (In the demo clip, the Epyc machine appeared to finish in roughly a third of the time, about 3.2X faster, though the video could have been sped up for the presentation. If we were giving this presentation, we would have cited the wall time for each run of the simulation.) Because time to forecast and high resolution are the whole point of weather forecasting, that saved time matters: forecasters can run larger models, run finer-grained models, do some mix of the two, or do more model runs with slightly different initial conditions to build bigger ensembles.
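To keep the percentages in this debate straight, here is a quick back-of-the-envelope sketch (our own arithmetic, not AMD’s disclosed numbers) showing that “X percent higher performance” and “X percent less time” are not the same claim:

```python
# "X percent faster" versus "X percent less time" are different claims,
# and conflating them is exactly the trap in these benchmark charts.

def time_saved_from_speedup(speedup):
    """Fraction of wall time saved given a throughput speedup (1.68 = 68% faster)."""
    return 1.0 - 1.0 / speedup

def speedup_from_time_saved(saved):
    """Throughput speedup implied by a fractional wall-time reduction."""
    return 1.0 / (1.0 - saved)

# 68 percent higher performance (1.68X) only cuts wall time by about 40 percent...
print(round(time_saved_from_speedup(1.68) * 100, 1))  # ~40.5

# ...while finishing in 68 percent less time would imply a 3.125X speedup.
print(round(speedup_from_time_saved(0.68), 3))  # ~3.125
```

In other words, a chart claiming 68 percent higher performance and a demo that appears to finish in a third of the time are describing two different speedups, which is why citing the raw wall times would have settled it.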

We suspect that AMD could have gotten close to the same performance on WRF with a single-socket server using one 64-core Milan chip, though the clocks would probably have been slower and the delta would not have been as large as the 68 percent shown. But the cost per unit of compute might be lower. And two 64-core Milan chips, as Su pointed out, will beat the tar out of a pair of Cascade Lake Gold chips. It is hard to say where a pair of Ice Lake single-chip modules (SCMs) would come in, or how a pair of Ice Lake DCMs might stack up against a pair of top-bin Milan chips.

This will all be clearer once the Ice Lake and Milan chips are announced and we can drill down into the feeds, speeds, slots, watts, and bucks. We look forward to it. Eagerly. And we fully expect that AMD will compete well on performance and absolutely lower the boom on pricing.


12 Comments

  1. Intel presented more Ice Lake Server info back at HotChips 2020. They moved up to 8 vs 6 channels of DDR4, and it is higher speed than previously. They moved from PCIE3 to PCIE4. They moved from 14nm to 10nm to reduce power.

    Looks like they are saving the big changes for Sapphire Rapids … 10ESF process, Golden Cove cores, DDR5, PCIE5/CXL, AMX tiled matrix operations. I don’t believe Milan has any of these features, and still no avx512, dlboost, optane DIMM support on Milan chips.

    Intel says they have been sampling Sapphire Rapids since q4 of 2020.

    • Sure but Milan and Sapphire Rapids aren’t a proper comparison. Milan doesn’t have any of those features but Genoa has most of them. Sapphire Rapids and Genoa should launch in around the same timeframe, just like Ice Lake and Milan are.

      • Exactly. The AMD and Intel roadmaps are not 100% aligned timewise but AMD Milan is contemporary to Intel Ice Lake and AMD Genoa will lineup against Intel Sapphire Rapids. Genoa and Sapphire Rapids will be broadly similar in platform support for DDR5 and PCIe5. There might be some detail differences with in-built support for AI/ML instruction sets vs. denser compute capacity that Genoa will have. One could argue that using up large amounts of transistor budget on extended instruction sets would be better spent on denser compute and being able to efficiently run AI/ML workloads on dedicated accelerators. A lot of avx512 and dlboost transistors are going to spend their time unused for large amounts of workloads. Eventually the data center is going to migrate to disaggregated compute, memory and storage so the on chip AI/ML and local Optane may only be a temporary waypoint.

        • “One could argue that using up large amounts of transistor budget on extended instruction sets would be better spent on denser compute and being able to efficiently run AI/ML workloads on dedicated accelerators.”

          The prior Xeons did not implement pcie4. With the PCIE4 enhancement in Ice Lake, I don’t see a disadvantage vs Milan in running workloads on dedicated accelerators for the recent designs with GPUs on the same board as the CPUs.

          The Ice Lake Servers do have the advantage of Optane DIMM support. I’ve seen close access to this much memory reported as a big advantage for in-memory database designs. I see this as the main feature distinguishing Milan and Ice Lake Server, since the PCIE4, eight-memory-channel, and 14nm power-draw differences have been eliminated.

  2. If the AMD machine did the six-hour forecast in 68 percent less time, that would make it not 68% faster but 1/(1-0.68), or about 3.1X as fast.
    I believe that 68% faster is the correct figure, in which case it did it in 1-(1/1.68), or 40% less time.

  3. “What we do know is that the AMD machine did the six-hour forecast in 68 percent less time, which equates to 68 percent higher performance”

    I have to disagree here. 68% less time equates to about 212% higher performance (== 3.125 times the performance). Think about it – the job was completed in roughly 1/3 (32%) of the time. If it was to mean 68% higher performance, then it should have finished in 40.5% less time.

    • Yup. Just moving too fast. The issue is they did not give the wall time even though that was what they were measuring and then just flashed up a chart that said “68% higher performance.”

      The whole thing as shown took less than a minute, but the simulation could have been sped up for the presentation. If it was 60 seconds, then the other machine would have completed in around 19 seconds, and that is 3.2X faster.

  4. Your assumption about the speed is wrong: if a car goes 100 km/h and another is 10% faster, it goes 110 km/h. Their statement was about speed, so 68% faster means 1.68X, which also makes sense given the core counts and Zen 3 core performance versus Cascade Lake.

  5. Now that both Milan and Ice Lake are available, I can’t wait to see a true head-to-head benchmark comparison. Any idea when we will see the data?
