The Next Platform

The Battle For Enterprise Compute Begins In The Cloud

If the hyperscalers are a crystal ball in which we see the far-off future of compute, storage, and networking writ large and ahead of the mainstream, then the public cloud builders are a mirror in which we see the more immediate needs and desires of enterprises.

Even within those organizations that are both hyperscaler and cloud builder, the internal facing infrastructure can be – and sometimes is – very different from the outward facing infrastructure that is sold on a metered basis. Hyperscalers can experiment and build the future for their own sakes, but the cloud builders have to create the infrastructure that companies are comfortable buying today and can move off of in a heartbeat if they are not happy.

All of this is why, to a certain extent, adoption of any technology by the major public clouds is a better indication of that technology going mainstream than if that same technology is being used internally by the hyperscalers. And this is why the rapid and enthusiastic adoption of the second generation AMD EPYC 7002 series processor (formerly codenamed “Rome”) by the top several dozen public cloud providers in the world is an important indicator of how enterprises at large are starting to rely on AMD again for compute.

Cloud builders cannot be wasting their time and money on science experiments because every new thing added to a datacenter has to be done at scale so there are enough customers to justify the investment and bring a return on that investment. The public cloud today is not “If you build it, they will come” – and has not been for more than a decade. Admittedly, when the Elastic Compute Cloud service at the fledgling compute and storage utility at the world’s largest online retailer first launched in March 2006 with a handful of instances and a dream, it was a bit like that. But today, the public cloud is a platform, and like every other platform we have ever seen, it is one that has been created to make money. And so it is more like “If you build what they already know they want, then they will pay.”

The hyperscalers are pretty secretive about how they are deploying AMD EPYC processors, just as they were when AMD was selling Opteron server chips like crazy more than a decade ago. For instance, we have seen Google’s homegrown Opteron motherboards with our own eyes from back in those days, and when the AMD 2nd Gen EPYC chips were announced back in August 2019, Google said that it was moving some of its internal workloads to these processors.

But those who operate public clouds are, of necessity, more open about what they are doing. They have to be, or they could not sell their services.

Oracle is a good case in point, and it is one of the cloud builders willing to talk about the details behind its decision to use the AMD 2nd Gen EPYC processors. Oracle itself has always been unabashedly opinionated and happy to tell you what it really thinks of the competition, wherever it is and whatever it is running on.

“About two years ago, we embarked on a very successful collaboration with AMD, starting with the EPYC 7001 series processor that worked as a cost-effective compute offering for our customers,” Vinay Kumar, vice president of product management for Oracle Cloud Infrastructure, tells The Next Platform. “The ‘Rome’ series gave us solid single core performance, increased memory bandwidth, and higher core count, allowing us to position AMD ‘Rome’ as our general-purpose compute. First, it gives us the significant performance per core that our customers want, and second, it gives us density. But it also gives us a chance to change the conversation and use Rome as our standard compute. With the ‘Naples’ EPYC generation, that was for our cost-centric use cases, and the Intel ‘Cascade Lake’ was our standard.”

At that time, and still continuing today, the AMD 1st Gen EPYC chips were available on Oracle Cloud in the E2 instances for 2.5 cents per core hour with 8 GB of memory per core, while the X7 instances based on Intel Xeon SP processors cost 6.4 cents per core hour with 16 GB per core. With the 2nd Gen EPYC chips and the E3 instances, Oracle’s tests showed anywhere from 30 percent to 50 percent more performance per core on Oracle applications compared to the X7 instances, and the E3 instances could also, importantly, beat the M5 and R5 instances at Amazon Web Services – and do so at a price of 4.9 cents per core hour with 16 GB of memory per core. Additionally, the 2nd Gen EPYC-based E3 instances can deliver 128 cores and 2 TB of main memory in a single instance, which cannot be done with the Xeon SP processors today.
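The price/performance gap implied by those numbers is easy to work out. The sketch below uses the per-core-hour prices quoted above; the relative performance-per-core figures are illustrative assumptions, with the X7 Xeon SP instance as the 1.0 baseline and the E3 credited with the midpoint of Oracle's reported 30 percent to 50 percent advantage.

```python
# Back-of-the-envelope price/performance comparison for Oracle Cloud instance
# types, using the prices quoted in the article. Relative performance per core
# is an assumption: X7 = 1.0 baseline, E3 = 1.4 (midpoint of the 30-50 percent
# per-core advantage Oracle reported). The E2 has no quoted perf figure.

instances = {
    # name: (cents per core-hour, GB of memory per core, relative perf per core)
    "E2 (1st Gen EPYC)": (2.5, 8, None),
    "X7 (Xeon SP)":      (6.4, 16, 1.0),
    "E3 (2nd Gen EPYC)": (4.9, 16, 1.4),
}

for name, (price, mem, perf) in instances.items():
    if perf is None:
        print(f"{name}: {price} cents/core-hr, {mem} GB/core (no perf figure)")
        continue
    # Performance delivered per cent of spend -- higher is better value
    value = perf / price
    print(f"{name}: {price} cents/core-hr, {mem} GB/core, "
          f"perf per cent spent = {value:.3f}")

# Under these assumptions the E3 delivers roughly 1.8x the performance per
# cent spent of the X7 (1.4/4.9 vs 1.0/6.4).
```

Even at the low end of the claimed range (1.3x performance at 4.9 versus 6.4 cents), the value ratio stays well above 1.5x, which is why Oracle can position E3 as its standard compute rather than just its cost-centric option.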

“The E3 instance is our new general purpose compute for both internal and external customers and it enables our new flexible shapes service, bringing customers more control and exactly what they need,” says Kumar emphatically.

Oracle is not going to switch wholesale to AMD EPYC processors, of course. Some third party and customer applications are only certified on Intel Xeon SP processors, and some customers have architectural preferences. But over time, those distinctions will probably fade and it will, we think, become a price/performance battle extraordinaire out there on all public clouds.

The other thing to consider is that the slice of public cloud workloads that AMD EPYC processors can address has grown between the 1st and 2nd generations, and will very likely continue to grow as new generations come out.

“If you look at all of the total addressable market for workloads out there on the public clouds and you look at, say, an Arm CPU, you can address just a little portion of the workload market that is amenable to the Arm ISA,” explains Kumaran Siva, corporate vice president of strategic business development at AMD. “If I look at where ‘Rome’ is getting used in the cloud, the workload TAM is extremely broad – our SKU stack can support probably on the order of greater than 80 percent of the overall cloud workload capabilities. So, we started out with a small percentage of workload capacity, and now we need to see what is the time and distance to get to that full level of workload support and adoption. In many cases, there is not that much optimization that you need to do specifically for 2nd Gen EPYC chips. We’ve had many customers take their code, move it over, and it just runs well right out of the box.”

As for enterprises that are unacquainted with EPYC processors and have sat on the sidelines thus far, the cloud presents an easy and inexpensive way to do proof-of-concept testing to see what a move from Xeon SP to EPYC processors would do for performance and price/performance. No one expects enterprises to shift all of their workloads to the cloud – some can move there, of course – but it is reasonable to use the public clouds as a testbed and speed up the whole process of making platform decisions.
