Designing Servers In Rapidly Changing Times

When chip makers launch the latest additions to their datacenter processor lineups, server OEMs have historically followed, immediately or soon after, with a rollout of new or enhanced systems based on those offerings. Then they would wait until the next big unveiling by another chip maker, and the process would start again.

However, it’s getting more difficult to follow that pattern, for a number of reasons. The most obvious, at least right now, is timing. AMD this month unveiled its third-generation “Milan” Epyc 7003 processors, the latest server chips based on the ever-evolving Zen microarchitecture that has enabled the company to muscle its way back into the server space and compete with larger rival Intel. For its part, Intel will soon roll out its third-generation Xeon SP CPUs – codenamed “Ice Lake” – prompting another round of announcements of servers armed with the new chips.

But that’s only part of the story. CPUs no doubt are still crucial components of servers, but increasingly so are accelerators like GPUs and field programmable gate arrays (FPGAs), as are features like security, cooling and management. In addition, enterprises have shifted over the past several years to look at the machines not simply for the power they offer but for how they can be used to run advanced workloads such as machine learning, data analytics and HPC.

And now compute environments like the cloud and increasingly the edge are factors when evaluating servers.

All that plays into not only how system makers plan and develop their hardware lineups but also when they decide to roll them out. Those considerations played a role in Dell’s decision this month to unveil its complete portfolio of new and enhanced servers two days after AMD unveiled its Epyc 7003 family of new processors and just weeks before Intel’s expected rollout of its Ice Lake offerings.

“We debated quite a bit,” Ravi Pendekanti, senior vice president of server and networking product management and marketing at Dell, tells The Next Platform. “We said, ‘Should we launch a set of products with AMD? Should we then come out and do a lot of stuff with Intel?’ Then we looked back and said, ‘Wait a second, that is not helping our customers who are making purchasing decisions. They want to look at the entire portfolio.’ That’s why we changed to a portfolio update, having both Intel and AMD in the mix. It wasn’t easy across the organization, because we would typically in the past bring four or five products here, maybe three or four products. This meant the entire organization had to work on a pretty broad portfolio. We have never done this, honestly – 17 platforms in one go – and as we did that … we also wanted to put the lens on workloads, which is why we said that we see that AI and the machine learning with the GPU stuff that customers want. But the other thing that’s happening is the advent of things like 5G that’s coming to the telco space.”

Dell’s new portfolio includes systems with a mix of Intel and AMD chips that touch upon the workloads and environments driving the rapid changes in IT. Examples include the PowerEdge R6515 and R750 systems. The R6515 is powered by AMD Epyc chips and is designed to improve data processing in big data Hadoop databases by as much as 60 percent. Meanwhile, the R750 will come with Intel’s upcoming Ice Lake processors and promises up to 43 percent better performance in massively parallel linear equations for compute-heavy workloads. The XE8545 server offers up to 128 Epyc cores, four Nvidia A100 GPUs and Nvidia’s vGPU software to accelerate AI and similar workloads as well as deliver security and management benefits. The 4U rack system is the foundation for Dell’s HPC Ready Solution for AI and Data Analytics. The 2U dual-socket R750xa, which will be powered by Intel’s Ice Lake chips, also offers deep GPU capabilities – support for up to four double-wide GPUs or six single-wide GPUs – for machine learning, inference and AI, and supports Nvidia’s AI Enterprise, a suite of AI tools and frameworks launched earlier this month.

Pendekanti says the rapidly changing CPU picture – not only with Intel and AMD, but also Arm and its manufacturing partners – is something Dell and other OEMs have to take into account when developing their server roadmaps. For one, the timeline for new processors is accelerating. Where once chip makers came out with new offerings every two-plus years, the time between new generations is shrinking. AMD’s first-generation Epyc chips launched in 2017, with the second generation rolling out in 2019. Second, the sheer number of chip makers has to be addressed in server development, even within the x86 space.

“Until a few years ago, we didn’t have to worry about [other chip makers]. It was only Intel. There was a hiatus that AMD took and now we have just for the CPU this interesting bifurcation, so we have to look at two of them,” he says, adding that other technologies and concerns – management, security and cooling – are coming into play. “It’s not trivial because the technology adoption [trends] are changing [and] the number of technologies that are coming out. If I fast-forward to a couple of years, we’ll have to think about things like smartNICs, which are coming out.”

Power efficiency continued to be a focus in the latest PowerEdge systems. Dell is including ducted fans and adaptive cooling that can improve efficiency by up to 60 percent over the previous generation, as well as multi-vector cooling, which automatically directs airflow to the hottest parts of the server. Some servers also offer Direct Liquid Cooling, which includes technology that can detect leaks.

Security also was a focus. With the cloud and now the edge, data and applications increasingly are being generated and accessed outside of the datacenter, making them more vulnerable to hacking. Intel, AMD and Arm – which Nvidia is looking to buy for $40 billion – all are putting more security features into their silicon and designs. Such hardware-based security also is key for system makers, according to David Schmidt, senior director of server product management at Dell. It’s important to balance hardware- and software-based security.

“You can’t have one without the other,” Schmidt tells The Next Platform. “You really have to go down to the silicon-based root of trust. What we’ve been doing in the past couple of years around security, it starts in the hardware. It allows you to chain that root of trust all the way up into an operating system. … That’s exactly where embedded hardware-based security comes into play, when it is time to secure OS all the way down to the hardware. It has been a huge focus of ours to make sure it’s there. Then you start being able to do some really cool things like security on the verification, which goes up into supply-chain type use cases. You can do active software scanning, you can do a greater chip security, which you’ll see both on Intel and AMD offerings.”

Dell offers what it calls a cyber-resilient architecture and silicon root of trust. With the new systems, the vendor includes Secured Component Verification, an extension of its Secure Supply Chain assurance process that ensures the systems delivered to enterprises are exactly as they were manufactured, without any interference during delivery. The PowerEdge UEFI Secure Boot Customization enables boot security to be more closely managed to mitigate attacks.

Modern server planning also has to take into account the cloud and the edge. Hyperscalers and major cloud providers continue to be a driving force in the hardware space. According to Synergy Research Group, in 2020, global spending on cloud infrastructure services – including infrastructure-as-a-service (IaaS), platform-as-a-service (PaaS) and hosted private clouds – grew 35 percent, to almost $130 billion, helped in part by the COVID-19 pandemic. Meanwhile, enterprise spending on datacenter hardware and software fell 6 percent, to less than $90 billion. In addition, capital expenditures in the first three quarters of 2020 by hyperscalers hit $99 billion, up 16 percent year-over-year, with the capex aimed specifically at datacenters up 18 percent.

Dell has drawn from what the cloud providers are looking for, Pendekanti says.

“We absolutely have learned quite a bit because we do work and provide product to some of the CSPs and it’s helped us in a few ways,” he says. “One of the things that most of these cloud service providers look for [is management], because most of these guys don’t have tens of servers, they literally have tens of thousands of servers and hundreds of thousands of servers, which essentially means we need to make sure that we are able to provide the right management tools. Those lessons have played a huge role into how we have morphed our whole portfolio in terms of manageability. Number two: When you look at the CSPs, they do actually look at architectures. They’re looking at faster deployment, for example, or they’re looking at how to deal with … security.”

Dell last year also announced Project Apex, which is aimed at offering its product portfolio as a service, spanning products, consumption models and cloud strategies.

Dell and other OEMs also have to address the edge, which is expanding with the growth of the Internet of Things (IoT) and will accelerate as 5G connectivity comes to the fore. In its updated portfolio, Dell is including the PowerEdge XR11 and XR12 ruggedized systems, which will be powered by Intel’s Ice Lake processors and include support for multiple accelerators. They include smaller form factors that are 400mm (about 16 inches) deep, hardened chassis, remote manageability and NEBS Level 3 compliance, all key points for systems that are designed to be installed closer to where the data is being generated and users are located. They are certified for telco and military uses.

“The key thing has been most of these are ruggedized,” Pendekanti says. “On the CPU side, we don’t need as much power right now for the device sitting at the edge. We don’t really need to go out and put the most platinum kind of SKU with the highest TDP. In most cases, we are looking at probably a processor SKU that is at a lower end of the TDP rating because you are literally not getting into all kinds of analytics at the edge. But you will probably do some processing, so the CPU stack that we look at doesn’t have to take the entire spectrum. Our goal would be to actually leave it at the lower end of the spectrum for these edge boxes.”
