Securing The Server, Inside And Out

Timothy Prickett Morgan

6 years ago

Computing is hard enough, but the sophistication and proliferation of attacks on IT infrastructure, from the firewall moat surrounding the corporate network all the way down into the guts of the operating system kernel and deep into the speculative execution units on the physical processor, make the task of computing – with confidence – doubly difficult. It hasn’t helped that applications have become increasingly distributed and virtualized, spread across networked machines and propped up on various layers of software abstraction.

It is a tough job to keep ahead of the hackers, as the industry saw just last week with the disclosure of the Foreshadow/L1TF vulnerability, which in some cases forces Intel customers running the Xeon architecture to choose between VM security and performance because they cannot have both.

With the “Naples” Epyc 7000 line of server chips, AMD has baked in a number of security features that make use of a specialized secure processor embedded in the chip itself that manages encryption keys to lock down data in various parts of the system. This secure processor and the Secure Memory Encryption (SME) and Secure Encrypted Virtualization (SEV) features that make use of it augment other security provisions, such as self-encrypting flash or disk drives for data at rest and network adapters that have engines on them for encrypting data in flight. With all of these provisions in place, companies can better harden their systems and the applications that run atop them.

The central feature of security on the Epyc 7000 chips is that secure processor, which is a 32-bit Arm Cortex-A5 processor that resides on the Naples system-on-a-chip design.

The main job of this secure processor is to generate secure cryptographic keys for locking down certain data within the system and to manage those keys as they are distributed. As a baseline, this secure processor allows for hardware-validated boot, which means it can take control of the cores as they are loaded up with operating system kernels and make sure their data comes from a trusted place and has not been tampered with. The secure processor can interface with off-chip non-volatile storage, where firmware and other data are held, and it encrypts the boot loader and the UEFI interface between the processor and the firmware. The secure processor has its own isolated on-chip read-only memory (ROM) and static RAM (SRAM), and it also has a chunk of DRAM main memory allocated to it that is encrypted and is not accessible by the cores on the Epyc 7000 chip.
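To make the idea of hardware-validated boot a little more concrete, here is a minimal Python sketch. It is not AMD's firmware. It assumes the secure processor holds a known-good digest for each boot stage and refuses to proceed if an image does not match. The stage names, the `TRUSTED_DIGESTS` table, and the bare SHA-256 comparison are all illustrative assumptions; real implementations typically verify digital signatures rather than a fixed table of hashes.

```python
import hashlib
import hmac


def digest(image: bytes) -> str:
    """Return the SHA-256 digest of a firmware image as a hex string."""
    return hashlib.sha256(image).hexdigest()


# Hypothetical boot images; on real hardware these would sit in off-chip flash.
BOOT_LOADER = b"boot loader image v1"
UEFI_IMAGE = b"uefi firmware image v1"

# Hypothetical table of known-good digests held by the secure processor.
TRUSTED_DIGESTS = {
    "boot_loader": digest(BOOT_LOADER),
    "uefi": digest(UEFI_IMAGE),
}


def validate_stage(name: str, image: bytes) -> bool:
    """True only if the image matches the trusted digest for that stage."""
    expected = TRUSTED_DIGESTS.get(name)
    if expected is None:
        return False  # unknown stages are never trusted
    # compare_digest does a constant-time comparison of the two hex strings
    return hmac.compare_digest(expected, digest(image))


def validated_boot(stages: dict) -> bool:
    """Validate every stage in order, stopping at the first failure."""
    return all(validate_stage(name, image) for name, image in stages.items())
```

In this sketch, a single flipped byte in either image changes its digest, so `validated_boot` returns `False` and the cores would never be handed the tampered code.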

The cryptographic engine in the secure processor supports standard hashing and encryption algorithms – SHA, ECC, and RSA, for instance – and has a Zlib decompression engine as well to unzip compressed data when it is encountered; it also has a random number generator that is accessible to the CPU cores and their software. Importantly, the secure processor has access to all of the DRAM in the system, plus the MMIO space and the PCI configuration space in the system.

By itself, this secure processor can establish a secure boot of a system, but it can do a lot more than that, and in the Epyc 7000, it indeed does more than that. The SME and SEV features that debuted in servers with the Epyc 7000s make use of the secure processor on the chip, but they do additional things to lock down main memory as well as virtual machines and hypervisors, respectively.

To our knowledge, the hardware-assisted memory encryption that is enabled through the SME feature on the Epyc 7000 chips is unique in the datacenter, but Greg Gibby, senior product marketing manager for data center products in AMD’s Enterprise Solutions Business Unit, tells The Next Platform that this technology came from the custom processors used in Microsoft Xbox and Sony PlayStation game consoles starting five years ago, and is now deployed in a variety of PC and embedded chips.

With SME, all of the data that goes into DRAM is fully encrypted. To start SME, you flip a bit in the BIOS, and then the secure processor will issue a unique key to the memory controllers on each Naples SOC. There are a total of eight memory controllers on a single Epyc 7000 socket. The memory controllers all have an AES-128 encryption engine on them, so they do the encrypting in line, with minimal performance impact.

The main memory is encrypted with a single key across those memory controllers, and this is really designed to protect against physical attacks on the main memory. This can be used to encrypt systems running a bare metal operating system or a hypervisor. The secure processor issues the encryption key to the AES-128 cryptographic units on the main memory controllers, and the software never sees this key. The main thing about SME is that it does not require any changes to the operating system or applications, and significantly, this is functionality that Intel, AMD’s main competitor in the X86 server space, currently does not have.
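The effect of inline memory encryption can be illustrated with a toy model in Python. This is a sketch under loud assumptions: the `ToyMemoryController` class is hypothetical, and because the Python standard library has no AES, it substitutes a SHA-256-derived XOR keystream for the AES-128 engine. That stand-in is not secure and is not how the hardware works internally; it only shows that software sees plaintext while the DIMM holds ciphertext, and that the key never leaves the controller.

```python
import hashlib
import secrets


class ToyMemoryController:
    """Toy model of inline memory encryption. NOT real AES-128."""

    BLOCK = 16  # bytes per block, matching the AES block size

    def __init__(self) -> None:
        # The key is generated internally (the secure processor's role in
        # this model) and is never handed to software.
        self._key = secrets.token_bytes(16)
        self._dram = {}  # address -> ciphertext as stored on the DIMM

    def _keystream(self, address: int) -> bytes:
        # Stand-in for the cipher: hash of key plus address, cut to one block.
        material = self._key + address.to_bytes(8, "little")
        return hashlib.sha256(material).digest()[: self.BLOCK]

    def write(self, address: int, plaintext: bytes) -> None:
        """Encrypt one block on its way into DRAM."""
        if len(plaintext) != self.BLOCK:
            raise ValueError("this toy model writes exactly one block")
        stream = self._keystream(address)
        self._dram[address] = bytes(p ^ s for p, s in zip(plaintext, stream))

    def read(self, address: int) -> bytes:
        """Decrypt one block on its way back to the cores."""
        stream = self._keystream(address)
        return bytes(c ^ s for c, s in zip(self._dram[address], stream))

    def dump_dimm(self, address: int) -> bytes:
        """What someone with physical access to the DIMM would read."""
        return self._dram[address]
```

Moving the stored contents to a second `ToyMemoryController`, which has a different key, models pulling the DIMMs and putting them into another server: the read succeeds, but what comes back is not the original data.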

Running the SPECint_rate2006 processor benchmark, AMD saw an estimated 1.5 percent performance impact from turning on SME. In most cases, the encryption and decryption of memory takes about ten clock cycles, resulting in a performance hit of somewhere between 1 percent and 3 percent. This is not much, considering that access to main memory is locked down.

“This is really geared towards physical security, where you can’t guarantee the physical security of the server,” explains Gibby. “There are two use cases. The first is a James Bond scenario, where someone would potentially freeze the DIMMs in a running server, remove them, put them into a new server, and scrape the data off the DIMMs. This is probably not going to happen too often, but a more common and realistic scenario is a warm reset attack, where someone pushes the power button to do a reboot of the server and they put a USB drive into a server port and then redirect the server to boot from the USB drive, and now that you have control of that server and since you did a warm reset rather than a cold boot that would have wiped the server memory, all of the data in memory is still there and can be scraped off the DIMMs. With SME enabled, in either scenario, all of that data in memory would be encrypted and the thieves would not have access to the keys to decrypt it. The data would be garbage.”

AMD is seeing a lot of interest in SME where processing is moving out to the edge, where companies have less control of the physical server. For the most part, if servers are running in a private datacenter or a co-location facility where physical access can be limited and guaranteed, then companies may perceive SME as superfluous. (Well, until they have a warm reset attack from the inside.)

For many companies, the SEV feature further locks down server virtualization running on the Epyc 7000s. SEV employs the same basic approach, with the secure processor on the Epyc issuing a key to handle the encryption and decryption of data. But in this case it encrypts the memory space associated with the server virtualization hypervisor and its virtual machines. They all get unique, not shared, encryption keys, providing a level of isolation between these elements.

At the moment, SEV can only use sixteen unique keys concurrently – one for the hypervisor and the remaining fifteen for the VMs that run atop it – but over time we think that AMD will put a beefier processor in the “Rome” and “Milan” cores so one key can be generated for each thread in the processor plus the hypervisor. In a lot of cases, companies set up a single thread as the compute for a baseline virtual machine, so this approach would be useful. For those who need to run more than fifteen VMs per system, the remainder outside of the fifteen can be run without SEV but also have SME encrypting their data wholesale.
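The key budget described above is easy to sketch in Python. The `KeySlotAllocator` class and its method names are hypothetical bookkeeping, not any real hypervisor API; the only figures taken from the article are the sixteen concurrent keys, the one reserved for the hypervisor, and the fallback to SME for guests that do not get a key.

```python
from dataclasses import dataclass, field

TOTAL_KEYS = 16       # concurrent SEV keys, per the article
HYPERVISOR_KEYS = 1   # one key is reserved for the hypervisor
VM_KEYS = TOTAL_KEYS - HYPERVISOR_KEYS  # fifteen left for guest VMs


@dataclass
class KeySlotAllocator:
    """Hypothetical record of which VMs currently hold an SEV key slot."""

    sev_vms: dict = field(default_factory=dict)       # VM name -> slot number
    sme_only_vms: list = field(default_factory=list)  # VMs without a slot

    def launch(self, vm: str) -> str:
        """Give the VM a free slot if one exists, else fall back to SME."""
        used = set(self.sev_vms.values())
        # Slot 0 is assumed to belong to the hypervisor; guests use 1..15.
        free = [slot for slot in range(1, VM_KEYS + 1) if slot not in used]
        if free:
            self.sev_vms[vm] = free[0]
            return "SEV"
        self.sme_only_vms.append(vm)
        return "SME-only"

    def shutdown(self, vm: str) -> None:
        """Release the VM's slot, if it has one, for reuse by another guest."""
        self.sev_vms.pop(vm, None)
        if vm in self.sme_only_vms:
            self.sme_only_vms.remove(vm)
```

Launching seventeen guests in this model leaves fifteen with their own keys and two covered only by the system-wide SME key, which matches the fallback the article describes.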

The SEV feature, as the name implies, works at the server virtualization layer, and does not do encryption at the container level. But any containers that are running atop VMs and hypervisors will, by default, have their contents encrypted, and in such a way that the containers and their orchestrator are unaware of it. In a lot of enterprises, containers are not running on bare metal anyway, but on virtualization. The performance overhead of SEV is about 1.5 percent of CPU, according to AMD’s tests. The reason is that SEV uses the same inline AES-128 encryption engines on the DRAM controllers as SME does.

The thing about SEV, though, is that the operating systems, hypervisors, and guest operating systems running in virtual machines have to have SEV enabled to take advantage of this capability. AMD has worked with the Linux community and upstreamed the code for SEV, and it has been accepted into the Linux kernel itself. The company is working with major partners – Red Hat, SUSE Linux, Canonical, and so forth – to have them pick the SEV features up and put them into their Linux distributions. AMD is also working with Microsoft and VMware to have this capability added into their hypervisors and operating systems. This will probably be done within the next year or so, we would guess. It is important to note that SEV does not require any changes to applications.

The scenarios for needing to deploy SEV are many, and they all center around cryptographic isolation to boost the security within virtual (and sometimes containerized) environments. “If you look at the way this works today,” says Gibby, “admins spin up a hypervisor and then spin up individual VMs as needed and they have complete visibility into those VMs. They could scrape memory off any individual VM and do whatever they wanted to do with that data. Given the fact that anywhere from 30 percent to 40 percent of data breaches were initiated through inside attacks, SEV is an important feature. With SEV enabled, the admins or anyone else does not have access to the memory used by the hypervisor or the VMs. Or, let’s say someone fell victim to a phishing attack and one of the VMs was compromised. In today’s environment, an experienced hacker can break out of that VM and into another VM and get data from them or even get access to the hypervisor and then have access to everything within them as well. With SEV enabled, even if a VM is compromised, they can’t break out of that and into the other VMs or even the hypervisors because they were all issued unique keys and when they look at the encrypted data it will just look like gibberish.”

Keeping systems secure is an ongoing battle, as we said at the start of this article, and AMD is gloating a little bit about the “Foreshadow” L1 Terminal Fault security vulnerabilities revealed last week, which are part of an evolving collection of security holes related to speculative execution and which in this case affect the Software Guard Extensions (SGX) security features on Intel’s Xeon processors. SGX addresses some of the security issues that SME and SEV do, but AMD adds the functionality underneath the operating system, hypervisor, and virtual machines wherever possible, while SGX is an add-on to the X86 instruction set that allows for segments of memory affiliated with applications to be encrypted. But you have to tweak the applications to take advantage of SGX, and that is a problem for a lot of IT organizations.

Separate from SGX and not covered extensively after the initial disclosure, with the Foreshadow speculative execution exploits, it looks like in some cases companies will have to decide between having HyperThreading turned on or off on their systems, which is something that is not going to go down well with highly virtualized environments. While the performance penalties associated with mitigating the Foreshadow attacks seem to be minimal in a lot of cases, that is not the point. In some cases, where the environment is heavily virtualized and you cannot guarantee the trust level on guest VMs, the overhead of the patches combined with turning off HyperThreading can eat half of the virtual threads and anywhere from 20 percent to 30 percent of the performance in the system. This is a very big deal. And AMD says that its Epyc chips are not affected by the Foreshadow exploits, so this can be a game changer in cloud environments where companies absolutely do not want to run the risk of compromised VMs, and it might be a boon for AMD’s Epyc sales.
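The arithmetic behind that worst case is simple enough to put in a few lines of Python. The function name and the 28-core socket used in the usage note are hypothetical; the two threads per core and the 20 percent to 30 percent range are the figures cited above.

```python
def smt_off_impact(cores: int, threads_per_core: int = 2,
                   loss_pct_range: tuple = (20, 30)) -> dict:
    """Estimate thread count and remaining performance with SMT disabled."""
    threads_on = cores * threads_per_core
    threads_off = cores  # with SMT off, each core exposes a single thread
    low, high = loss_pct_range
    return {
        "threads_before": threads_on,
        "threads_after": threads_off,
        "threads_lost_pct": 100 * (threads_on - threads_off) // threads_on,
        # Remaining performance, worst case first
        "remaining_perf_pct": (100 - high, 100 - low),
    }
```

For a hypothetical 28-core socket, `smt_off_impact(28)` reports 56 threads dropping to 28, with 70 percent to 80 percent of the original performance left. The thread count halves, but throughput does not, because the second thread on a core only adds a fraction of a core's worth of work.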

There is more ahead for embedded security in the Epyc server chip line. “Security is always an ongoing battle, and we have plans to further harden and improve our security solutions at the hardware level over several generations,” says Gibby. At the moment, the L1, L2, and L3 caches on the Epyc processors are already covered in a way. They are not encrypted, but the SEV hardware tags all of the code and data with a VM address space identifier (ASID) that indicates the VM where the data originated and its intended target. This tag is kept with the data at all times within the Epyc processor and prevents that data from being used by anyone other than the owner. This protects the cache and ensures that the data in the VMs remains isolated from other VMs and from the hypervisor. AMD could do more explicit security here, such as directly encrypting cache as it does memory. AMD is already working with the Linux community to encrypt registers on the processors, for instance. The idea is to continue to innovate on security and give the Epyc chips yet another leverage point to get into the datacenter.
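The cache tagging idea can be sketched as a toy model in Python. The `TaggedCache` class is hypothetical and ignores everything a real cache does about sets, ways, and coherence, and the ASID numbers in the usage note are arbitrary. It only shows the access rule described above: data is returned when the requester's ASID matches the tag stored with the line, and otherwise the lookup behaves like a miss.

```python
class TaggedCache:
    """Toy cache in which each line remembers the ASID that owns it."""

    def __init__(self) -> None:
        self._lines = {}  # address -> (owner_asid, data)

    def fill(self, asid: int, address: int, data: bytes) -> None:
        """Store unencrypted data along with the owner's ASID tag."""
        self._lines[address] = (asid, data)

    def lookup(self, asid: int, address: int):
        """Return the data only when the requester's ASID matches the tag."""
        line = self._lines.get(address)
        if line is None:
            return None  # ordinary miss: nothing cached at this address
        owner, data = line
        # A tag mismatch is treated the same as a miss in this model.
        return data if owner == asid else None
```

If a guest with ASID 1 fills a line, a lookup from ASID 2 (another guest) or ASID 0 (assumed here to stand for the hypervisor) comes back empty, even though the data sits in the cache in the clear.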