Server Encryption With An FPGA Offload Boost
October 3, 2016 Timothy Prickett Morgan
Everyone talks about security on infrastructure, but it comes at a heavy cost. While datacenters have been securing their perimeters with firewalls for decades, this is far from sufficient for modern applications.
Back in the early days of the Internet, all traffic was from the client in through the web and application servers to the back-end database that fed the applications – what is known as north-south traffic in the datacenter lingo. But these days, an application is a collection of multiple services that are assembled on the fly from all over the datacenter, across untold server nodes, in what is called east-west traffic between those nodes, before it is all assembled and sent out the north-south link. Depending on who you ask, this east-west traffic represents more than 80 percent of the traffic in the datacenter today. But here is the dirty little secret: Very little of that east-west traffic is encrypted and it is therefore insecure.
“Security really needs to be thought of more holistically and covering the whole datacenter as opposed to just at the edges,” Bob Doud, senior director of marketing at Mellanox Technologies, tells The Next Platform. “There is a lot of chit chat going on between servers, and between servers and storage, and when you combine that with all of the virtualization that is happening, which is good in that it drives the server utilization up, but it also drives up the bandwidth needs for networking. With all that bandwidth, we are no longer seeing servers with 1 Gb/sec ports, they are at 10 Gb/sec and moving up to 25 Gb/sec, 40 Gb/sec, and 50 Gb/sec. You have a need for security everywhere, but the pain points are higher because the bandwidths are rising.”
Given the distributed nature of both users and the applications running across a datacenter, there is a need to secure the transmission of information as it flits around server and storage nodes in the infrastructure, which is one of the reasons why chip makers like Intel, IBM, and Oracle added encryption accelerators to their respective Xeon, Power, and Sparc processors. This is not just a function for a firewall or a web browser any more. But encryption and decryption processing are not free. Neither are virtual LANs to carve up Layer 2 networks or access control lists in databases, file systems, and data stores for restricting access to bits of data.
Doud says that IPSec encryption, which has been around for a long time, is seeing a resurgence because of the need to secure traffic inside the datacenter. He adds that all of the hyperscalers are using SSL encryption out to clients and want to use IPSec internally across their servers because it enforces separation of traffic in a strong manner.
Here’s the issue. You can hack into a VLAN and see the traffic on it, but with IPSec, if you don’t have the keys, you are not going to be able to read the data being transmitted even if you do hack the VLAN. IPSec basically has a stateless firewall engine in it by default, so you can set up policies to allow or disallow certain kinds of traffic or to encrypt the data – or not – as you see fit. So you can understand why IPSec is a key component of the micro-segmentation of networks that hyperscale datacenters as well as cloud builders and some large-scale enterprises are trying to craft.
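To make the stateless policy idea concrete, here is a minimal sketch in Python of a first-match policy lookup. It is not how any particular IPSec stack or the Mellanox hardware implements it. The `Policy` class, the `lookup` function, and the example subnets are all hypothetical, and the action names (protect, bypass, discard) are borrowed from the terminology of the IPSec architecture specification. Each packet is judged on its own header fields, with no connection tracking, which is what makes the engine stateless.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Policy:
    """One hypothetical policy rule; the fields are illustrative only."""
    src: str                  # source subnet in CIDR notation
    dst: str                  # destination subnet in CIDR notation
    dst_port: Optional[int]   # None matches any destination port
    action: str               # "protect" (encrypt), "bypass" (clear), "discard"


def lookup(policies, src_ip, dst_ip, dst_port, default="discard"):
    """Return the action of the first rule that matches this packet.

    No per-connection state is kept: the decision depends only on the
    packet's own addresses and port.
    """
    for p in policies:
        if (ip_address(src_ip) in ip_network(p.src)
                and ip_address(dst_ip) in ip_network(p.dst)
                and (p.dst_port is None or p.dst_port == dst_port)):
            return p.action
    return default


# Assumed example segments: app tier 10.0.1.0/24, database tier 10.0.2.0/24,
# monitoring tier 10.0.3.0/24.
POLICIES = [
    Policy("10.0.1.0/24", "10.0.2.0/24", 5432, "protect"),  # encrypt app-to-db
    Policy("10.0.1.0/24", "10.0.3.0/24", None, "bypass"),   # metrics in clear
    Policy("10.0.0.0/8", "10.0.2.0/24", None, "discard"),   # nothing else to db
]

if __name__ == "__main__":
    print(lookup(POLICIES, "10.0.1.5", "10.0.2.9", 5432))
    print(lookup(POLICIES, "10.0.1.5", "10.0.2.9", 80))
```

Rule order matters in this sketch: the narrow "protect" rule has to sit ahead of the broad "discard" rule, or the database traffic would be dropped rather than encrypted.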
“We are getting requests from all kinds of customers who want to deploy IPSec on every single server,” says Doud. “And their results have shown that doing IPSec just in software yields poor results. Taken together, this is a tremendous amount of cryptography that is being deployed, and it is therefore an increasing load on all of the servers that terminate those SSL and IPSec points.”
The wave of network function virtualization (NFV), which is a fancy way of making generic X86 servers do more of the work that would historically go on special-purpose network devices, would seem to imply that this work can be shifted to the cryptographic functions on Xeon, Power, Sparc, or other processors. But running IPSec on Xeon cores is far from free. As an example, if you look at the list price of a 14-core “Broadwell” Xeon E5-2680 v4 processor, which costs $1,745, and subtract out the cost of a ten-core E5-2640 v4 chip, which costs $939, the extra four cores cost you $806, or $201.50 each. Those incremental cores are quite pricey, representing 46 percent of the cost but only 29 percent of the aggregate performance of that 14-core chip. (Both of these chips run at 2.4 GHz.)
The AES-NI instructions in the latest several generations of Xeons can help accelerate encryption, but they do not handle hashing algorithms, so this only helps so much. Perhaps more significantly, according to estimates made by Mellanox, supporting a 40 Gb/sec link on a server with IPSec encryption requires eight cores on a Broadwell Xeon E5 v4 processor.
The performance of encryption with lots of small packets of data using the AES-NI instructions is not as good as with large packets (because of the file handling overhead). Hardware-based cryptography on an offload engine does not see the same rapid falloff as packet sizes shrink, says Doud. With hardware-based cryptography (meaning a fully dedicated offload engine for SSL or IPSec), the throughput falloff is on the order of 2:1 from large packets to small ones, so on a 20 Gb/sec link with large packets being encrypted, you would expect around 10 Gb/sec of throughput with small packets. It looks more like 4:1 or 5:1 with AES-NI running software-based encryption (although this is admittedly pretty old benchmark data on Westmere Xeons that Mellanox is citing).
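The core pricing and the falloff ratios above are simple enough to check by hand, but a short Python sketch makes the arithmetic explicit. The function names are made up for illustration. The share of cores is used as a stand-in for the share of aggregate performance, which is only a fair assumption because both chips run at the same 2.4 GHz clock.

```python
def incremental_core_cost(big_price, big_cores, small_price, small_cores):
    """Price the extra cores on the larger chip against the smaller one."""
    extra_cores = big_cores - small_cores
    extra_cost = big_price - small_price
    return {
        "extra_cost": extra_cost,
        "cost_per_extra_core": extra_cost / extra_cores,
        "share_of_price": extra_cost / big_price,
        # Assumption: equal clocks, so core count approximates performance.
        "share_of_cores": extra_cores / big_cores,
    }


def small_packet_throughput(large_packet_gbps, falloff_ratio):
    """Throughput with small packets, given a large-to-small falloff ratio."""
    return large_packet_gbps / falloff_ratio


if __name__ == "__main__":
    # List prices quoted in the article: E5-2680 v4 versus E5-2640 v4.
    xeon = incremental_core_cost(1745, 14, 939, 10)
    print(f"${xeon['cost_per_extra_core']:.2f} per extra core")
    print(f"{xeon['share_of_price']:.0%} of price, "
          f"{xeon['share_of_cores']:.0%} of cores")

    # A 20 Gb/sec link at the falloff ratios Doud cites.
    for label, ratio in [("hardware offload, 2:1", 2),
                         ("AES-NI software, 4:1", 4),
                         ("AES-NI software, 5:1", 5)]:
        print(f"{label}: {small_packet_throughput(20, ratio):.0f} Gb/sec")
```

On those ratios, the same 20 Gb/sec link that holds 10 Gb/sec with small packets on an offload engine would drop to 4 Gb/sec or 5 Gb/sec in software.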
At around $200 per core, if you want to use that as an average, having software-based encryption adds up to around $1,500 of compute per Broadwell server node for each 40 Gb/sec link that is encrypted. The cost has come down since AES-NI instructions were first added to the “Westmere” Xeons in 2010, but encryption is far from free, by the math that Doud is doing. Even if users get a 50 percent discount on CPUs, encryption for locking down the information in flight inside the datacenter is still a big percentage of the CPU budget, with encryption representing maybe a third of CPU cycles.
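As a rough sketch of that per-node math, the Python below multiplies the eight cores Mellanox estimates for a 40 Gb/sec link by the roughly $200 average cost per core. The function name and its parameters are hypothetical, and the discount is applied as a flat fraction off list price, which is a simplification.

```python
def software_encryption_cost(cores_per_link, cost_per_core,
                             links=1, discount=0.0):
    """Dollar value of the CPU cores consumed by software encryption.

    discount is a fraction of list price, so 0.5 means 50 percent off.
    """
    return cores_per_link * links * cost_per_core * (1.0 - discount)


if __name__ == "__main__":
    # Eight cores per 40 Gb/sec link (Mellanox estimate), about $200 per core.
    at_list = software_encryption_cost(8, 200)
    at_half = software_encryption_cost(8, 200, discount=0.5)
    print(f"list price: ${at_list:,.0f} per encrypted link")
    print(f"50 percent discount: ${at_half:,.0f} per encrypted link")
```

Eight cores at $200 comes to $1,600, in line with the article's figure of around $1,500, and a 50 percent discount still leaves about $800 of compute tied up per encrypted link.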
So, Mellanox has an answer to this encryption conundrum, and as you might expect knowing the company’s history in networking, it wants to offload encryption from the server processors to the network adapter cards. To that end, it has forged its Innova line of security acceleration network adapters, which marry a ConnectX-4 LX adapter card with a Xilinx Kintex UltraScale FPGA that has been programmed to handle IPSec, SSL, and TLS encryption protocols.
The Innova card plugs into a PCI-Express 3.0 x8 adapter slot in a server and has a single Ethernet QSFP port that supports 10 Gb/sec or 40 Gb/sec links out to switches. The FPGA sits between the ConnectX-4 LX ASIC and the QSFP port, which means the server only sees unencrypted data. That data only has to pass over the PCI-Express bus once, instead of coming into the server over a network port, being offloaded to a separate encryption/decryption accelerator, and then coming back over the PCI-Express bus one more time to get back to the CPU. The FPGA on the Innova card has its own 2 GB of dedicated DDR4 memory to act as a buffer for network data.
The IPSec protocol has been programmed into the FPGA now and is sampling with the card, and the SSL and TLS protocols will be available on the device in the first quarter of next year. Doud says that Mellanox is also looking to add a peppier Virtex FPGA on a future card, and in the first quarter there will be a ConnectX-FPGA mashup that offers 25 Gb/sec and 50 Gb/sec ports. The ConnectX-4 LX card with the FPGA uses the same drivers for Linux, FreeBSD, Microsoft Windows, and VMware ESXi that the regular ConnectX-4 LX card uses, only it magically has encryption offload.
Mellanox is not releasing pricing on the Innova cards, but the plain vanilla ConnectX-4 LX card runs around $500 or so at list price, and Doud says that the Innova card will probably cost around three times that. That is, not coincidentally, about the value of the Xeon cores that have to run encryption in the Broadwell generation. So there is no net change in the encryption function pricing, but a big chunk of CPU is made available to do real work, and that is worth real money.
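The comparison can be laid out as a back-of-the-envelope Python sketch. The Innova price here is only the "around three times" estimate from Doud, not a published figure, and the function and field names are invented for illustration.

```python
def offload_comparison(nic_price, price_multiple, cores_freed, cost_per_core):
    """Compare an estimated offload card price with the cores it frees."""
    card_price = nic_price * price_multiple
    return {
        "card_price": card_price,
        "premium_over_plain_nic": card_price - nic_price,
        "value_of_cores_freed": cores_freed * cost_per_core,
    }


if __name__ == "__main__":
    # Assumed inputs: $500 plain NIC, 3X multiple, eight cores at about $200.
    result = offload_comparison(500, 3, 8, 200)
    for name, dollars in result.items():
        print(f"{name}: ${dollars:,}")
```

With those inputs, the estimated card price of $1,500 sits close to the $1,600 of Xeon cores that software encryption would otherwise consume, which is the rough parity the article describes.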