Could FPGAs Outweigh CPUs In Compute Share?

Nicole Hemsoth Prickett

4 years ago

Over a decade ago we would not have expected accelerators to have be commonplace in the datacenter. While they are not pervasive, a host of new workloads are ripe for acceleration and porting work has made it possible for legacy applications to offload for a performance boost. This transition has been most apparent with GPUs, but there are high hopes that FPGAs will continue to gain steam.

According to Xilinx CTO, Ivo Bolsens, who talked to us at The Next FPGA Platform event last week in San Jose, FPGAs won’t just gain incremental momentum, they will put the CPU out of work almost entirely. “In the future you will see more FPGA nodes than CPU nodes. The ratio might be something like one CPU to 16 FPGAs,” Bolsens predicts, adding that it’s not just a matter of device numbers, “acceleration will outweigh general compute in the CPU.”

This is a rather bold projection but there are some nuances to consider. Even for GPUs, the most dominant accelerator type, the attach rate is still in the single digits. However, in some large machines (HPC systems in particular) that acceleration represents 90-95% of the aggregate floating point capabilities, at least by current benchmark measures like Linpack. Of course, even with that capability for peak performance, that is not to say all applications reach full accelerated potential and more important, not all applications are primed for acceleration.

Bolsens says that while there are many legacy applications that might not ever fit the acceleration bill, emerging workloads throughout the datacenter will increase demand for FPGAs, especially given system-level trends, including a slow-down in Moore’s law and subsequent look to heterogeneous and domain specific architectures. Those are important at the node level, but he says the growth of FPGAs (and other accelerators) will be driven forward by disaggregation of resources (pools of storage, compute, and network appliances) which can all be used at the right proportions to serve different use cases.

He adds that it is within this context he sees the emergence of the FPGA as an accelerator and a building block to make compute more efficient. “The FPGA has fundamental characteristics that separate it from the CPU… FPGAs allow you to create more programmability, not just in terms of the compute resources and instructions, but also in terms of the memory hierarchy and interconnect.”

It is less controversial to make the claim that FPGAs will be pervasive throughout the datacenter, which is something both Xilinx, Intel, and others discussed during the conversation/interview-based event. The storage and networking pieces of the FPGA market puzzle are quite easy to snap into place. The dramatic rise in FPGAs as compute elements numerous and powerful enough to displace work done by the CPU is a more challenging thought to consider but it’s not out of the question, especially given the flexibility of a reconfigurable device (matched with the skyrocketing costs of custom ASICs and the application readiness state of some applications for GPUs).

Bolsens discusses disaggregation trends and how this will have an impact on FPGA adoption for compute purposes in the next few years in his keynote from The Next FPGA Platform event below.

Realistically, reaching the goal of multi-FPGAs on a single node and replacing CPU compute will take enough workload suitability. Bolsens says, “In analyses of workloads in the datacenter, there is no such thing as a dominating workload, generally nothing more than 10% but there are big compute challenges ahead driven by AI and machine learning and the fact that we’re moving into an era of IoT with massive analysis means there are new problems to drive new requirements. You will see domination of accelerated computing here and FPGAs will play a major role, they are a good match in terms of application characteristics and the architecture.”

These bold ambitions will take a great deal of effort from all players on the software side. “If you look at the various initiatives in the industry they are all siloed but they are trying to solve similar problems in how they handle parallelism and heterogeneity, shared memory models and distributed memory, and synchronization and dispatching. All of these things and their abstractions are similar. For our part, we are trying to deal with this by opening our programming environment so that over time, whatever your preferred environment is, we can connect to it and get high efficiency on our platform. None of this is on the near horizon, Bolsens says, but as the FPGA compute share grows overall, the industry will find ways to keep pushing forward through internal and collaborative efforts.