AI Means Re-Architecting The Datacenter Network

SPONSORED FEATURE: The datacenter network is under plenty of pressure these days just coping with expanding data volumes and the increasing use of microservices architectures in modern applications. But the advent of artificial intelligence – both for training neural models and for the inference that drives applications – has thrown a real monkey wrench into the works.

Every problem, of course, is also an opportunity. To take the pulse of datacenter networking and to see how AI is forcing system and network architects to rethink what they are doing for this very different workload, we sat down with Achyut Shah, senior vice president and general manager of the Connectivity Business Unit at semiconductor maker Marvell.

The compute engines that drive AI workloads – almost always GPUs but sometimes custom ASICs – are exceedingly fast, but keeping them fed with data so they can chew on it is a big problem.

“If you look at a CPU from days of old and compare it to a GPU or an accelerator of today, the pure processing power is at least an order of magnitude more – at least 10X, 20X, or 30X more than a basic CPU,” says Shah. “So by definition, it needs to take in data a lot faster to process that amount of data, and then push it out a lot faster, too. Not only are you seeing each GPU block consuming a lot of bandwidth, but you are seeing a lot more of these together on a board and in a cluster when you talk about these large language models that are running. The clusters that are running these workloads have hundreds, sometimes thousands, and maybe tens of thousands of GPUs. And that is only going to keep increasing in the future.”
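To put that scaling in perspective, here is a minimal back-of-envelope sketch of the aggregate network bandwidth a cluster demands as GPU counts grow. The 400 Gb/s per-GPU figure is an assumption for illustration only, not a number from the interview:

```python
# Back-of-envelope: aggregate network bandwidth versus GPU cluster size.
# The per-GPU figure is an illustrative assumption (e.g. one 400G NIC
# per GPU), not a vendor specification.

GBPS_PER_GPU = 400  # assumed network bandwidth per GPU, in Gb/s

for num_gpus in (1_000, 10_000, 100_000):
    aggregate_tbps = num_gpus * GBPS_PER_GPU / 1_000  # Gb/s -> Tb/s
    print(f"{num_gpus:>7,} GPUs x {GBPS_PER_GPU}G each -> ~{aggregate_tbps:,.0f} Tb/s aggregate")
```

Even at these assumed rates, every factor of ten in GPU count adds a factor of ten in aggregate bandwidth the fabric has to carry, which is exactly the pressure Shah describes.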

Clearly, that is going to mean that a whole slew of technologies – from the ASICs in the network switches to the host adapters in the servers to the transceivers in the network cables – will all have to be upgraded and kept operating in sync.
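As a rough illustration of how those pieces have to line up, the sketch below shows how per-port Ethernet speeds compose from lane count, symbol rate, and modulation – NRZ carries one bit per symbol, PAM4 carries two. The baud rates and lane counts are common round numbers assumed for illustration; real transceivers run slightly higher symbol rates to cover FEC overhead:

```python
# Sketch: how per-port Ethernet speed composes from lanes, symbol rate,
# and modulation. Values are round illustrative numbers, not exact specs
# (real 100G-per-lane PAM4 runs ~53.125 GBd to carry FEC overhead).

def port_speed_gbps(lanes: int, gbaud: float, bits_per_symbol: int) -> float:
    """Raw port rate = lanes x symbol rate (GBd) x bits per symbol."""
    return lanes * gbaud * bits_per_symbol

# 100G port from 4 x 25 GBd NRZ lanes (1 bit per symbol)
print(port_speed_gbps(lanes=4, gbaud=25, bits_per_symbol=1))  # 100.0
# 800G port from 8 x 50 GBd PAM4 lanes (2 bits per symbol)
print(port_speed_gbps(lanes=8, gbaud=50, bits_per_symbol=2))  # 800.0
```

The point is that the switch ASIC serdes, the host adapters, and the optical transceivers all have to agree on the same lane count and modulation for a link to come up at its rated speed.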

Marvell is one of the few vendors on Earth that can do it all when it comes to datacenter networking, and Shah laid out, from top to bottom, how Marvell can address the low latency and high bandwidth that AI workloads require – requirements that will probably grow at an exponential pace for many years to come.

This content was sponsored by Marvell.

Comments

  1. Fast-paced nonstop interview (the 30 minutes just flew by) wow! Two meters to 200 km, AI bandwidth, network bottlenecks, latency, Teralynx, NRZ, PAM4, leaf-spine, Linear Direct Drive, DSPs, Co-Packaged Optics, costs, and convenience … lotsa ground covered!

    Not a bad idea to review the TNP “1.6T Ethernet” piece in preparation for viewing this interview, I think: https://www.nextplatform.com/2023/03/07/setting-the-stage-for-1-6t-ethernet-and-driving-800g-now/
