If you want to sell a lot of hardware to support AI workloads, then the best way to do that is to convince every country on Earth that AI is so important that they must have a lot of it within their borders. Just in case some political or economic crisis makes AI technology unavailable through the world-spanning cloud builders.
Getting massive numbers of GPUs into the hands of the hyperscalers and cloud builders was the best way to scale up GenAI models to show what they are capable of, and also to illustrate the strategic importance of GenAI to all countries. But the big clouds are controlled by companies in the United States and China, which makes many countries leery.
As Nvidia ramps up production of its “Hopper” and “Blackwell” GPUs, and has found a way to increase its manufacturing output by enough to supply the hyperscalers and clouds as well as governments and enterprises, this whole notion of sovereign AI clouds has taken off. And – you guessed it – Nvidia will be the biggest beneficiary of this movement.
Last month, Nvidia chief executive officer Jensen Huang made the case for this “sovereign AI” view in India, announcing partnerships with such major IT corporations and cloud providers in the country as Reliance Industries, Infosys, Yotta Data Services, Tata Communications, and Tech Mahindra to grow India’s use of AI throughout its economy.
“It makes complete sense that India should manufacture its own AI,” Huang said at the time. “You should not export data to import intelligence.”
It’s a boon for these countries, giving them control of an emerging technology that could in many ways fundamentally change how business is done and societies operate. Sovereign AI also will be a big win for Nvidia as it spreads its message – and, more importantly, its hardware and software – around the globe. Nvidia expects its GPU deployments in India to grow almost 10X by the end of the year.
This week, the focus was on Japan at the GPU maker’s AI Summit in Tokyo, where Huang sat down with SoftBank chairman and chief executive officer Masayoshi Son to talk about his company’s and the nation’s broad ambitions for leveraging AI and using Nvidia hardware and software to do it.
Japanese officials want to lure AI companies to the country by making it AI-friendly, reportedly through a light regulatory approach. Huang and other Nvidia executives praised Japan’s history of innovation and the broad technology and AI skills in the country.
“Japan is at the forefront of this transformation and the Japanese cloud leaders adopting Nvidia AI infrastructure will help Japan transform its most vital industries and advance its sovereign AI ambitions,” Ronnie Vasishta, senior vice president of telecom at Nvidia, told journalists in a video call.
Nvidia noted an array of Japanese companies it is partnering with in AI projects, though its work with Japanese multinational SoftBank took center stage. SoftBank will be the first company to use Nvidia’s Blackwell platform, building what will be its first – and the country’s most powerful – supercomputer.
The company will use Nvidia’s DGX B200 systems as the foundation for its Nvidia DGX SuperPOD supercomputer, which it will use for its own generative AI ambitions and which other businesses, as well as universities and research institutions in Japan, also will be able to access. The DGX B200 systems, which were first introduced in March, include eight Blackwell GPUs that are connected with the vendor’s fifth-generation NVLink interconnect, which provides 1.8 TB/sec of bidirectional throughput per GPU.
The platform also includes a dedicated RAS (reliability, availability, and serviceability) engine and a decompression engine for faster database queries. According to Nvidia, the platform delivers three times the AI model training performance and fifteen times the inferencing performance of its predecessors.
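As a back-of-the-envelope check on those interconnect numbers, the per-node figure falls out of simple multiplication. This sketch uses only the figures cited above (eight GPUs per system, 1.8 TB/sec bidirectional per GPU); it is not official Nvidia math:

```python
# Rough aggregate NVLink bandwidth for one DGX B200 node,
# using the per-GPU figure cited in the article.
GPUS_PER_NODE = 8
NVLINK_BIDIR_TBPS_PER_GPU = 1.8  # fifth-gen NVLink, bidirectional

aggregate_tbps = GPUS_PER_NODE * NVLINK_BIDIR_TBPS_PER_GPU
print(f"Aggregate bidirectional NVLink bandwidth per node: {aggregate_tbps:.1f} TB/s")
```

That works out to 14.4 TB/sec of bidirectional bandwidth inside each eight-GPU node.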
The SoftBank supercomputer also will include Nvidia’s AI Enterprise software and Quantum-2 InfiniBand networking, which will enable easier development of large language models (LLMs) that – in keeping with the sovereign AI push – can be built for Japanese-speaking users.
SoftBank also plans to build another Nvidia-based supercomputer using Nvidia’s Grace-Blackwell platform, which includes its GB200 NVL72 liquid-cooled rack-scale systems that use both Blackwell GPUs and Arm-based Grace CPUs. The system will be used for highly compute-intensive workloads.
Another project between the two involves SoftBank’s development of what it and Nvidia are calling an AI-RAN, a telecom network that runs both AI and 5G workloads at the same time.
“Democratizing AI requires building a national AI infrastructure. AI factories are required to create intelligence, build and train new models serving the industries of Japan,” Nvidia’s Vasishta said. “A delivery network, though, is also required to distribute intelligence, enable AI inferencing as close as possible to the endpoints. AI applications are required to consume the intelligence through AI-native endpoints.”
The network uses a software-defined 5G radio stack that includes L1 software based on Nvidia’s Aerial accelerated platform. The two companies ran an outdoor test of the AI-RAN network in an area of Japan, got carrier-grade 5G performance, and ran AI inferencing jobs on the excess capacity in the network. SoftBank estimates that traditional telecom networks, which are designed to handle peak loads, typically use only about a third of that capacity.
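SoftBank’s utilization claim implies how much headroom an AI-RAN could reclaim for inferencing. A minimal sketch of that arithmetic (the one-third figure comes from the article; treating all unused capacity as available for AI is an illustrative simplification):

```python
# Rough estimate of spare capacity on a peak-provisioned RAN,
# per SoftBank's claim that only about a third is typically in use.
avg_utilization = 1 / 3  # fraction of peak capacity carrying 5G traffic

spare_fraction = 1 - avg_utilization
print(f"Capacity potentially free for AI inferencing: {spare_fraction:.0%}")
```

In other words, roughly two-thirds of a peak-provisioned network sits idle most of the time, which is the capacity the AI-RAN concept aims to monetize.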
SoftBank says the Nvidia Aerial RAN Computer-1 systems, which it will incorporate into the environment, will use 40 percent less power than the infrastructure that runs traditional 5G networks. SoftBank’s Son said while on stage with Huang that “with this intelligence network that we densely connect each other, it will become one big neural brain for the infrastructure intelligence to Japan.”
For telecoms, AI-RAN opens up new revenue streams by letting them run those AI workloads on the same networks. Both Nvidia and SoftBank estimate that for every $1 of capital investment in new AI-RAN infrastructure, telecom companies earn about $5 back in AI inference revenue. SoftBank estimates that it will get a return of up to 291 percent for every AI-RAN server it adds.
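Taken at face value, the two quoted numbers can be cross-checked with simple arithmetic. The $5-per-$1 and 291 percent figures come from the article; the naive ROI formula below is my own illustrative reading, since neither Nvidia nor SoftBank has published the underlying model:

```python
# Cross-checking the quoted AI-RAN economics.
# Claim: $5 of AI inference revenue per $1 of capex.
revenue_per_dollar_capex = 5.0
capex = 1.0

# Naive ROI on that basis: (revenue - cost) / cost.
roi = (revenue_per_dollar_capex - capex) / capex
print(f"Implied ROI per $1 of capex: {roi:.0%}")
```

That naive calculation yields 400 percent, above the "up to 291 percent" SoftBank cites per server, so the per-server figure presumably nets out operating costs the companies have not broken down.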
Nvidia boasted of other partnerships, including those with cloud companies like GMO Internet Group, KDDI, and Sakura Internet, to build a national AI infrastructure network based on Nvidia hardware and software that will drive AI-fueled innovation in such sectors as robotics, health care, and drug research.
Some interesting developments here … and questions! I wonder if this supercomputer is going to be “the country’s most powerful” in both HPC and AI (besting Fugaku), or just AI. MS’ Azure Eagle did pass Fugaku’s HPC perf with Xeons and H100s, and one could surely build a machine with twice the performance of Alps’ GH200s to get there as well (or 5x that of Venado), but is SoftBank interested in HPC (beyond AI)?
AI-RAN is also quite puzzling, and SoftBank sure has been developing that concept for some time ( https://www.softbank.jp/en/corp/technology/research/news/052/?adid=mp ) … it’s conceptually interesting ( https://ai-ran.org/ ), but its utility is not completely clear (to me at least) … maybe we have to wait and see the benefits and drawbacks of that novel tech(?).
And maybe Nvidia or Softbank will talk about it at SC24 … starting Sunday (yippee!)!