The Great Danes Get A Supercomputer For AI And Maybe HPC

To AI or to not AI, that is not even a question in 2024. And to need sovereign AI is also not a question. Which is the main reason that we see country after country, and large multinational companies, investing in AI infrastructure in a way that we never did see with traditional HPC simulation and modeling.

This week, it is the Danes.

Back in March, the Danish government, the Novo Nordisk Foundation, and AI system juggernaut Nvidia said that they would be working together to build a supercomputer for the newly created Danish Center for AI Innovation, and today that machine, which is nicknamed Gefion, is operational.

Novo Nordisk Foundation is the non-profit parent company of Novo Nordisk, the world’s largest producer of insulin for the treatment of diabetes and a pharmaceutical giant involved in many aspects of healthcare, including weight loss through the use of Ozempic and Wegovy. The foundation had a net worth of $167 billion in 2023 and is also the largest charitable foundation on Earth, whose wealth is obviously fueled by the drug company, which had $33.71 billion in revenues and $12.15 billion in net income last year.

Gefion is the Norse goddess of the plow and harvest, and it is safe to say that Nvidia has certainly reaped what it has sown. Every country wants to have its own AI supercomputing center because the overwhelming consensus is that generative AI will touch all aspects of corporate, public, and private life.

And so, we can expect to see hundreds of countries want machinery at least on the scale of the Gefion machine, and also tens of thousands of other companies who will eventually want to assume control of their AI, albeit on a smaller scale, on top of the vast spending by the hyperscalers and cloud builders for their own AI applications and to rent out capacity to millions of companies who will add AI to their operations.

Any way you do this math, you have tens of companies with hundreds of thousands of GPUs, hundreds of organizations with thousands of GPUs, and tens of thousands of organizations with hundreds of GPUs. And they will add more to their GPU fleets as time goes by. It is many millions of GPUs, at several tens of thousands of dollars apiece. Which is how you get to a sustained revenue run rate in excess of $100 billion across the GPU makers. Some say it will be many times larger than this. We shall see.

By the way, DCAI is a new company and will be selling capacity on the machine to AI researchers and corporations. This is not a donation so much as an investment on the part of Novo Nordisk and EIFO in a new business opportunity. The chief executive officer tapped to run DCAI is Nadia Carlsten, who was a researcher in nanofabrication and DNA and RNA sequencing at UC Berkeley, a program manager and director at the US Department of Homeland Security, and the head of products for the center for quantum computing at Amazon Web Services. That is Carlsten in the middle, flanked by Nvidia co-founder and chief executive officer Jensen Huang on the left and King Frederik X of Denmark on the right in the feature image above.

As these things go, Gefion is not a particularly large AI supercomputer, but it certainly will pack a lot of mixed precision wallop and enough 64-bit and 32-bit floating point oomph to increase the footprint for AI and HPC in Denmark.

Gefion is composed of 191 Nvidia DGX H100 servers that have a total of 1,528 of Nvidia’s “Hopper” H100 GPU accelerators. The DGX H100s have a pair of Intel “Sapphire Rapids” Xeon SP processors each, and sport a 400 Gb/sec Quantum-2 InfiniBand port for each of the GPUs in the cluster and an array of InfiniBand switches to crosslink them all. This SuperPOD machine is built by Eviden, the HPC and systems portion of French IT conglomerate Atos, which is in the process of being spun out and which drives €5 billion (about $5.4 billion) in revenues a year. (We still can’t believe that spinout is going to happen.)

The Gefion machine has a peak performance of 51.2 petaflops on the FP64 vector units in the H100 device and 102.4 petaflops at FP64 on the tensor cores in the H100. Without sparsity processing turned on, the 1,528 GPUs in the system peak out at a collective 1.51 exaflops at FP16 precision; double that for FP8, the lowest precision supported by the H100 GPU. Double it again if your data is sparse, so that the GPU can skip the half of the values that are zeros and therefore get higher throughput.
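The aggregate figures above are simple multiplication, and a minimal sketch makes that explicit. The node and GPU counts come from the article; the per-GPU ratings in `H100_PEAK_TFLOPS` are assumptions (approximate H100 SXM peak numbers without sparsity, chosen because they reproduce the totals quoted here), and the function and variable names are purely illustrative.

```python
# Per-GPU peak ratings in teraflops. ASSUMED approximate H100 SXM values
# without sparsity; they are not stated in the article itself.
H100_PEAK_TFLOPS = {
    "fp64_vector": 33.5,
    "fp64_tensor": 67.0,
    "fp16_tensor": 989.4,
}

NODES = 191         # DGX H100 servers in Gefion (from the article)
GPUS_PER_NODE = 8   # 1,528 GPUs / 191 servers


def aggregate_petaflops(per_gpu_tflops: float, gpus: int) -> float:
    """Scale a per-GPU teraflops rating to a cluster total in petaflops."""
    return per_gpu_tflops * gpus / 1_000.0


gpus = NODES * GPUS_PER_NODE
peaks = {
    name: aggregate_petaflops(tflops, gpus)
    for name, tflops in H100_PEAK_TFLOPS.items()
}
# The article's rule of thumb: FP8 doubles FP16, and sparsity doubles it again.
peaks["fp8_tensor"] = 2 * peaks["fp16_tensor"]
peaks["fp8_tensor_sparse"] = 2 * peaks["fp8_tensor"]

for name, petaflops in peaks.items():
    print(f"{name:>18}: {petaflops:9.1f} PF")
```

These are theoretical peaks only; sustained throughput on real AI or HPC workloads will be lower.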

In terms of price, Novo Nordisk Foundation is putting up 600 million Danish kroner (one krone is worth about 14 US cents) and the Export and Investment Fund of Denmark (EIFO) is putting up another 100 million Danish kroner. If you add that up and multiply by the kroner to dollar exchange rate, it works out to $98 million. The machine will be hosted in a Digital Realty datacenter in Copenhagen, one that is fueled completely by renewable energy. These figures do not include operating costs as far as we know, but they probably do include systems software and support costs. We have to guess because AI and HPC centers are imprecise when they talk about costs.
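For readers who want to check the funding math, here is a small sketch. The two contributions, the roughly 14 cents per krone rate, the GPU count, and the 51.2 petaflops FP64 vector peak are the article's figures; the per-GPU and per-teraflops unit costs are derived here for illustration, under the assumption that the entire budget is spread evenly across the GPUs, and all names are hypothetical.

```python
# Funding figures as quoted in the article.
NOVO_NORDISK_FOUNDATION_DKK = 600e6
EIFO_DKK = 100e6
USD_PER_DKK = 0.14          # approximate exchange rate used in the article

GPUS = 1_528                # H100 accelerators in Gefion
FP64_VECTOR_PEAK_TF = 51.2 * 1_000   # 51.2 petaflops expressed in teraflops

total_dkk = NOVO_NORDISK_FOUNDATION_DKK + EIFO_DKK
total_usd = total_dkk * USD_PER_DKK

# Derived, all-in unit costs. ASSUMPTION: the whole budget (servers, network,
# software, support) is divided evenly across the GPUs.
usd_per_gpu = total_usd / GPUS
usd_per_fp64_teraflops = total_usd / FP64_VECTOR_PEAK_TF

print(f"Total budget:        {total_dkk / 1e6:,.0f} million DKK")
print(f"Total budget:        ${total_usd / 1e6:,.0f} million")
print(f"Cost per GPU:        ${usd_per_gpu:,.0f}")
print(f"Cost per FP64 TF:    ${usd_per_fp64_teraflops:,.0f}")
```

The unit costs are only as good as the exchange rate and the guess about what the budget covers, so treat them as rough comparisons rather than list prices.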

Here is where the rubber hits the road on machines using the same vintage of GPU or homegrown accelerators as the Gefion machine:

In some sense, the Gefion machine is very aggressively priced considering its relatively modest volume of GPUs. Some of that is because the DGX H100 is so 2022 and 2023, but pricing is on par with what we think it costs for larger AI systems. More modern gear would cost more, we think. We have the GH200 systems with Grace CPUs and goosed GPUs, and the first round of “Blackwell” GPUs ramping now as well. While any AI/HPC center is happy to get whatever Nvidia GPUs it can these days, Novo Nordisk and its DCAI startup would no doubt have preferred to get much more powerful Blackwell machinery.

So how does the Gefion system stack up against other resources in Denmark? Well, Denmark gets 3 percent of the resources available on the “Lumi” system at CSC Finland. Lumi is rated at 531.5 petaflops peak at FP64 precision, so that equates to having a 15.95 petaflops machine at 64-bit floating point. And through the EuroHPC consortium, Danish companies and researchers can also apply for access to some of the other big iron in Europe as well. The Danish e-Infrastructure Consortium, or DeiC for short, coordinates access to national supercomputers, offering what it calls “interactive,” “large memory,” and “throughput” machines – presumably installed at various universities in Denmark – but we cannot find any feeds and speeds on them. The DeiC appears to include Aalborg University, Aarhus University, Copenhagen Business School, Technical University of Denmark, IT University of Copenhagen, University of Copenhagen, Roskilde University, and University of Southern Denmark.
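The Lumi comparison in this passage boils down to a couple of lines of arithmetic. In the sketch below, the Lumi peak, Denmark's 3 percent share, and Gefion's 51.2 petaflops FP64 vector peak are the article's numbers; the ratio is derived here, and it assumes that a share of machine time translates directly into a proportional slice of peak flops.

```python
LUMI_PEAK_FP64_PF = 531.5        # Lumi peak at FP64, from the article
DENMARK_SHARE_OF_LUMI = 0.03     # Denmark's share of Lumi resources
GEFION_PEAK_FP64_VECTOR_PF = 51.2

# ASSUMPTION: a 3 percent share of resources equals 3 percent of peak flops.
denmark_lumi_pf = LUMI_PEAK_FP64_PF * DENMARK_SHARE_OF_LUMI
gefion_vs_lumi_share = GEFION_PEAK_FP64_VECTOR_PF / denmark_lumi_pf

print(f"Denmark's effective slice of Lumi: {denmark_lumi_pf:.2f} PF")
print(f"Gefion FP64 vector peak vs that slice: {gefion_vs_lumi_share:.1f}x")
```

On FP64 vector peak alone, Gefion works out to a bit more than three times Denmark's slice of Lumi.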

And presumably Novo Nordisk has had systems of its own over the decades it has been doing research, but we cannot find any evidence of them in the recent Top 500 supercomputer rankings. We also presume that Danske Bank (finance), Moller Maersk (shipping), DSV (logistics), Novonesis (biosciences), Coloplast (medical devices), and a slew of other Danish corporations have supercomputers, too.


3 Comments

  1. Nice! At 51 PF/s in FP64, this Danish Gefion fits in the top 25 machines of Top500 (for this past June at least)! (BTW there’s a slight typo in the Table in my estimation, where these 51 PF are written as 0.51 EF rather than 0.05 EF — which makes it look “surprising” that FP16 perf is then just 3x that 0.51 at 1.5 EF/s; the cost per FP64 TF may need adjusting too)

  2. Danish HPC and AI users are not that thrilled unfortunately. As a nonprofit fund Novo can transfer money to it without paying taxes, with the expectation that the foundation will benefit society broadly. Yet they were allowed to create a company for Gefion and it is expected that Novo itself will be the main customer and that the foundation will give money to life sciences to buy time at Gefion, effectively recouping the money without paying taxes, and even getting 100 MDKK of public funds in the process. Apart from that it is a big boost to Danish HPC capacity that unfortunately is an order of magnitude per capita below neighboring countries.
