When Immersion Cooling is the Only Option

With only one, perhaps two, percent share of the broader datacenter cooling market, immersion and direct liquid cooling are still fringe technologies. The reasons for that are as nuanced as they are numerous, especially for immersion.

While immersion has been around for a number of years, its reputation for maintenance headaches, leaks, and spills has outpaced word of its benefits, which include almost 100 percent thermal efficiency. Those drawbacks are compounded by concerns over floor space and facilities integration (supporting the weight of the immersion tubs, for instance) and over how introducing liquids might change support contracts with system vendors.

But a day will come, quite possibly this decade, when it might be the only option – at least for high-density servers in datacenters supporting supercomputing applications or AI training, among other demanding workloads. A few HPC centers have gone the immersion route but again, these are the exceptions.

Lucas Beran, Principal Analyst for Dell’Oro’s datacenter infrastructure group, says right now less than one percent of the world’s servers are cooled by direct liquid or immersion cooling. “When I talk to engineers, they say they love the idea of immersion and liquid. But operationally, the barrier is the human element. Datacenter owners and operations don’t want to add liquids and oils to the datacenter environment. They are worried about mess and destroyed equipment.”

While the “human element” is keeping adoption of immersion technology at bay, Beran says the resistance cannot continue for much longer. “I don’t think we’ll get to broader immersion until we need to get there but we’re starting to be there. Densities are creeping up quickly and we’re fast approaching the tipping point.”

The movement to immersion will begin in environments with extreme densities. “Direct cooling a CPU or GPU direct to chip is not 100 percent heat capture. If you have a rack that generates 40kW of heat, you’ll still have around 4–8kW of heat that will escape, which means you’ll need hybrid cooling via attaching a rear door unit or air handler or some other form of air cooling to deal with that heat. In the future, with extreme rack densities in HPC up to 200kW if you’re only capturing 80 percent of that heat, there is still plenty to cool,” Beran says.
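To make the arithmetic in Beran's example concrete, here is a minimal sketch in Python. The function name and the specific capture fractions are illustrative assumptions; the only inputs taken from the quote are the 40kW and 200kW rack figures and the roughly 80 to 90 percent heat capture they imply.

```python
def residual_heat_kw(rack_kw: float, capture_fraction: float) -> float:
    """Return the heat (in kW) that liquid cooling does not capture.

    rack_kw: total heat generated by the rack.
    capture_fraction: share of heat removed by direct liquid cooling (0 to 1).
    Both names are hypothetical, chosen for this illustration only.
    """
    if not 0.0 <= capture_fraction <= 1.0:
        raise ValueError("capture_fraction must be between 0 and 1")
    return rack_kw * (1.0 - capture_fraction)


# Scenarios drawn from the quote: a 40kW rack at 90 and 80 percent capture,
# and a future 200kW HPC rack at 80 percent capture.
for rack_kw, capture in [(40, 0.90), (40, 0.80), (200, 0.80)]:
    leftover = residual_heat_kw(rack_kw, capture)
    print(f"{rack_kw}kW rack at {capture:.0%} capture: "
          f"about {leftover:.0f}kW left for air cooling")
```

At 80 percent capture, a 200kW rack leaves roughly 40kW for air handling, which is as much as the entire 40kW rack in the first example generates. That is the sense in which there is "still plenty to cool."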

In other words, the time to start at least looking at what immersion cooling might require is now. Although implementation might be years away, architecting facilities and even support contracts with systems vendors ahead of time is critical. Air cooling might not be going away anytime soon but it will not be enough in some cases.

“We are really just at the beginning of a decade-long transition from air-based cooling to liquid-based,” Beran argues. “Perimeter is legacy technology, rack and row-level or rear-door exchanges, especially for high-density section or hotspots, are an intermediate piece in between that final frontier of thermal management, which is immersion cooling.”

If we are facing a future of immersion cooling, who are the vendors to watch now, how are they different, and how much innovation room is there for startups to tackle market share? And could all the standalone company momentum now (as small as it is) be upended if an HPE or Dell, for instance, decided to integrate it into its offerings for some of those highest-density environments?

Beran says Green Revolution Cooling (GRC), which is installed at the TACC supercomputing site, for instance, is the leader now in immersion. Asperitas and Submer are honorable mentions, with still others, including Iceotope, garnering some mindshare. “There are other startups in the space with a couple proofs of concept now and the market could change rapidly, but I’m confident in GRC and Asperitas. To dethrone either of those would be difficult, although it’s still early.”

Differentiation for these and future companies now centers around taking a single-phase or two-phase approach to immersion. In single-phase, liquid goes into the tank, captures heat, then gets pumped through a heat exchanger before going back in. For two-phase, once the liquid in the tank hits a certain temperature, it vaporizes, rises, is redirected for cooling elsewhere, then condenses back into a liquid. The latter has a higher CAPEX and while it provides a cooling improvement, so far that’s “marginal” according to Beran.

The other area of differentiation is in the engineered fluid that sits in the tank. Asperitas is working with Shell to refine its medium, and 3M is working on its own immersion fluid. But right now, it is hard to say how big a difference improvements in the fluid will make. In short, there may be room for startups to differentiate technology-wise, but immersion is already a tough sell: an incremental improvement on single-phase might not resonate, and a two-phase approach might add even more complications and up-front cost. So far, by the way, the leading immersion companies mentioned are all single-phase. Also, even though it is vendor material, GRC has a nice explainer about the differences between single- and two-phase.

Perhaps the only differentiation that could make practical sense at this early stage is if immersion became a core offering, with full systems-level support, from the major OEMs that supply the bulk of systems for high-density datacenters in HPC or AI training. HPE/Cray, for instance, could offer an immersion option with a guarantee covering the whole system: servers, storage, networks, and immersion tanks. Right now, a concern is that introducing immersion or direct liquid could invalidate a support contract. If a major OEM bought a company like GRC and offered a specific product line cooled that way for these use cases, it would be a different set of considerations.

All of this leads back to the questions at the front of datacenter operators’ minds: What mess and danger to gear do these tanks represent? What about safety hazards such as slips? What about ongoing maintenance? There is a chicken-and-egg scenario here. More centers will have to adopt immersion cooling and share their challenges openly so others can gauge risk. But no one wants to go first.

Other than those most talked-about reasons, there are other practical concerns that keep immersion at bay, including on the facilities side. These tanks are not small – they take up a fair amount of datacenter floor space and do require some engineering to support. In other words, it takes planning to implement. And if everyone is waiting for someone else to go first, the whole rollout of more immersion examples is further delayed.

And speaking of delays, even though immersion might be the only way to cool high-performance hardware in the next decade, its growth has been further hindered by the pandemic. Beran says that it’s a “high-touch” purchase from the beginning, one that requires boots on the ground along the way. Even barring any further shutdowns, 2020–2021 could have pushed immersion’s entry into more mainstream cooling farther back still.

The last question is whether there is opportunity for any startup hoping to secure early footing in what looks to be one of the only options for near-100 percent heat capture. The answer depends on who gets acquired by a major OEM with reasonable grounding in HPC/AI server markets (HPE, Dell, Lenovo and to be polite, perhaps IBM). Internationally, companies like Fujitsu – which pioneered immersion cooling for supercomputing in particular – have already invested, although we still do not see a lot of large publicly-listed systems that feature immersion.

The OEMs have struck up partnerships with GRC in particular (Dell and HPE; the latter also has a partnership with Iceotope), but what big centers want is integrated support for systems with this unique and particularly risky technology. It is not out of the question that one of these vendors will be bought, forcing a new competitive landscape in a game that’s still far too early to call.

2 Comments

  1. I have several years of recent “hands-in” experience with Green Revolution Cooling vats, used in an older data center to augment the original air cooled data halls…Yes, it’s a bit “icky” to deal with during server repairs (but I don’t think the mineral oil based fluid is dangerous)…However I do question two things: 1) The best way to handle switch placement and network cabling (so as to avoid “wicking” oil out of the vats), 2) What server recycler is going to want to take gear covered in oil, when said servers age out?

  2. I wish to add my commentary in response to this article “When Immersion Cooling is the Only Option” by Nicole Hemsoth.

    • Direct liquid cooling (DLC) is not a fringe technology
    • Dielectric Direct Liquid Cooling (DDLC) is 100% IT compatible, unlike immersion that needs to replace components and have some of the components stick out of the immersion liquid
    • Immersion cooling may have high thermal efficiency up to ~100 kVA racks, while DLC scales beyond this limit
    • Immersion cannot solve the lower T-Case temperature of next-generation processors
    • Two-phase immersion has a higher CAPEX when compared to single-phase immersion, but immersion is significantly higher in CAPEX and OPEX when compared to two-phase evaporative solutions
    • Floor loading constraints are limiting many retrofit data centers, with immersion being the heaviest solution on the market
    • This link only compares immersion single and two-phase solutions and is missing DLC comparison, which would change the results significantly
    • Two-phase evaporative cooling for DLC, combined with a rear-door capture system, can capture even more heat than immersion

    We face a future of higher TDP and lower T-Case temperatures that only well-engineered, waterless, dielectric two-phase solutions like ZutaCore can solve. Being provocative, I would have titled the article “When Immersion Cooling is Never an Option.”

    Industry, using sound science, needs to understand and report on the following wrt immersion cooling:
    1. Single-phase immersion is limited in TDP and will not be able to cool the higher TDP CPUs coming from Intel Corporation and AMD
    2. Two-phase immersion is creating cavitation damage to the IT equipment
