The Next Platform

Fault Tolerance

HPC

Who Shoulders the Supercomputing Resiliency Burden?

January 11, 2021 Nicole Hemsoth Prickett

While the related topics of fault tolerance and resiliency do not garner the same attention as performance and efficiency, being able to recover from and work around failures is more important than ever, especially as applications take over ever-larger and increasingly heterogeneous machines. …

About

The Next Platform is published by Stackhouse Publishing Inc in partnership with the UK’s top technology publication, The Register.

It offers in-depth coverage of high-end computing at large enterprises, supercomputing centers, hyperscale data centers, and public clouds.


All Content Copyright The Next Platform