The Next Platform

DMTCP

Store

Memory Snapshots Bring Checkpointing Into The 21st Century

December 9, 2021 Timothy Prickett Morgan

When you have a massively distributed computing job that can take months to run across thousands to hundreds of thousands of compute elements, a single hardware or software crash can mean losing an enormous amount of work. …

About

The Next Platform is published by Stackhouse Publishing Inc in partnership with the UK’s top technology publication, The Register.

It offers in-depth coverage of high-end computing at large enterprises, supercomputing centers, hyperscale data centers, and public clouds. Read more…

All Content Copyright The Next Platform