The way that organizations plan, design, and run a datacenter was already under pressure. AI has turned that pressure into a once-in-a-generation stress test, one that is inspiring a top-to-bottom rethink of what the datacenter does, how it does it, and where.
“What we’re seeing now is seismic,” says Dave Strong, director of advisory and professional services for UKIMEA at Hewlett Packard Enterprise. “This is the biggest change we have ever seen, holistically, in datacenters.”
For years, organizations have treated the datacenter as an asset to sweat. While an efficient, secure, performant datacenter estate was fundamental to the business, most were running relatively predictable workloads. Now, introduce dynamic and demanding AI workloads into the mix. AI-driven innovation in the business tests whether the datacenter infrastructure can deliver.
HPE’s answer to that challenge rests on three pillars: AI readiness, connectivity, and energy. To support those pillars, it has developed solutions that transform the datacenter for AI.
A Revenue-Generating Asset
The ability to run AI workloads can determine whether you launch a new digital product on time and successfully automate a business process – or get buried in data and fall behind more agile rivals. HPE therefore argues that a future-ready datacenter should be treated as a revenue-generating asset in which to invest, not a cost center to minimize.
That revenue impact might show up in several ways:
- Faster time-to-value: More efficient infrastructure, software, and service processes accelerate the organization’s productivity curve.
- Full-stack optimization: Software running more efficiently on optimal hardware can lead to rapid improvements in what AI can deliver to the business.
- New high-performance services: More potential for innovation in novel applications of AI.
- Energy and heat reuse: What used to be waste can offset costs.
“If customers don’t think about how they switch from treating the datacenter as a commodity to something that changes how they make money, they’re going to be stuck,” Strong warns.
A Full-Stack Design For AI Workloads
For organizations that want to run AI in their own datacenters, HPE offers the AI Factory. This is an end-to-end approach that treats the datacenter as a production system. In go power, data, and compute. Out come tokens in the form of insights, decisions, and new digital services.
AI factories demand a different engineering discipline. HPE’s AI Factory portfolio is built as a full stack: infrastructure, software, and services are designed together to power AI factories, in HPE’s terminology, “from edge to exascale.” Treating the datacenter as one coherent stack lets IT leaders optimize overall cost and risk, instead of firefighting at each layer independently.
Strong breaks that stack down into familiar layers:
- Infrastructure: Power, cooling, and the ability to stand up new capacity.
- Compute and storage: Achieving the optimal balance between traditional CPUs and accelerated GPU platforms for AI workloads.
- Software stacks: Delivering data quality, model development, and platform access.
- Operational tooling: Automation and AIOps that keep the system running with minimal human interaction.
“Hewlett Packard Enterprise is in a great situation because we do have an end-to-end capability,” Strong says. “We make datacenters, we have the compute and storage you associate with AI workloads, we have the software stacks for generating use cases and driving data quality, and the platform accessibility for organizations to get the outcome they’re trying to achieve.”
In Canada, TELUS has used HPE’s design concept to build the country’s first fully sovereign AI factory in Rimouski, Quebec, created to serve customers in highly regulated sectors including public services, healthcare, critical infrastructure, and financial services. The facility is 100 percent Canadian-controlled and is powered by 99 percent renewable energy, using infrastructure from HPE. Across the Atlantic, the Isambard-AI supercomputer at the Bristol Supercomputing Center is built on HPE’s ModPod architecture. HPE also provides the AI-factory infrastructure, software stack, and compliance-ready architecture that power services from the UK’s national AI initiative, Carbon3.ai.
Location, Power, And The Case For Modular Datacenters
Traditional datacenter planning starts with real estate: find a site, get planning permission, build a big facility, and fill it over time. AI workloads break that model.
Strong notes that while a conventional IT rack might draw six to eight kilowatts, AI racks on the latest roadmaps can push toward 600 kilowatts. That is a shift of nearly two orders of magnitude, and it turns location into an energy and grid problem, not just a space problem. The power profile alone can lead datacenter managers to tear up their existing development plans.
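To make the scale concrete, here is a hypothetical back-of-the-envelope sketch. The per-rack figures come from the article; the 5 MW site power budget is an assumption chosen purely for illustration:

```python
# Hypothetical sizing sketch: how far a fixed site power budget stretches.
# Per-rack figures are from the article; the 5 MW budget is an assumption.

site_budget_kw = 5_000      # assumed total power available at the site
conventional_rack_kw = 7    # midpoint of the 6-8 kW conventional range
ai_rack_kw = 600            # leading-edge AI rack on current roadmaps

# Integer division: how many whole racks fit within the budget.
print(site_budget_kw // conventional_rack_kw)  # 714 conventional racks
print(site_budget_kw // ai_rack_kw)            # 8 AI racks
```

The same site that once hosted hundreds of racks supports only a handful of AI racks, which is why grid capacity, not floor space, becomes the binding constraint.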
HPE has several answers to this problem. One is the AI ModPod: a small-footprint, high-density container that can be deployed quickly, close to suitable power sources (including renewable generation) or in otherwise convenient locations. In the UK and similar markets, Strong points out, the process of obtaining permits and building a classic datacenter can easily take 18 to 24 months. Prefabricated designs can cut that to six months or less, the timescale needed to remain at the forefront of AI innovation.
The modular concept means energy, location, and sovereignty can be addressed together. Businesses can place high-density AI clusters where there is clean power, keep latency-sensitive applications closer to end users or data sources, and scale out with additional modules rather than betting everything on one mega-site.
Those with existing data hall space must think about making it as efficient as possible. One approach here is to enable direct liquid cooling, which HPE can now do across the full rack.
Energy-First Design And Heat Reuse
AI factories are power-hungry by design. Energy strategy – always important – becomes a first-order architectural decision. Strong frames this decision in two stages.
“If we’re talking huge amounts of power, the first thing you’ve got to consider is where you’re going to get that power from,” he says. “The second is what you’re going to do with the heat that comes off those platforms – and how you’re going to reuse it.”
HPE’s partnership with Danfoss is aimed squarely at that second question. The companies are combining HPE’s modular datacenters with Danfoss heat reuse technology to cut datacenter energy consumption and route excess heat into local heating systems. HPE’s modular facilities use direct liquid cooling to reduce overall energy consumption by 20 percent, while Danfoss heat reuse modules can capture that “waste” thermal energy and feed it into district heating networks or industrial applications.
This means an AI factory can lower its Power Usage Effectiveness (PUE) by improving cooling efficiency (HPE’s modular datacenters achieve a PUE of 1.1), improve its Energy Reuse Factor (ERF) by exporting heat into nearby buildings or process-heat applications, and support local sustainability goals. Potentially, it could even generate revenue from heat-offtake agreements.
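Both metrics are simple ratios, and a rough sketch shows how they interact. The energy figures below are hypothetical; only the 1.1 PUE target for HPE’s modular designs comes from the article:

```python
# Illustrative only: hypothetical energy figures for a modular AI facility.
# PUE = total facility energy / IT equipment energy (1.0 is the ideal floor).
# ERF = energy reused outside the facility / total facility energy.

def pue(total_facility_kwh: float, it_kwh: float) -> float:
    return total_facility_kwh / it_kwh

def erf(reused_kwh: float, total_facility_kwh: float) -> float:
    return reused_kwh / total_facility_kwh

it_load = 1_000.0           # kWh consumed by servers, storage, and network
overhead = 100.0            # kWh for cooling, power distribution, lighting
total = it_load + overhead  # 1,100 kWh of total facility energy
heat_exported = 330.0       # assumed kWh of captured heat sold to a district network

print(f"PUE: {pue(total, it_load):.2f}")       # 1.10, the modular design target
print(f"ERF: {erf(heat_exported, total):.2f}") # 0.30
```

Lowering overhead drives PUE toward 1.0, while every kilowatt-hour of heat exported raises ERF, which is how cooling efficiency and heat reuse become two separate levers on the same energy bill.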
Networking As A Circulatory System
AI workloads involve moving large amounts of data between edge locations, training clusters, and downstream applications, under tight latency constraints for near-real-time applications. The most elegant AI factory design will still fail if the network becomes a bottleneck. Secure, AI-ready datacenter networks simplify and automate fabrics, using AIOps and intent-based networking to maintain performance at scale. This means high-speed interconnects within the datacenter for GPU-rich clusters, intelligent routing between edge sites and central resources, and security controls that assume sensitive AI workloads will be distributed across multiple locations.
Strong argues that the trick is not just to throw bandwidth at the problem, but to balance which decisions are made at the edge and which need to move into central AI clusters. For example, in a project with a customer that operates satellites, pushing all data first to a relay station and then on to the datacenter wasn’t viable, so HPE created an architecture that processes data close to the relay stations and moves only what’s needed.
Automating Operations With AI
You can’t operate an AI factory with yesterday’s runbook. The complexity and speed of change make traditional, manual operations unsustainable. Strong is blunt about the goal: day-to-day operations should be “as light touch from a human engineering perspective as possible.”
That demands observability and AIOps platforms that continuously monitor applications, networks, and hardware, then take decisions automatically. They detect when an application isn’t behaving as expected, predict hardware failures before they occur, and move workloads to another node when something looks likely to fail.
“Fundamentally, it’s about proactive maintenance,” Strong says. “We want those engineers to go and do highly intelligent things, creating AI use cases and getting organizations in the best shape to consume AI in the future, rather than spending their time firefighting infrastructure.”
Sponsored by HPE.
Tim Phillips writes about business, economics, and development, and is a researcher at the Institute for New Economic Thinking.