Dell’s Omnia HPC Software Play

Almost three years ago, we wrote about Dell Technologies’ efforts to reassert itself into the HPC and supercomputing arena in a big way. The company had a history there: its PowerEdge servers served as the foundation for such supercomputers as Stampede and, more recently, Frontera at the Texas Advanced Computing Center (TACC), and its PowerEdge-C systems were designed for workloads such as HPC, deep learning and data analytics, offering features like NVMe drives, high-speed memory, automation capabilities and liquid cooling.

At the SC18 supercomputing show, Dell spoke about leveraging the latest GPU accelerators from Nvidia (the Tesla T4) in several PowerEdge systems, expanding the field-programmable gate arrays (FPGAs) available in its systems, and building out a hardware portfolio that included both Intel- and AMD-powered servers.

That said, even with all the talk of servers and chips, in 2018 Dell – like rivals Hewlett Packard Enterprise, Lenovo, IBM and Cisco Systems – was already well along in morphing into essentially a software company that also sells hardware, a transition that is ongoing. The rise of the cloud computing model, the exponential growth of data (up to 175 zettabytes by 2025, according to IDC), advanced workloads like data analytics and artificial intelligence (AI), and a growing desire by enterprises to reduce the burden of managing their own datacenter systems have fueled a shift that has seen Dell and other hardware OEMs put greater emphasis on automation, analytics, data management and security capabilities that can only be delivered through software.

Much of the focus now is on offering entire portfolios as a service (in Dell’s case, through its Apex initiative), and vendors like IBM are making aggressive pushes into the rapidly expanding hybrid cloud and AI spaces through acquisitions of software companies such as Red Hat.

This transition has been underway for about a decade at Dell, where 85 percent of its engineers are now software engineers, Matt Baker, senior vice president of strategy and planning at Dell EMC, tells The Next Platform.

“Everything at this point – the value of our products, offerings, solutions, etc. – is largely driven by software automation and software innovation in general,” Baker says. “That’s something people have come to expect. It didn’t necessarily take a lot of concerted effort because it’s where our customers drew us. They pulled us in this direction because people got tired of having to run the machine instead of having a machine run itself. We’ve spent the last decade-plus building smarter and smarter systems that are largely smarter because of the software innovation.”

It’s not unlike what is happening in other industries, he says, noting that Tesla can be seen as both a car maker and a software company. The same can be said of agriculture-equipment manufacturer John Deere and its work to bring greater automation capabilities to its machines. IT is no different.

“This notion of software vs. hardware companies has become a false dichotomy,” Baker says. “It’s an antiquated way of considering companies. People hate hearing the words ‘digital transformation’ because we use it so much, but at the end of the day, what is digital transformation? It’s really the embodiment of business processes into software and automating it. It’s everyone and everywhere. Certainly, it’s been our focus in terms of innovation for quite some time. The bending of the sheet metal and the printing of the circuit board, it’s all really important and we’re really good at that. But customer experience is almost exclusively driven by a software experience.”

That continuing shift toward software can be seen in the latest moves by Dell in the expanding HPC space. The company this week unveiled Omnia, an open-source software platform designed to help organizations more easily manage their HPC, AI and data analytics workloads through greater automation of provisioning and management tasks. The platform essentially creates a pool of resources that can be drawn upon to address workload needs.

Dell has had the Omnia software stack available on GitHub since March. It’s an open-source set of Ansible playbooks developed by Dell’s HPC and AI Innovation Lab, in partnership with Intel and the HPC community, Caitlin Gordon, vice president of product management at Dell, said during a recent webcast with journalists. The stack manages systems running Slurm or Kubernetes, automating everything from cluster management and applications to accelerators and frameworks via open-source software.
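Because Omnia is packaged as Ansible playbooks, kicking off a deployment follows the familiar Ansible pattern. The sketch below is purely illustrative: it uses the real ansible-runner Python library to drive a playbook run, but the playbook name, inventory file and "scheduler" variable shown here are assumptions for the sake of the example, not confirmed Omnia interfaces (the project’s GitHub repository documents the actual entry points).

```python
# Illustrative sketch of driving an Ansible-based stack such as Omnia
# from Python via the ansible-runner library (pip install ansible-runner).
# The playbook name ("omnia.yml"), the inventory file, and the
# "scheduler" variable are hypothetical placeholders, not confirmed
# Omnia interfaces.
import ansible_runner

result = ansible_runner.run(
    private_data_dir="/tmp/omnia-demo",  # scratch dir ansible-runner uses for artifacts
    playbook="omnia.yml",                # hypothetical top-level playbook
    inventory="cluster-nodes.ini",       # hosts to provision (hypothetical file)
    extravars={"scheduler": "slurm"},    # assumed knob: Slurm vs. Kubernetes
)

print(f"status: {result.status}, return code: {result.rc}")
```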

“It’s all about simplifying and easing the deployment [and] the management of high-performance clusters for not just HPC, but AI and analytics workloads as well,” Gordon said. “When we look at Omnia, it’s really designed to help our community move faster, move more flexibly, but also be able to continuously develop and evolve the solutions stack.”

The goal is to save time by speeding up deployment and leveraging resources that can be reconfigured, she said. Omnia will “put the right software on each server based on your use case. Whether it’s HPC simulations [or] neural networks for AI, you’ll be able to reduce your time of deployment on an automated vs. a manual process from weeks down to hours. It’s all about flexibility. One of the coolest things about Omnia is that you can compose and recompose the stack based on what you need, really taking an infrastructure-as-code approach. … It really will allow you to support multiple users [and] multiple workload types and compose and recompose that resource pool. That really uses these simplified, repeatable workflows that can enable you to build scale and to manage complex environments based on these component modules, profiles and roles that can really run anywhere.”
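To make the compose/recompose idea concrete, here is a toy Python sketch of the pattern Gordon describes: a workload profile selects the software roles applied to each node, and the same pool of servers can be re-provisioned under a different profile. None of these names come from Omnia itself; the profiles, roles and function are invented for illustration.

```python
# Toy illustration of the infrastructure-as-code idea described above:
# a workload profile selects the software roles applied to each node,
# and the same node pool can be recomposed under a different profile.
# All names here are invented for illustration; they are not Omnia's.

PROFILES = {
    "hpc_simulation": ["base_os", "mpi", "slurm_client"],
    "ai_training": ["base_os", "gpu_driver", "kubernetes_worker"],
    "analytics": ["base_os", "spark_worker"],
}

def compose(nodes: list[str], profile: str) -> dict[str, list[str]]:
    """Assign the roles for `profile` to every node in the pool."""
    roles = PROFILES[profile]
    return {node: roles for node in nodes}

pool = ["node01", "node02", "node03"]
cluster = compose(pool, "hpc_simulation")  # deploy as an HPC cluster...
cluster = compose(pool, "ai_training")     # ...then recompose for AI
print(cluster["node01"])  # ['base_os', 'gpu_driver', 'kubernetes_worker']
```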

At the same time, Dell is expanding its HPC on demand offering with VMware support, via its partnership with analytics technology vendor R Systems, and – in a nod to hardware – is offering Nvidia’s A30 and A10 Tensor Core GPUs as accelerator options for the PowerEdge R750, R750xa and R7525 servers. That said, the announcement is primarily about software, Baker says.

“We have things like PowerProtect, our data protection portfolio, and it’s all software,” he says. “A lot of what we’re doing in terms of data analytics innovation, like our Streaming Data Platform, is pure software. A lot of it is focused on data management and helping our customers extract value from data, or on automation of infrastructure, which is what the announcement is largely focused on. The focus of innovation for the last decade, and going into the next, is really a lot on automation. The hottest companies in the world right now are the RPA [robotic process automation] vendors, whose software is designed to drive automation of business processes. The world is quickly being automated. That includes IT systems. That includes the automobile. That includes all sorts of stuff. And at the heart of it is software.”

The shift to system management and automation could first be seen in the storage space, where many midrange systems with the various dials and knobs used to tune them gave way to more automated, self-tuning and self-driving appliances, Baker says. A similar transition is happening in HPC, which for decades was primarily the domain of academic and government labs.

“Increasingly, businesses in general – many sectors – have become so data-fueled that it requires either simulation or analysis and thus you end up in a world of a fairly robust HPC-like or heavy data analytics environment,” he says. “Pharma’s there, even agriculture is there in terms of driving crop yields. A lot of it requires significant horsepower and the government labs have folks who have long built these complex systems. Others have not and therefore, they rely on people like Dell to create the automation that allows them to build and operate these complex systems in a much more simplified fashion.”

For Dell, making the transition was not a sudden realization that software was now driving the IT world. Instead, it was about addressing enterprise demand for automation in management, provisioning, deployment and other areas. A key challenge for the company was one that organizations in general are facing: the need to move away from traditional approaches and embrace new ones. In Dell’s case, that meant shifting development from a typical waterfall methodology to an agile, DevOps one.

The company has had to adopt a continuous integration/continuous delivery (CI/CD) approach that greatly accelerates the development process. Baker points to Dell EMC’s PowerStore all-flash system as an example, describing it as a “modern, container-based platform that enables us to significantly increase our innovation rate on the storage array itself. Increasingly, we’re moving from software releases that would come out every 12 to 18 months to a much faster pace.”

Dell’s efforts are still a work in progress.

“In terms of our internal development capabilities, we’re not at a fully enlightened state, but we feel like we’ve made the transition,” he says. “What we’re doing right now is really a lot of refinement … and efficiency. One of the things that we learned going through this process of shifting towards the more DevOps-y [methodology] was how to drive up code reuse. Now we have more common platforms. That should sound familiar to anyone talking about developers today, because the number one thing that companies are driving is greater and greater developer efficiency. We’re through the knothole in terms of the shift towards more modern development techniques; we’re in the refining and optimization side of the process.”
