Whether we call it HPC, AI, hyperscale, or something in between, fundamental changes are afoot for large organizations whose clusters are doing double (or quadruple) duty, running production workloads alongside newly integrated machine learning.
This integration happens differently depending on the problems being solved, but from a systems level, many of the concerns are the same.
This was part of what was driven home at The Next AI Platform event on May 9, where we brought together leading thinkers from different parts of the AI hardware stack to determine what specific challenges AI creates, what can be done to address them, and how (and where) the resulting bottlenecks might get pushed to other parts of the system.
Among the topics discussed along those lines was composability and what future systems might require.
“We think of it as advanced computing defined in terms of bits, qubits, and neurons in the context of an information architecture,” IBM’s VP of Exascale Computing, Dave Turek, told the audience during his live on-stage interview with Next Platform co-founder and co-editor, Timothy Prickett Morgan.
In the recording of the event above, the two unpacked the notion of composability and what it means as workloads grow more complex and cross between HPC, AI, and large-scale analytics.
“Composability comes every day; it is a combination of hardware and software,” Turek said. “Schedulers today are cognizant of where data is located and of energy consumption, and they make adjustments on the fly. There’s a way to begin decomposing a workflow and making assignments based on the location of the data, or invoking whatever hardware elements need to be brought to bear.”
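To make that idea concrete, here is a minimal sketch, in Python with entirely hypothetical names, of what assigning workflow steps based on the location of the data and the hardware elements to be brought to bear might look like. It is an illustration of the concept Turek describes, not IBM's scheduler or any actual product.

```python
# Illustrative sketch only: a toy scheduler that assigns workflow steps to
# resources based on where their input data lives. All names are hypothetical.
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    data_site: str              # site where the step's input data resides
    needs_accelerator: bool = False

@dataclass
class Resource:
    name: str
    site: str
    has_accelerator: bool = False

def assign(steps, resources):
    """Prefer a resource co-located with the data; fall back to any capable one."""
    plan = {}
    for step in steps:
        capable = [r for r in resources
                   if r.has_accelerator or not step.needs_accelerator]
        local = [r for r in capable if r.site == step.data_site]
        plan[step.name] = (local or capable)[0].name
    return plan

if __name__ == "__main__":
    steps = [Step("ingest", "datacenter-a"),
             Step("train", "datacenter-b", needs_accelerator=True),
             Step("simulate", "datacenter-a")]
    resources = [Resource("cpu-node-1", "datacenter-a"),
                 Resource("gpu-node-1", "datacenter-b", has_accelerator=True)]
    print(assign(steps, resources))
```

In this toy version, "ingest" and "simulate" stay next to their data in datacenter-a while "train" is pulled to the accelerator node; a real scheduler would also weigh energy consumption and adjust placements on the fly, as Turek notes.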
Turek says that composability is a journey that takes place over time. “When you take workflows to the extreme you need to make an accommodation for classical and quantum and infuse both with elements of AI as part of an information architecture.”
“Neurons is the AI piece, bits means classical HPC, and qubits is quantum, of course, but it all needs to sit on a sea of an information architecture. Note that I didn’t say data, I said information architecture, which means having the data infused with some sort of ontology so that we can operate on it and make good use out of it. Implicitly, this means we need to denominate work in terms of workflows, not in terms of algorithms, because workflows are more important to the enterprise than the execution of a particular algorithm. And by further implication, we want to have the ability to deliver the right set of components at the right time.”
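As a rough illustration of what "data infused with some sort of ontology" could enable, the sketch below (again with hypothetical names throughout) tags datasets with ontology labels and lets a planner choose which kind of engine, classical, quantum, or neural, to invoke for a given workflow step. It captures the spirit of the idea, not any actual IBM tooling.

```python
# Illustrative sketch only: datasets registered with lightweight ontology tags
# so a workflow planner can pick components ("bits, qubits, neurons") based on
# what the data represents, rather than hard-coding a single algorithm.
CATALOG = {
    "sensor_readings":  {"ontology": "time_series/physical", "format": "parquet"},
    "molecule_library": {"ontology": "chemistry/structures", "format": "sdf"},
}

# Hypothetical mapping from ontology class to the kind of engine a planner
# might bring to bear at a given step of a workflow.
ENGINE_FOR = {
    "time_series/physical": "classical_simulation",  # "bits"
    "chemistry/structures": "quantum_chemistry",      # "qubits"
    "unlabeled":            "neural_network",          # "neurons"
}

def plan_step(dataset_name):
    """Choose an engine for the step that consumes this dataset."""
    meta = CATALOG.get(dataset_name, {"ontology": "unlabeled"})
    return ENGINE_FOR.get(meta["ontology"], "neural_network")

print(plan_step("molecule_library"))  # -> quantum_chemistry
```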
Parts of Turek's vision stack together, but overall a change in thinking is necessary when we talk about large-scale systems. The same problem can be presented to different camps, and each will produce a different solution: the HPC community might tackle it with modeling and simulation, while the AI camp would go after it with neural networks. The goal should instead be to build balanced systems composed of the elements to do both, without forcing an upfront decision about which concrete approach to take.
“We have this debate about HPC versus AI, or servers versus storage, and so on. All of those boundaries are artificial to a certain extent and capable of being erased,” Turek explained.
Much more on this topic, beyond the video, can be found here in a deeper piece based on another conversation with Turek.