Quantum computers will remain in the “emerging technologies” category for some time, at least in terms of their ability to handle a meaningful range of real-world applications. That lack of readiness is unfortunate given the massive computational efficiencies quantum systems promise.
Even at this early stage in the quantum game, one workload of special interest for quantum computers has been AI training, a compute-intensive task that, at scale, often requires nodes laden with multiple GPUs to handle training sets with many thousands or even millions of examples.
Quantum machine learning is a broad, differentiated field, but for future AI, one area of particular interest is using a quantum annealing system like the D-Wave machine to train Boltzmann machines and other neural network models. The work is promising in its efficiency compared to classical approaches, but as one might imagine, simply mapping the problem onto the quantum annealing device is a challenge. The D-Wave 2X at NASA Ames has been used to train Boltzmann machines and neural networks, and similar work has been done to generate and train on datasets of handwritten characters. Compelling examples are still scarce, particularly with Boltzmann machines, but that might be changing.
To get a better understanding of current research trends in quantum machine learning, we spoke with University of Michigan researcher Dr. Veera Sundararaghavan about where quantum computing and machine learning intersect, at least for the potentially high-value Boltzmann machine approaches. His research area is aerospace engineering, with an emphasis on computational methods, most notably machine learning and quantum algorithms, that could transform materials science.
TNP: We are qubit- and connectivity-limited in quantum for the near future, but what do you see on the horizon in terms of increased capability for quantum computers to handle AI training?
Sundararaghavan: We are qubit-limited, so in the near term quantum-classical hybrid algorithms are critical. These algorithms run mostly on traditional computers and switch to the quantum computer at bottlenecks where it can outperform classical hardware. In our paper, we take advantage of quantum annealers to speed up sampling from Boltzmann machines, which is classically hard. The classical side takes care of tasks such as optimizing the quantum hardware parameters. The goal of these algorithms is to identify low-energy states, which is a hard problem for classical computers when the state contains a large number of variables, since the number of possible states grows exponentially (2^N).
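To see what that exponential growth means in practice, here is a minimal illustrative Python sketch (ours, not from the paper) that enumerates every configuration of a tiny Boltzmann machine to compute its distribution exactly; the 2^N loop is the part that becomes intractable classically and that a quantum annealer is asked to sample instead.

```python
import itertools
import numpy as np

def boltzmann_distribution(h, J, beta=1.0):
    """Brute-force Boltzmann distribution over N binary units.

    Energy: E(s) = sum_i h[i]*s_i + sum_{i<j} J[i,j]*s_i*s_j, with s_i in {0, 1}.
    Enumerates all 2**N states, which is only feasible for tiny N --
    this exponential blow-up is the classical bottleneck.
    """
    n = len(h)
    states = np.array(list(itertools.product([0, 1], repeat=n)))
    energies = states @ h + np.einsum('si,ij,sj->s', states, np.triu(J, 1), states)
    weights = np.exp(-beta * energies)
    return states, weights / weights.sum()

# Toy example: 4 units with random biases and couplings (hypothetical values).
rng = np.random.default_rng(0)
h = rng.normal(size=4)
J = rng.normal(size=(4, 4))
states, probs = boltzmann_distribution(h, J)
print("most probable (lowest-energy) state:", states[np.argmax(probs)])
```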
TNP: What kind of training workloads will be best suited to quantum systems (we’ll discuss annealing vs. gate model in a moment)?
Sundararaghavan: Early machine learning algorithms on quantum computers will take advantage of the ability of quantum computers to sample complex probability distributions. This sampling step is essential in generative model development, and classical algorithms will either fail or take a very long time when the probability distributions to be learned are complex and multi-modal.
A typical quantum training algorithm will devise a probabilistic model (defined by a Hamiltonian) that, when repeatedly sampled, returns ‘correct’ results more often than ‘incorrect’ ones. In the limited-qubit scenario, early use cases of such quantum algorithms will rely on some form of data reduction to cut the number of input variables down to the small number of available qubits. For example, they will work on a small set of features of the inputs (e.g., latent variables) to make decisions. These features can be codified using classical neural nets such as autoencoders.
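As one way to picture that reduction step, the hedged PyTorch sketch below (our illustration; the layer sizes and 8-feature latent code are arbitrary assumptions) compresses a 784-pixel image into a few latent features that could then be binarized and mapped onto a small number of qubits.

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Compress 784 input pixels to a small latent code that could fit on a few qubits."""
    def __init__(self, n_latent=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(784, 128), nn.ReLU(),
            nn.Linear(128, n_latent), nn.Sigmoid(),  # latent features in [0, 1]
        )
        self.decoder = nn.Sequential(
            nn.Linear(n_latent, 128), nn.ReLU(),
            nn.Linear(128, 784), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

model = Autoencoder(n_latent=8)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
images = torch.rand(32, 784)          # stand-in batch; real data would be e.g. MNIST
recon, latent = model(images)
loss = nn.functional.mse_loss(recon, images)
loss.backward()
optimizer.step()
# Rounding the latent code yields 8 binary features that a small quantum model could ingest.
qubit_inputs = (latent > 0.5).int()
```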
TNP: What are some real-world environments/use cases for discriminative tasks involving Boltzmann machines?
Sundararaghavan: Real-world use cases for discriminative tasks will involve problems where a rapid classification needs to be made, typically in military applications, which is why the defense industry is interested in quantum computers. As a simple example, if an image from a battlefield needs to be classified as a ‘threat’ or a ‘friend’, the model takes in pixelated inputs and returns a probability of the image being a threat or a friend. The probability is defined over a large number of states (in this case, the number of pixels in the image). This is a typical classification problem, although it is treated in a probabilistic sense by the quantum computer. A model is developed via training (in other words, by identifying the quantum circuit parameters) on a known set of images. The training starts with a randomized set of circuit parameters, and the training algorithm changes those parameters until the correct output is realized on the known images. The modification of circuit parameters is done by a classical (GPU-based) algorithm. This is termed a ‘discriminative task’ in supervised learning.
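To illustrate the inference side of that pipeline (the classification call, not the training loop described later in the interview), here is a speculative numpy sketch of how a tiny energy-based model could turn binarized pixel features into a threat-versus-friend probability by marginalizing over a single output unit; the weights and sizes are made up for illustration.

```python
import numpy as np

def classify(pixels, W, b, beta=1.0):
    """P(label | pixels) for a tiny Boltzmann-style model with one output unit.

    pixels: binary visible units (clamped to the image features).
    W: couplings between each visible unit and the output unit.
    b: bias on the output unit.
    The output unit y in {0 ('friend'), 1 ('threat')} is marginalized exactly.
    """
    energies = np.array([-(y * (pixels @ W + b)) for y in (0, 1)])
    weights = np.exp(-beta * energies)
    p_friend, p_threat = weights / weights.sum()
    return p_threat

rng = np.random.default_rng(1)
pixels = rng.integers(0, 2, size=16)      # 16 binarized pixel features (hypothetical)
W = rng.normal(size=16)                   # would come from training in practice
b = 0.0
print("P(threat) =", classify(pixels, W, b))
```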
Other military applications include mission planning under uncertainty, where quantum-enabled Bayesian inference can improve multi-vehicle adversary detection. There is also interest in using quantum algorithms for cybersecurity threat detection based on feature sets including threat and data origin, characteristics, and observed dynamics.
TNP: Are the use cases for generative models much different, at least on quantum systems?
Sundararaghavan: Generative modeling allows the discovery of knowledge and is a strength of quantum computing algorithms. Generative modeling entails the development of probabilistic models of higher complexity than discriminative models. These models capture the probability of the training data itself and thus can generate new datasets on their own.
Striking examples are the deepfake videos and fake faces we increasingly see. Early uses of generative models on quantum computers will be in materials design (chemistry) problems where it is of interest to learn wavefunctions; here quantum samplers have an advantage in handling the exponential growth of states with the number of electrons that I explained before.
Discovering knowledge from high-dimensional sensor data will be of interest in military control and autonomy tasks, including path planning and environment perception in unmanned air vehicles. An example is the physics-assisted generation of possible drone flight trajectories using noisy sensor data. We are currently working on quantum-assisted deep learning for parameter estimation in differential equations, which is related to this example.
TNP: This works for annealing-based quantum systems, but what about gate-model quantum computers?
Sundararaghavan: The gate model provides universal computation, so any problem that can be solved on annealers can be embedded in a gate model too. If one were to represent Boltzmann machines in the gate model, the quantum approximate optimization algorithm (QAOA) could be used to train circuit parameters. The strength of the annealer model is the availability of a large number of qubits, so input data can be better represented. The gate model also has limitations concerning coherence and circuit depth. I believe these issues will eventually be overcome.
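One concrete piece of that embedding, whichever hardware is targeted, is converting a Boltzmann machine’s energy over {0,1} variables (a QUBO) into the {-1,+1} Ising form used by annealer hardware and by QAOA cost Hamiltonians. The sketch below (an illustration for this article, not code from the paper) performs that conversion in plain numpy via the substitution x_i = (1 + s_i)/2.

```python
import numpy as np

def qubo_to_ising(Q):
    """Convert E(x) = x^T Q x with x_i in {0,1} (upper-triangular Q)
    into E(s) = sum_i h[i] s_i + sum_{i<j} J[i,j] s_i s_j + offset, s_i in {-1,+1},
    via the substitution x_i = (1 + s_i) / 2."""
    Q = np.triu(Q)                      # diagonal = linear terms, upper = couplings
    lin = np.diag(Q).copy()
    quad = Q - np.diag(lin)
    J = quad / 4.0
    h = lin / 2.0 + (quad.sum(axis=0) + quad.sum(axis=1)) / 4.0
    offset = lin.sum() / 2.0 + quad.sum() / 4.0
    return h, J, offset

# Tiny Boltzmann-machine-style QUBO: 3 units, arbitrary illustrative values.
Q = np.array([[ 1.0, -2.0,  0.5],
              [ 0.0,  0.3, -1.0],
              [ 0.0,  0.0, -0.7]])
h, J, offset = qubo_to_ising(Q)
print("Ising fields:", h)
print("Ising couplings:\n", J)
```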
TNP: How do annealing systems like D-Wave’s handle these computations?
Sundararaghavan: Recent advances in machine learning strongly suggest that building AI systems requires deep architectures with multi-layer processing that mimics the brain. However, training such networks has been a challenge. We propose the use of a quantum annealer to significantly boost the speed and efficiency of training Deep Boltzmann Machines (DBMs). I will explain what a deep Boltzmann machine is next, but the thing to note here is that there is no classical algorithm that can efficiently sample general Boltzmann machines. On the other hand, a quantum annealer is a physical Boltzmann machine. It is natural to directly sample from an experimental platform rather than develop a classical sampling algorithm! In the discriminative task, each training case contains an image input and the classification output (e.g., threat or friend). The algorithm determines the annealer hardware parameters (biases and couplings) so that the samples obtained eventually favor the correct outcome over the incorrect outcome for each training case.
Here is how the algorithm progresses: start from a randomized set of hardware parameters, enforce the training data on some qubits, and sample the result on another qubit several times. (Some qubits are in the network but remain hidden.) Tabulate the probability of the output classically. Do this for all the training data. Then run a gradient optimizer that uses the tabulated probabilities to identify a new set of hardware parameters that gives an even better result. These steps are repeated until all the training data produce the correct outputs. Now the circuit is ready to predict the output for any new data.
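Below is a heavily simplified, hypothetical Python sketch of that loop. A brute-force exact sampler stands in for the calls that would go to the annealer on real hardware, and a plain gradient step updates the biases and couplings; it shows only the shape of the clamped-sampling and parameter-update cycle, not the actual algorithm from the paper.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

def expectations(v, Wvh, Why, Wvy, bh, by, clamp_y=None, beta=1.0):
    """Exact expectations of the sufficient statistics for a tiny Boltzmann machine
    with the visible units clamped to v (and optionally the output unit clamped).
    On real hardware, annealer samples would estimate these averages instead."""
    n_h = len(bh)
    ys = (clamp_y,) if clamp_y is not None else (0, 1)
    total, e_vh, e_hy, e_vy, e_h, e_y = 0.0, 0.0, 0.0, 0.0, 0.0, 0.0
    for h_bits in itertools.product([0, 1], repeat=n_h):
        h = np.array(h_bits, dtype=float)
        for y in ys:
            # Unnormalized Boltzmann weight; higher "negative energy" = more probable.
            w = np.exp(beta * (v @ Wvh @ h + y * (h @ Why) + y * (v @ Wvy)
                               + bh @ h + by * y))
            total += w
            e_vh += w * np.outer(v, h)
            e_hy += w * y * h
            e_vy += w * y * v
            e_h += w * h
            e_y += w * y
    return e_vh / total, e_hy / total, e_vy / total, e_h / total, e_y / total

# Toy problem: 4 visible "pixel" units, 2 hidden units, 1 output (threat/friend) unit.
n_v, n_h = 4, 2
Wvh = rng.normal(scale=0.1, size=(n_v, n_h))   # visible-hidden couplings
Why = rng.normal(scale=0.1, size=n_h)          # hidden-output couplings
Wvy = rng.normal(scale=0.1, size=n_v)          # visible-output couplings
bh, by = np.zeros(n_h), 0.0                    # biases
data = [(rng.integers(0, 2, size=n_v).astype(float), y) for y in (0, 1, 1, 0)]

lr = 0.2
for epoch in range(100):
    for v, y in data:
        pos = expectations(v, Wvh, Why, Wvy, bh, by, clamp_y=y)  # data clamped
        neg = expectations(v, Wvh, Why, Wvy, bh, by)             # output left free
        # Gradient ascent on log P(label | image): data statistics minus model statistics.
        Wvh += lr * (pos[0] - neg[0])
        Why += lr * (pos[1] - neg[1])
        Wvy += lr * (pos[2] - neg[2])
        bh += lr * (pos[3] - neg[3])
        by += lr * (pos[4] - neg[4])
# After training, P(threat | new image) is the <y> expectation with only v clamped.
```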
A similar approach can be taken on a gate-based system. In that case, the input data is read into a quantum circuit (with several gates, such as CNOT, SWAP, etc., embedding the Hamiltonian), and the circuit performs similar sampling. However, this approach is currently constrained by the limited number of qubits and by decoherence, which limits circuit depth.
TNP: And how do these approaches differ from using, for example, clusters of GPUs for accelerated training? What’s the complexity overhead, even with the relatively few qubits we have to work with?
Sundararaghavan: A cluster of GPUs is still used in this algorithm, but for all tasks except the sampling itself. An efficient classical algorithm for sampling restricted Boltzmann machines is the contrastive divergence (CD) method. However, parallelism cannot be used in the CD method to increase accuracy. As for complexity overhead: sampling of full Boltzmann machines is NP-hard, and no efficient classical algorithms exist! If the limited number of qubits is overcome, it is thus not hard to see quantum computers replacing state-of-the-art classical techniques for Boltzmann machine AI training. I would imagine this being the first AI application of quantum computers.
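For reference, here is a compact numpy sketch of the classical baseline mentioned above: one contrastive divergence (CD-1) update for a restricted Boltzmann machine. The dimensions and learning rate are arbitrary, and it is meant only to show what the quantum sampler would replace.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_v, b_h, lr=0.05):
    """One contrastive divergence (CD-1) update for a restricted Boltzmann machine.

    v0: batch of binary visible vectors, shape (batch, n_visible).
    Returns updated (W, b_v, b_h)."""
    # Positive phase: hidden probabilities given the data.
    ph0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step to reconstruct the visibles, then the hiddens.
    pv1 = sigmoid(h0 @ W.T + b_v)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + b_h)
    # Gradient approximation: <v h>_data - <v h>_reconstruction.
    batch = v0.shape[0]
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / batch
    b_v += lr * (v0 - v1).mean(axis=0)
    b_h += lr * (ph0 - ph1).mean(axis=0)
    return W, b_v, b_h

n_visible, n_hidden = 16, 8
W = rng.normal(scale=0.01, size=(n_visible, n_hidden))
b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)
batch = (rng.random((32, n_visible)) < 0.5).astype(float)   # stand-in training batch
W, b_v, b_h = cd1_step(batch, W, b_v, b_h)
```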
More on this can be found in a recent paper by Sundararaghavan and colleagues, “Machine Learning in Quantum Computers via General Boltzmann Machines: Generative and Discriminative Training Through Annealing.”