月, 30 8月 2021 09:41
##
Quantum Alphatron

Quantum machine learning has recently been one of the prominent areas in the field of quantum computing where quantum algorithms can be generalized to solve important problems. For specific problems like search and integer factoring, quantum algorithms can be generalized to important subroutines replacing linear algebra and machine learning. More accurately, the quantum search can be generalized to amplitude amplification allowing speedups for sampling and estimation tasks. Quantum factoring contains the phase estimation subroutine, which can be used for decomposing a matrix into eigenvalues and eigenvectors. Also, quantum kernel methods and quantum gradient computation are being actively researched for various machine learning applications.

Many quantum algorithms use heuristic methods similar to classical machine leaning. In the classical regime, it is difficult to obtain provable guarantees regarding the quality of learning and the run time. However, in quantum regimes, some algorithms retain the benefits of provable classical algorithms for learning and at the same time examine provable quantum speedups for machine learning. One such quantum algorithm that is inspired by its classical analogue is presented in this paper. It is called Alphatron and the classical algorithm is enhanced by quantum sub-routines. It is a gradient-descent-like algorithm for isotonic regression with the inclusion of kernel functions. This algorithm can be used to learn a kernelized, non-linear concept class of functions with a bounded noise term, hence can be exploited to learn a two-layer neural network, where one layer of activation functions feeds into a single activation function. In this work, the authors provide quantum speedups for the Alphatron and its application to non-linear classes of functions and two-layer neural networks.

The work considers pre-computation of the kernel matrix used in the algorithm and provides three sets of algorithms: Alphatron with pre-computation (Pre), approximate pre-computation (Approx Pre), and quantum estimation pre-computation (Q-Pre). Each algorithm offers improvement on the run-time complexity by pre-computation and quantum estimation, eventually improving the original Alphatron. The algorithms were tested against 'n' dimensional data, which is comparatively larger than the other parameters, thus showing relevance to many practical applications. With the aid of pre-computation, the complexity term which is quadratic in the total size of the dataset (m) and linear in "n" dimensions of data, loses a factor of T (number of iterations), therefore improving the performance. Furthermore, utilizing quantum amplitude estimation, a gain of quadratic speedup can be obtained along the dimension.

The work further exploits a setting where the samples are provided via quantum query access in order to harness quantum subroutines to estimate the entries of the kernel matrices used in the algorithm. This access can be provided by quantum random access memory (QRAM) via the GroverRudolph procedure. Such a device stores the data in (classical) memory cells but allows for superposition queries to the data. The cost of the resources is proportional to the length of the vector to set up, but then can provide a run-time of the superposition state that is logarithmic in the length of the vector.

The quantum subroutines used in this work are adaptations of the amplitude estimation algorithm. The results show that the learning guarantees can be preserved despite the erroneous estimation of the kernel matrices. Also, there are estimations inside the main loop of the algorithm which can be replaced with quantum subroutines while keeping the main loop intact. Moreover, in the case where data dimension 'n' is much larger than the other parameters, quantum pre-computation requires asymptotically more time than Alphatron with Kernel, since the same Alphatron with Kernel algorithm is used after the preparation of the kernel matrices. Therefore, optimizing Alphatron with Kernel is not cost-effective when the cost of preparing the data of size n is taken into account. Finally, further exploration of a potential quantum speedup is required for Alphatron with Kernel, especially for the case of pre-computation occurring prior to the algorithm.

Many quantum algorithms use heuristic methods similar to classical machine leaning. In the classical regime, it is difficult to obtain provable guarantees regarding the quality of learning and the run time. However, in quantum regimes, some algorithms retain the benefits of provable classical algorithms for learning and at the same time examine provable quantum speedups for machine learning. One such quantum algorithm that is inspired by its classical analogue is presented in this paper. It is called Alphatron and the classical algorithm is enhanced by quantum sub-routines. It is a gradient-descent-like algorithm for isotonic regression with the inclusion of kernel functions. This algorithm can be used to learn a kernelized, non-linear concept class of functions with a bounded noise term, hence can be exploited to learn a two-layer neural network, where one layer of activation functions feeds into a single activation function. In this work, the authors provide quantum speedups for the Alphatron and its application to non-linear classes of functions and two-layer neural networks.

The work considers pre-computation of the kernel matrix used in the algorithm and provides three sets of algorithms: Alphatron with pre-computation (Pre), approximate pre-computation (Approx Pre), and quantum estimation pre-computation (Q-Pre). Each algorithm offers improvement on the run-time complexity by pre-computation and quantum estimation, eventually improving the original Alphatron. The algorithms were tested against 'n' dimensional data, which is comparatively larger than the other parameters, thus showing relevance to many practical applications. With the aid of pre-computation, the complexity term which is quadratic in the total size of the dataset (m) and linear in "n" dimensions of data, loses a factor of T (number of iterations), therefore improving the performance. Furthermore, utilizing quantum amplitude estimation, a gain of quadratic speedup can be obtained along the dimension.

The work further exploits a setting where the samples are provided via quantum query access in order to harness quantum subroutines to estimate the entries of the kernel matrices used in the algorithm. This access can be provided by quantum random access memory (QRAM) via the GroverRudolph procedure. Such a device stores the data in (classical) memory cells but allows for superposition queries to the data. The cost of the resources is proportional to the length of the vector to set up, but then can provide a run-time of the superposition state that is logarithmic in the length of the vector.

The quantum subroutines used in this work are adaptations of the amplitude estimation algorithm. The results show that the learning guarantees can be preserved despite the erroneous estimation of the kernel matrices. Also, there are estimations inside the main loop of the algorithm which can be replaced with quantum subroutines while keeping the main loop intact. Moreover, in the case where data dimension 'n' is much larger than the other parameters, quantum pre-computation requires asymptotically more time than Alphatron with Kernel, since the same Alphatron with Kernel algorithm is used after the preparation of the kernel matrices. Therefore, optimizing Alphatron with Kernel is not cost-effective when the cost of preparing the data of size n is taken into account. Finally, further exploration of a potential quantum speedup is required for Alphatron with Kernel, especially for the case of pre-computation occurring prior to the algorithm.

Published in
Blog

Tagged under

日, 31 1月 2021 12:00
##
Active Quantum Research Areas: Quantum Machine Learning

In this edition of Active Quantum Research Areas (AQRAs), we highlight recent research in the quantum community on quantum machine learning; in particular, we focus here on training, expressibility and potential for quantum advantage.

Quantum Machine Learning (QML) techniques have been rigorously researched in the past few years, showing great promise in a number of applications and using various quantum algorithmic techniques and potential speedups. However, the effectiveness of quantum- over classical machine learning is not always evident and still heavily being investigated. In this blog post, we briefly summarize and contextualize several recent arXiv contributions in the area of quantum machine learning, quantum neural networks, quantum generative adversarial networks, and training their variational quantum circuits.

Research, such as the one presented in [1], suggests that potentially exponential quantum advantage is possible for certain tasks when the goal is to achieve accurate prediction on all inputs. While training techniques differ from the classical machine learning realm, more focus is on developing new models and algorithms to exploit the quantum mechanical phenomena. For instance, hybrid quantum-classical QML protocols can offer a significantly enhanced potential. Similarly, features like the expressivity of quantum neural networks can be exploited to gain quantum advantage. More accurately, the relationship between expressibillity and trainability of highly expressive ansatze states can provide useful insight into QML.

It was shown that the potential advantage of quantum ML over classical ML is limited, in the specific case of predicting outcomes of quantum experiments with a specified average-case prediction error. The recent findings by Huang et al. [1] establish the fact that quantum ML can have an exponential advantage over classical ML for certain problems where the objective is achieving a specified worst-case prediction error. This work claims that for any input distribution, a classical ML model can provide accurate predictions on average by accessing a quantum process a number of times comparable to the optimal quantum ML model. These results and analysis complement recent studies of the computational complexity of classical ML models trained with data from quantum experiments. Together, these rigorous results indicate that classical ML trained models using quantum measurement data can be surprisingly powerful. Such a hybrid protocol can potentially address challenging quantum problems in physics, chemistry, and materials for near-term experimental platforms.

Another direction being investigated is exploration towards establishing a quantum neural network (QNN) analogy of the universal approximation theorem, which attempts to provide the foundation for expressivity of classical neural networks. Recent work by Wu Y. et al. [2] discusses the generalization of expressivity of QNN for learning targets (for example observables of input wave functions), while focusing on fully connected architectures. The findings suggest that an accurate approximation is possible only when the input wave functions (target function) in the dataset do not exhaust the entire Hilbert space that the quantum circuit acts on, and more precisely, the Hilbert space dimension of the former has to be less than half of the Hilbert space dimension of the latter. If this requirement cannot be satisfied by the dataset, the expressivity capabilities can be restored by adding one ancillary qubit where the wave function is always fixed at input. Hybrid quantum-classical QML protocols with mixed target states can be exploited in already available experimental platforms with potential promising applications in quantum computing and noise sensing.

Another aspect of quantum ML that is being explored is quantum generative adversarial learning (QGAN) which is a promising strategy to use quantum devices for (quantum) estimation or generative machine learning tasks. The recent work by Braccia et al. [3], discusses the convergence behaviors of a QGAN training process and proposes new algorithms for QGAN training for mixed states which are expected to work better than previously used techniques. It is observed that the states obtained from currently available NISQ devices may display a “chaotic” behaviour during training, without achieving convergence. Results from this work show that algorithms based on the adaptation of a gradient-based technique allow provable convergence with bilinear score functions while the algorithms based on convex optimization techniques are shown to be especially suited for non-parametric states that are iteratively updated via a quantum circuit. Also, endowing the scheme with the ability to process entangled copies of the target state results in enhanced performance.

In Ref. [4] it is shown how generative models can be enhanced via quantum correlations, possibly paving the way to a quantum advantage in a generative ML setting. They show how quantum correlations offer a powerful resource, using a numerical example on a practical problem, and even prove a separation in expressive power between a particular classical generative model and its quantum counterpart. The separation can be explained in terms of the quantum nonlocality and quantum contextuality. Furthermore, there may be classical ML improvements drawing from QI foundations (also known as ‘quantum inspired’ research, for example using MPS or TN). As an outlook, the authors show how their specific proof and example can be extended to more general model paradigms. It would be great to see practical implementations of generative modeling advantage on hardware in the coming years.

In variational quantum algorithms, it is known that highly expressive ansatze can access a close approximation of a desired solution to a number of variational problems and provide a flexible paradigm for programming near-term quantum computers. However, the circuit ansatz must have sufficiently large gradients to allow for proper training; much has been researched about the 'barren plateau' phenomenon in this context and it is seen as a big challenge observed in variational QML and QGAN. Holmes et al. [5] attempts to derive a fundamental relationship between two essential ansatze properties: expressibility and trainability. The resulting bounds in the work by Holmes et al. [5] indicates that highly expressive anatze exhibit flatter cost landscapes and therefore will be harder to train. Therefore, on one hand making the quantum neural network more expressive leads to better representation of a function. However, on the other hand, larger expressivity leads to smaller gradients, which lead to the barren plateau problem and convergence issues. Strategies such as using structured ansatze, physics informed circuits, correlating parameters and restricting rotation angles have been proposed to avoid or mitigate barren plateaus. Further exploring these and other strategies will be an important direction for future research.

Also recently, the connection between quantum machine learning models and kernel methods was explored, more rigorously than previously. Oftentimes, a variational quantum circuit in QML is compared to a neural network (and aptly named a ‘QNN’). Such circuits typically consist of encoding circuit, a variational ansatz, and a measurement. But conceptually, these quantum models really only consist of two parts: the data encoding/embedding and the measurement. Training a quantum model is then simply the problem of finding the measurement that minimises a data-dependent cost function. The author in [6] argues such quantum models are typically much more related to kernel methods than NN’s. Besides being conceptually very interesting, this view also yields several interesting benefits: for example, it refocuses the attention of a potential quantum advantage over classical methods to the way data is encoded into quantum states. Also, kernel-based training guarantees to find the global optimum, which is typically hard for variational circuit training. It is also shown that the main drawback of kernel-based methods is the same reason why NN methods superseded kernel methods in Big Data applications: while NN training is theoretically very hard, empirically O(N) training time can yield good results, while kernel training requires a runtime of at least O(N^2) in a classical feedback loop, even though the training is convex (‘efficient’). However, the model could also be trained using quantum algorithms, and is known to have a quadratic speedup to O(N), although such protocols require large fault-tolerant quantum hardware.

Quantum machine learning is a promising candidate for quantum advantage and recent results indicate the possibility of an advantage for certain tasks. However, this is still ongoing research with various aspects that need to be investigated. A number of algorithms are proposed claiming enhanced performance in numerics but in particular the training and learning problems are essential to explore under future scopes. Similarly, further generalizations on learning targets, regression, and classification tasks are required for implementation on experimental platforms.

References:

[1] Huang H. Y., Kueng R., J. Preskill, “Information-theoretic bounds on quantum advantage in machine learning” arXiv:2101.02464 (Jan. 2021)

[2] Y. Wu, J. Yao, P. Zhang, H. Zhai, “Expressivity of Quantum Neural Networks” arXiv:2101.04273 (Jan. 2021)

[3] P. Braccia, F. Caruso, L. Banchi, “How to enhance quantum generative adversarial learning of noisy information” arXiv:2012.05996 (Dec. 2020)

[4] Xun Gao, Eric R. Anschuetz, Sheng-Tao Wang, J. Ignacio Cirac, Mikhail D. Lukin, “Enhancing Generative Models via Quantum Correlations” arXiv:2101.08354 (Jan. 2021)

[5] Z.Holmes, K. Sharma, M. Cerezo, P.J. Coles, “Connecting ansatz expressibility to gradient magnitudes and barren plateaus” arXiv:2101.02138 (Jan. 2021)

[6] Maria Schuld, “Quantum Machine Learning Models are kernel methods” arXiv:2101.11020 (Jan. 2021)

Quantum Machine Learning (QML) techniques have been rigorously researched in the past few years, showing great promise in a number of applications and using various quantum algorithmic techniques and potential speedups. However, the effectiveness of quantum- over classical machine learning is not always evident and still heavily being investigated. In this blog post, we briefly summarize and contextualize several recent arXiv contributions in the area of quantum machine learning, quantum neural networks, quantum generative adversarial networks, and training their variational quantum circuits.

Research, such as the one presented in [1], suggests that potentially exponential quantum advantage is possible for certain tasks when the goal is to achieve accurate prediction on all inputs. While training techniques differ from the classical machine learning realm, more focus is on developing new models and algorithms to exploit the quantum mechanical phenomena. For instance, hybrid quantum-classical QML protocols can offer a significantly enhanced potential. Similarly, features like the expressivity of quantum neural networks can be exploited to gain quantum advantage. More accurately, the relationship between expressibillity and trainability of highly expressive ansatze states can provide useful insight into QML.

It was shown that the potential advantage of quantum ML over classical ML is limited, in the specific case of predicting outcomes of quantum experiments with a specified average-case prediction error. The recent findings by Huang et al. [1] establish the fact that quantum ML can have an exponential advantage over classical ML for certain problems where the objective is achieving a specified worst-case prediction error. This work claims that for any input distribution, a classical ML model can provide accurate predictions on average by accessing a quantum process a number of times comparable to the optimal quantum ML model. These results and analysis complement recent studies of the computational complexity of classical ML models trained with data from quantum experiments. Together, these rigorous results indicate that classical ML trained models using quantum measurement data can be surprisingly powerful. Such a hybrid protocol can potentially address challenging quantum problems in physics, chemistry, and materials for near-term experimental platforms.

Another direction being investigated is exploration towards establishing a quantum neural network (QNN) analogy of the universal approximation theorem, which attempts to provide the foundation for expressivity of classical neural networks. Recent work by Wu Y. et al. [2] discusses the generalization of expressivity of QNN for learning targets (for example observables of input wave functions), while focusing on fully connected architectures. The findings suggest that an accurate approximation is possible only when the input wave functions (target function) in the dataset do not exhaust the entire Hilbert space that the quantum circuit acts on, and more precisely, the Hilbert space dimension of the former has to be less than half of the Hilbert space dimension of the latter. If this requirement cannot be satisfied by the dataset, the expressivity capabilities can be restored by adding one ancillary qubit where the wave function is always fixed at input. Hybrid quantum-classical QML protocols with mixed target states can be exploited in already available experimental platforms with potential promising applications in quantum computing and noise sensing.

Another aspect of quantum ML that is being explored is quantum generative adversarial learning (QGAN) which is a promising strategy to use quantum devices for (quantum) estimation or generative machine learning tasks. The recent work by Braccia et al. [3], discusses the convergence behaviors of a QGAN training process and proposes new algorithms for QGAN training for mixed states which are expected to work better than previously used techniques. It is observed that the states obtained from currently available NISQ devices may display a “chaotic” behaviour during training, without achieving convergence. Results from this work show that algorithms based on the adaptation of a gradient-based technique allow provable convergence with bilinear score functions while the algorithms based on convex optimization techniques are shown to be especially suited for non-parametric states that are iteratively updated via a quantum circuit. Also, endowing the scheme with the ability to process entangled copies of the target state results in enhanced performance.

In Ref. [4] it is shown how generative models can be enhanced via quantum correlations, possibly paving the way to a quantum advantage in a generative ML setting. They show how quantum correlations offer a powerful resource, using a numerical example on a practical problem, and even prove a separation in expressive power between a particular classical generative model and its quantum counterpart. The separation can be explained in terms of the quantum nonlocality and quantum contextuality. Furthermore, there may be classical ML improvements drawing from QI foundations (also known as ‘quantum inspired’ research, for example using MPS or TN). As an outlook, the authors show how their specific proof and example can be extended to more general model paradigms. It would be great to see practical implementations of generative modeling advantage on hardware in the coming years.

In variational quantum algorithms, it is known that highly expressive ansatze can access a close approximation of a desired solution to a number of variational problems and provide a flexible paradigm for programming near-term quantum computers. However, the circuit ansatz must have sufficiently large gradients to allow for proper training; much has been researched about the 'barren plateau' phenomenon in this context and it is seen as a big challenge observed in variational QML and QGAN. Holmes et al. [5] attempts to derive a fundamental relationship between two essential ansatze properties: expressibility and trainability. The resulting bounds in the work by Holmes et al. [5] indicates that highly expressive anatze exhibit flatter cost landscapes and therefore will be harder to train. Therefore, on one hand making the quantum neural network more expressive leads to better representation of a function. However, on the other hand, larger expressivity leads to smaller gradients, which lead to the barren plateau problem and convergence issues. Strategies such as using structured ansatze, physics informed circuits, correlating parameters and restricting rotation angles have been proposed to avoid or mitigate barren plateaus. Further exploring these and other strategies will be an important direction for future research.

Also recently, the connection between quantum machine learning models and kernel methods was explored, more rigorously than previously. Oftentimes, a variational quantum circuit in QML is compared to a neural network (and aptly named a ‘QNN’). Such circuits typically consist of encoding circuit, a variational ansatz, and a measurement. But conceptually, these quantum models really only consist of two parts: the data encoding/embedding and the measurement. Training a quantum model is then simply the problem of finding the measurement that minimises a data-dependent cost function. The author in [6] argues such quantum models are typically much more related to kernel methods than NN’s. Besides being conceptually very interesting, this view also yields several interesting benefits: for example, it refocuses the attention of a potential quantum advantage over classical methods to the way data is encoded into quantum states. Also, kernel-based training guarantees to find the global optimum, which is typically hard for variational circuit training. It is also shown that the main drawback of kernel-based methods is the same reason why NN methods superseded kernel methods in Big Data applications: while NN training is theoretically very hard, empirically O(N) training time can yield good results, while kernel training requires a runtime of at least O(N^2) in a classical feedback loop, even though the training is convex (‘efficient’). However, the model could also be trained using quantum algorithms, and is known to have a quadratic speedup to O(N), although such protocols require large fault-tolerant quantum hardware.

Quantum machine learning is a promising candidate for quantum advantage and recent results indicate the possibility of an advantage for certain tasks. However, this is still ongoing research with various aspects that need to be investigated. A number of algorithms are proposed claiming enhanced performance in numerics but in particular the training and learning problems are essential to explore under future scopes. Similarly, further generalizations on learning targets, regression, and classification tasks are required for implementation on experimental platforms.

References:

[1] Huang H. Y., Kueng R., J. Preskill, “Information-theoretic bounds on quantum advantage in machine learning” arXiv:2101.02464 (Jan. 2021)

[2] Y. Wu, J. Yao, P. Zhang, H. Zhai, “Expressivity of Quantum Neural Networks” arXiv:2101.04273 (Jan. 2021)

[3] P. Braccia, F. Caruso, L. Banchi, “How to enhance quantum generative adversarial learning of noisy information” arXiv:2012.05996 (Dec. 2020)

[4] Xun Gao, Eric R. Anschuetz, Sheng-Tao Wang, J. Ignacio Cirac, Mikhail D. Lukin, “Enhancing Generative Models via Quantum Correlations” arXiv:2101.08354 (Jan. 2021)

[5] Z.Holmes, K. Sharma, M. Cerezo, P.J. Coles, “Connecting ansatz expressibility to gradient magnitudes and barren plateaus” arXiv:2101.02138 (Jan. 2021)

[6] Maria Schuld, “Quantum Machine Learning Models are kernel methods” arXiv:2101.11020 (Jan. 2021)

Published in
Blog

Tagged under

以下のお問合せフォームからご連絡ください。

Copyright © Qu & Co