In this edition of Active Quantum Research Areas (AQRAs), we discuss recent research on barren plateaus in variational quantum algorithms.
Parametrized quantum circuits (PQCs) are employed in Quantum Neural Networks (QNNs) and Variational Quantum Algorithms (VQAs), which are typical researched algorithms that may allow for near-term quantum advantage. QNNs have been recently proposed as generalizations of classical neural networks that can potentially analyse quantum data more efficiently. VQAs have been proposed as a more resilient to noise circuit for the NISQ era.
Both algorithms may rely on gradient based optimization to iteratively minimize a cost function evaluated by measuring output(s) of a quantum processor. Training PQCs involves a hybrid quantum-classical optimization loop. Typically, the problem is encoded in a cost (or loss) function that is ideally efficient to evaluate on a quantum computer but computationally expensive for a classical one. While the quantum computer evaluates the cost, a classical optimizer updates some (usually continuous) parameters associated with the quantum operations. PQCs with fixed gate structure are often referred to as a variational ansatze.
A `barren plateau' is the phenomenon of exponentially vanishing gradients in sufficiently expressive parametrized quantum circuits and it was first coined in the context of quantum circuits in [McClean2018]. It has been established that the onset of a barren plateau regime depends on the cost function, although the particular behaviour has been demonstrated only for certain classes of cost functions. Barren plateau landscapes correspond to gradients that vanish exponentially in the number of qubits. Therefore, an exponentially large precision is needed to navigate through the landscape. Such landscapes have been demonstrated for variational quantum algorithms and quantum neural networks with either deep circuits or global cost functions. Even if one manages to avoid these barren plateaus, there is an additional difficulty of optimizing in the presence of the hardware noise that defines NISQ devices, which is expected to modify the cost landscape.
The classical optimization step, integral to these algorithms, can be implemented using either gradient-free or gradient-based methods. At first glance, the use of the former seems more effective as the evaluation of the cost function is subject to imperfection(s). However, further research developed methods of evaluating gradients analytically (i.e. without relying on finite differences), and the access to gradients was proven to speed up the convergence to local minima.
In the absence of a barren plateau, the determination of a minimizing direction in the cost function landscape does not require an exponentially large precision, meaning that one can always navigate through the landscape by measuring expectation values with a precision that grows at most polynomially with the system size. Polynomial overhead is the standard goal for quantum algorithms, which aim to achieve a speedup over classical algorithms that often scale exponentially. Hence, the absence or existence of a barren plateau can determine whether or not a quantum speedup is achievable.
Recently, particularly in the past month of November 2020, many interesting papers appeared on the arXiv describing research on these barren plateaus, which we briefly summarize here.
, the authors analyse gradient scaling for the parameters of their proposed Quantum Convolutional Neural Network (QCNN). The main result of this research is the proposition of a novel method for analyzing the scaling of the variance and lower bounding the variance for the QCNN architecture.
, the authors propose a tree tensor architecture for QNNs. The main result is the proof that tree tensor QNNs have gradients that vanish polynomially with the qubit number, irrespective of which encoding methods are employed.
, the authors examine ways to exploit the symmetries and degeneracies that exist in PQCs and investigate how hardware noise affect such symmetries. The main result is that they find an exponentially large symmetry in PQCs, yielding an exponentially large degeneracy of the minima in the cost landscape. Also, they show that noise (specifically non-unital noise) can break these symmetries and lift the degeneracy of minima, making many of them local minima instead of global minima. The main contribution is an novel optimization method, which exploits underlying symmetries in PQCs.
, the authors derive a lower bound on the variance of the gradient, which depends mainly on the width of the circuit causal cone of each term in the Pauli decomposition of the cost function. The main contribution is that this result further clarifies the conditions under which barren plateaus can occur. Given a Hamiltonian, they are able to estimate its susceptibility to the barren plateaus, therefore one can pre-process Hamiltonians in order to make the optimization more viable.
, the authors answer the question of whether only gradient based optimizers suffer from the vanishing gradient phenomenon. The main result is that gradient-free optimizers do not solve the barren plateau problem and prove that cost function differences are exponentially suppressed in a barren plateau. Hence, without exponential precision, gradient-free optimizers will not make progress in the optimization.
, the authors explore Natural Evolutionary Strategies (NES) for the optimization of randomly-initialized PQCs in the region of vanishing gradients. The main result is that using the NES gradient estimator the exponential decrease in variance can be alleviated. They compare their two approaches, namely the exponential and separable NES, against standard gradient descent and show that optimization of randomly initialized PQCs can be performed with significantly less circuit evaluations using NES, while achieving comparable accuracy to gradient based methods. They also show that NES methods can be used to amplify gradients and improve parameter initialization for gradient-based approaches.
It is clear from the ongoing research that the vanishing gradient problem is substantial when working with PQCs, which in turn seems to be a crucial aspect for proving quantum advantage in the NISQ era. Although different approaches have been suggested to overcome the limitations arising from barren plateaus, there is not one simple way to do it. Most of the strategies discussed, focus on the near term heuristic solutions that can be obtained from PQCs, with the main goal being a polynomial instead of an exponential overhead with the number of qubits for the precision of the cost function, which is a way one can prove quantum advantage. This is done by employing new optimization strategies, optimizers, re-design of the Hamiltonian/ansatz, exploitation of symmetries and many other ways. Researchers in this field will likely keep inventing new ways to avoid or mitigate barren plateaus for the foreseeable future, and these recent developments have ushered in a new era of understanding of this domain.
Natural Evolutionary Strategies for Variational Quantum Computation, by Abhinav Anand, Matthias Degroote and Alán Aspuru-Guzik, arXiv:2012.00101[Arrasmith2020]
Effect of barren plateaus on gradient-free optimization, by Andrew Arrasmith, M. Cerezo, Piotr Czarnik, Lukasz Cincio, Patrick J. Coles, arXiv:2011.12245[Fontanta2020]
Optimizing parametrized quantum circuits via noise-induced breaking of symmetries, Enrico Fontana, M. Cerezo, Andrew Arrasmith, Ivan Rungger, Patrick J. Coles, arXiv:2011.08763[Pesah2020]
Absence of Barren Plateaus in Quantum Convolutional Neural Networks, by Arthur Pesah, M. Cerezo, Samson Wang, Tyler Volkoff, Andrew T. Sornborger and Patrick J. Coles, arXiv:2011.02966
[McClean2018] Barren plateaus in quantum neural network training landscapes, by McClean, J.R., Boixo, S., Smelyanskiy, V.N., R. Babbush and H. Neven, Nat Commun 9, 4812 (2018)[Uvarov2020]
On barren plateaus and cost function locality in variational quantum algorithms, by Alexey Uvarov and Jacob Biamonte, arXiv:2011.10530[Zhang2020]
Toward Trainability of Quantum Neural Networks, by Kaining Zhang, Min-Hsiu Hsieh, Liu Liu and Dacheng Tao, arXiv:2011.06258