For the Noisy Intermediate Scale Quantum (NISQ) era, in the absence of large-scale quantum error correction, the number of gates that can be applied while maintaining computational coherence is at present strongly limited by hardware noise and decoherence. In an attempt to alleviate some of the detrimental effects, current generations of quantum algorithms often rely on a hybrid classical-quantum approach. Such approaches consider a trial quantum state (ansatz state) with a tractable number of parameters and relatively short circuit depth. These parameters are then optimized in order to approximate a target state as accurately as possible. In most of such applications shown hitherto, the target state was variationally optimized to represent a lowest-energy eigenstate (groundstate) of some quantum Hamiltonian.

However, one can also envision simulating unitary time evolution (or ‘dynamics’) with such variational algorithms. The authors of today’s paper first reference the Time-Dependent Variational Algorithm (TDVA) which encodes the state into a variational circuit and iteratively updates the parameters by solving the corresponding equation of motion. However, a significant drawback of that existing algorithm is that it suffers from an expensive quadratic cost in the total number of variational parameters.

To tackle this problem, the authors in this work introduce a novel hybrid algorithm to simulate the real-time evolution of quantum systems using parameterized quantum circuits. They propose a new method called "projected - Variational Quantum Dynamics" (p-VQD) which realizes an iterative, global projection of the exact time evolution onto the parameterized manifold. In the small time-step limit, this approach is equivalent to the McLachlan’s variational principle and uses it to optimize all parameters at once. This algorithm is shown to overcome the drawbacks of existing approaches as it is both global – it optimizes all parameters at once – and efficient – it scales linearly with the number of parameters. Moreover, it does not require auxiliary (ancilla) qubits and the depth of the circuit is constant throughout all the simulation. They use circuit differentiation to compute gradients analytically and use them for gradient descent optimization.

This global approach potentially extends the scope of existing efficient variational methods, that instead typically rely on the iterative optimization of a restricted subset of variational parameters. The authors claim that this approach can be particularly advantageous over existing global optimization algorithms based on the time-dependent variational principle.

They have shown that the algorithm is asymptotically more hardware efficient than the standard variational algorithm while retaining a higher accuracy. Currently a drawback of this method is that the circuit constructed on the quantum device is approximately twice as deep as the ansatz used to represent the system. However, by suitably controlling the number of two-qubit gates in the particular ansatz of choice, the authors comment that p-VQD is already implementable to simulate small quantum systems on available devices.

One possible application of the approach used in this work is to study the dynamical properties of two-dimensional interacting systems which is a notoriously difficult problem for classical computation. Similar to all other variational algorithms, the choice of the right parametrization is fundamental for the algorithm to succeed. In this sense, having an efficient quantum algorithm to perform variational time evolution is essential to compare to classical results obtained with variational states based on tensor or neural networks.
Recently there has been increased efforts to build large-scale quantum computers for solving certain types of hard computational problems. These efforts are mainly motivated by the prospect of enabling quantum algorithms with a quadratic, polynomial or potentially exponential speedup. When the size of the problem is sufficiently large, this scaling advantage implies that a quantum computer will outperform its classical counterpart, independently of the time it takes to execute a single gate. However, for any real-world application, not only the scaling but also the total computation time will be of importance, hence the realization of faster gate operations becomes a necessity to further improve the fidelity of the computation.

In the work we highlight today, the authors discuss the realization of a universal set of ultrafast single- and two-qubit operations with superconducting quantum circuits along with investigating the physical and technical limitations for achieving faster gates. The work establishes a fundamental bound on the minimal gate time, which depends on the qubit nonlinearity and bandwidth of the control pulse over a large parameter range, being independent of the qubit design. The numerical results suggest that for highly anharmonic flux qubits and commercially available control electronics, elementary single- and two-qubit operations can be implemented in about 100 picoseconds with residual gate errors below 10-4. Under the same conditions, authors estimate that the complete execution of a compressed version of Shor’s algorithm for factoring the number 15 would take about one nanosecond on such a hypothetical device.

The numerical results claim that there exists a lower bound for both single-qubit gates and two-qubit gates, which holds without the often assumed three-level approximation for the whole range of qubit parameters explored in this work. For very fast gates in the range of hundred picoseconds, additional limitations arise from the finite qubit oscillation time.
The authors also addressed the implementation of larger quantum circuits composed out of many ultrafast gates. A full multi-level simulation of a basic three-qubit circuit consisting of eleven elementary single- and two-qubit gates is performed taking the finite qubit rotation time into account which introduces a natural cycle time according to which gates must be clocked. For realistic qubit nonlinearities and control bandwidths, the simulated execution times for the whole circuit are observed to be about 1-2 ns, which is about two orders of magnitude faster than what is achievable in most superconducting quantum computing experiments today. The results demonstrate that significant improvements in this direction are still possible.

In their analysis, the authors have restricted the gate times down to about 50 picoseconds, which requires absolute nonlinearities and larger control bandwidth than contemporary state-of-the-art experiments. Although such parameters are highly non-standard for current superconducting qubit experiments, they are still within physical and technological bounds. For the implementation of even faster gates, additional physical constraints will come into play and also the applicability of the usual effective circuit model must be re-evaluated. For example, at a certain value, the energy of the third circuit level is comparable to twice the superconducting gap of aluminum. Any components of the control field above this frequency will excite quasiparticles and strongly degrade the qubit coherence. However, in other materials the superconducting gap can be substantially higher, which suggests that at least in principle, gate times in the range of 1-10 picoseconds could become accessible in the future.

These results demonstrate that, compared to state-of-the-art implementations with transmon qubits, a hundredfold increase in the speed of gate operations with superconducting circuits is still feasible. Despite its long-term relevance, the implementation of quantum gates in the picosecond regime still remains highly unexplored and the ultimate limit for the speed of superconducting quantum processors is still a research question to be addressed with a hope for pushing towards faster gates. In such a scenario, decoherence will be a less limiting factor, since gates will take less time to be applied. Furthermore, algorithms that require a large circuit depth will be able to be implemented, allowing researchers to solve more complex problems. Finally, processes like quantum error correction and decoding of errors will be much easier to implement, therefore taking us beyond the NISQ era.
In this edition of Active Quantum Research Areas (AQRAs), we highlight recent research in the quantum community on quantum machine learning; in particular, we focus here on training, expressibility and potential for quantum advantage.

Quantum Machine Learning (QML) techniques have been rigorously researched in the past few years, showing great promise in a number of applications and using various quantum algorithmic techniques and potential speedups. However, the effectiveness of quantum- over classical machine learning is not always evident and still heavily being investigated. In this blog post, we briefly summarize and contextualize several recent arXiv contributions in the area of quantum machine learning, quantum neural networks, quantum generative adversarial networks, and training their variational quantum circuits.

Research, such as the one presented in [1], suggests that potentially exponential quantum advantage is possible for certain tasks when the goal is to achieve accurate prediction on all inputs. While training techniques differ from the classical machine learning realm, more focus is on developing new models and algorithms to exploit the quantum mechanical phenomena. For instance, hybrid quantum-classical QML protocols can offer a significantly enhanced potential. Similarly, features like the expressivity of quantum neural networks can be exploited to gain quantum advantage. More accurately, the relationship between expressibillity and trainability of highly expressive ansatze states can provide useful insight into QML.

It was shown that the potential advantage of quantum ML over classical ML is limited, in the specific case of predicting outcomes of quantum experiments with a specified average-case prediction error. The recent findings by Huang et al. [1] establish the fact that quantum ML can have an exponential advantage over classical ML for certain problems where the objective is achieving a specified worst-case prediction error. This work claims that for any input distribution, a classical ML model can provide accurate predictions on average by accessing a quantum process a number of times comparable to the optimal quantum ML model. These results and analysis complement recent studies of the computational complexity of classical ML models trained with data from quantum experiments. Together, these rigorous results indicate that classical ML trained models using quantum measurement data can be surprisingly powerful. Such a hybrid protocol can potentially address challenging quantum problems in physics, chemistry, and materials for near-term experimental platforms.

Another direction being investigated is exploration towards establishing a quantum neural network (QNN) analogy of the universal approximation theorem, which attempts to provide the foundation for expressivity of classical neural networks. Recent work by Wu Y. et al. [2] discusses the generalization of expressivity of QNN for learning targets (for example observables of input wave functions), while focusing on fully connected architectures. The findings suggest that an accurate approximation is possible only when the input wave functions (target function) in the dataset do not exhaust the entire Hilbert space that the quantum circuit acts on, and more precisely, the Hilbert space dimension of the former has to be less than half of the Hilbert space dimension of the latter. If this requirement cannot be satisfied by the dataset, the expressivity capabilities can be restored by adding one ancillary qubit where the wave function is always fixed at input. Hybrid quantum-classical QML protocols with mixed target states can be exploited in already available experimental platforms with potential promising applications in quantum computing and noise sensing.

Another aspect of quantum ML that is being explored is quantum generative adversarial learning (QGAN) which is a promising strategy to use quantum devices for (quantum) estimation or generative machine learning tasks. The recent work by Braccia et al. [3], discusses the convergence behaviors of a QGAN training process and proposes new algorithms for QGAN training for mixed states which are expected to work better than previously used techniques. It is observed that the states obtained from currently available NISQ devices may display a “chaotic” behaviour during training, without achieving convergence. Results from this work show that algorithms based on the adaptation of a gradient-based technique allow provable convergence with bilinear score functions while the algorithms based on convex optimization techniques are shown to be especially suited for non-parametric states that are iteratively updated via a quantum circuit. Also, endowing the scheme with the ability to process entangled copies of the target state results in enhanced performance.

In Ref. [4] it is shown how generative models can be enhanced via quantum correlations, possibly paving the way to a quantum advantage in a generative ML setting. They show how quantum correlations offer a powerful resource, using a numerical example on a practical problem, and even prove a separation in expressive power between a particular classical generative model and its quantum counterpart. The separation can be explained in terms of the quantum nonlocality and quantum contextuality. Furthermore, there may be classical ML improvements drawing from QI foundations (also known as ‘quantum inspired’ research, for example using MPS or TN). As an outlook, the authors show how their specific proof and example can be extended to more general model paradigms. It would be great to see practical implementations of generative modeling advantage on hardware in the coming years.

In variational quantum algorithms, it is known that highly expressive ansatze can access a close approximation of a desired solution to a number of variational problems and provide a flexible paradigm for programming near-term quantum computers. However, the circuit ansatz must have sufficiently large gradients to allow for proper training; much has been researched about the 'barren plateau' phenomenon in this context and it is seen as a big challenge observed in variational QML and QGAN. Holmes et al. [5] attempts to derive a fundamental relationship between two essential ansatze properties: expressibility and trainability. The resulting bounds in the work by Holmes et al. [5] indicates that highly expressive anatze exhibit flatter cost landscapes and therefore will be harder to train. Therefore, on one hand making the quantum neural network more expressive leads to better representation of a function. However, on the other hand, larger expressivity leads to smaller gradients, which lead to the barren plateau problem and convergence issues. Strategies such as using structured ansatze, physics informed circuits, correlating parameters and restricting rotation angles have been proposed to avoid or mitigate barren plateaus. Further exploring these and other strategies will be an important direction for future research.

Also recently, the connection between quantum machine learning models and kernel methods was explored, more rigorously than previously. Oftentimes, a variational quantum circuit in QML is compared to a neural network (and aptly named a ‘QNN’). Such circuits typically consist of encoding circuit, a variational ansatz, and a measurement. But conceptually, these quantum models really only consist of two parts: the data encoding/embedding and the measurement. Training a quantum model is then simply the problem of finding the measurement that minimises a data-dependent cost function. The author in [6] argues such quantum models are typically much more related to kernel methods than NN’s. Besides being conceptually very interesting, this view also yields several interesting benefits: for example, it refocuses the attention of a potential quantum advantage over classical methods to the way data is encoded into quantum states. Also, kernel-based training guarantees to find the global optimum, which is typically hard for variational circuit training. It is also shown that the main drawback of kernel-based methods is the same reason why NN methods superseded kernel methods in Big Data applications: while NN training is theoretically very hard, empirically O(N) training time can yield good results, while kernel training requires a runtime of at least O(N^2) in a classical feedback loop, even though the training is convex (‘efficient’). However, the model could also be trained using quantum algorithms, and is known to have a quadratic speedup to O(N), although such protocols require large fault-tolerant quantum hardware.

Quantum machine learning is a promising candidate for quantum advantage and recent results indicate the possibility of an advantage for certain tasks. However, this is still ongoing research with various aspects that need to be investigated. A number of algorithms are proposed claiming enhanced performance in numerics but in particular the training and learning problems are essential to explore under future scopes. Similarly, further generalizations on learning targets, regression, and classification tasks are required for implementation on experimental platforms.

References:
[1] Huang H. Y., Kueng R., J. Preskill, “Information-theoretic bounds on quantum advantage in machine learning” arXiv:2101.02464 (Jan. 2021)
[2] Y. Wu, J. Yao, P. Zhang, H. Zhai, “Expressivity of Quantum Neural Networks” arXiv:2101.04273 (Jan. 2021)
[3] P. Braccia, F. Caruso, L. Banchi, “How to enhance quantum generative adversarial learning of noisy information” arXiv:2012.05996 (Dec. 2020)
[4] Xun Gao, Eric R. Anschuetz, Sheng-Tao Wang, J. Ignacio Cirac, Mikhail D. Lukin, “Enhancing Generative Models via Quantum Correlations” arXiv:2101.08354 (Jan. 2021)
[5] Z.Holmes, K. Sharma, M. Cerezo, P.J. Coles, “Connecting ansatz expressibility to gradient magnitudes and barren plateaus” arXiv:2101.02138 (Jan. 2021)
[6] Maria Schuld, “Quantum Machine Learning Models are kernel methods” arXiv:2101.11020 (Jan. 2021)
The authors of this work provides a comparative study of the overhead of state distillation and code-switching, and enables a deeper understanding of these two approaches to fault-tolerant universal quantum computation (FTQC). FTQC requires a gate set that involves at least one non-Clifford gate, which is typically generated through such protocols. State distillation relies on first producing many noisy encoded copies of a T state (magic state) and then processing them using Clifford operations to output a high fidelity encoded version of the state. The high-fidelity T state can be then used to implement the non-Clifford T gate. State generation and distillation is still a resource intensive process, despite significant recent improvements. A compelling alternative is ‘code-switching’, for example via gauge fixing to a 3D topological code. However, moving to 3D architectures is experimentally difficult, even if it significantly reduces the overhead compared to state distillation.

This work estimates the resources needed to prepare high-fidelity T states encoded in the 2D color code. This is accomplished via either state distillation or code-switching assuming that both approaches are implemented using noisy quantum-local operations in 3D. In particular, these two approaches were compared by simulating noisy circuits built from single-qubit state preparations, unitaries and measurements, and two-qubit unitaries between nearby qubits.

It is reported that the circuit noise threshold achieved for the code-switching implementation in this work is the highest presented to date and is equivalent to the circuit-noise threshold for the state distillation scheme with the 2D color code.

A direct way to switch between the 2D color code and the 3D subsystem color code is explored. The method exploits a particular gauge-fixing of the 3D subsystem color code for which the code state admits a local tensor product structure in the bulk and can therefore be prepared in constant time. The restriction decoder was adapted for the 3D color code with a boundary and produced a high error threshold. However, the failure probability of implementing the T gate with code-switching and the estimated T gate threshold is found to be lower than the one calculated for the state distillation.

The results suggest that code-switching does not offer substantial savings over state distillation in terms of both space overhead, i.e., the number of physical qubits required, and space-time overhead (the space overhead multiplied by the number of physical time units required) for most circuit noise error rates. Code switching holds a lot of promise, but is dependent on the clever design of codes that have transversal implementations of non-Clifford gates, and such a protocol is at the moment inferior to the state distillation. There exist other protocols that rival state distillation, like the pieceable fault-tolerant implementation of the CCZ gate, which in general shows the importance of having a protocol for non-Clifford gate implementation with a relatively small number of qubits and operations.
Recently, a team from the University of Science and Technology of China (USTC) demonstrated an experiment on a photonic quantum computer that outperformed even the fastest classical supercomputer in a computational task. Such kind of experiments are targeting algorithms and hardware platforms that can provide “quantum supremacy”, which occurs when a quantum computer is outperforming a classical computer.

A photonic quantum computer harnesses particles of light (photons) and consists of a complex array of optical devices, such as light sources, beam splitters, mirrors and photon detectors, that shuttle photons around. In such a computer, the quantum computation is based on a process called Boson Sampling, which is a task deliberately designed to prove quantum supremacy. Boson sampling is trying to understand what the distribution of photons is going to be at the output of a photonic interferometer. In the case of the quantum device implementation of boson sampling, the problem is solved `by itself’ since the distribution of the measured output is the desired photon distribution. In the case of the classical computer, a large computation is required to find the photon distribution, which increases with the size of the problem since the photon’s quantum properties lead to an exponentially increasing number of possible distributions. If operated with large numbers of photons and many channels, the quantum computer will produce a distribution of numbers that is too complex for a classical computer to calculate. In the new experiment, up to 76 photons traversed a network of 100 channels, which is a much larger amount than previously demonstrated, both experimentally and numerically.

This claim for quantum supremacy comes to reinforce what Google presented last year with their superconducting qubit-based quantum computer. The main difference between the two experiments in terms of the result is that the photonics experiment can create many more possible output states: ~1030 of them compared to ~1016. Such a large number makes it infeasible to calculate the whole probability distribution over outputs and store it for future generation of samples (something other researchers suggested as a rebuttal against Google’s claims, but which can certainly not hold in this new experiment).

Although researchers are currently looking for ways to get similar results with classical computers, it has not yet been successful. The main concern around this quantum experiment is the photon loss. It was reported that up to ~70% of the photons get lost on their way through the beam splitter network, allowing only ~30% to be detected. Typically, that amount of photon loss would be considered fatal for quantum supremacy. Furthermore, the classical simulations that are used for comparisons require fixing the rate of noise and then letting the numbers of photons and modes go to infinity. However, any real experiment has a fixed number of photons and modes (in USTC’s case, they’re ~50 and ~100 respectively).

Achieving the goal of quantum supremacy through such kind of experiments does not indicate the definitive, general, superiority of quantum computers over classical computers, since such kind of problems are deliberately designed to be hard for classical computers. On the other hand, it would also be an understatement to say this experiment is `only a proof of principle’, since boson sampling could have actual practical applications, for example solving specialized problems in quantum chemistry and mathematics.

Currently, most proposals in the literature apply boson sampling to vibronic spectra or finding dense subgraphs, but it is not certain whether these proposals will yield real speedups for a task of practical interest that involves estimating specific numbers (as opposed to sampling tasks, where boson sampling almost certainly does yield exponential speedups).

Future research will focus both on algorithm development, exploiting the particular characteristics of such a specialized quantum device, as well as experimental improvements such as decreased photon loss, higher quality sources and detectors, and larger number of modes. The described experiment presents a promising indication of this sub-field of quantum computing, and we keep a close eye on future developments.
In this paper, researchers from Amazon AWS & IQIM present an architecture for a fault-tolerant quantum computer, that is based on hybrid acoustic-electro devices to implement a stabilized cat code with highly biased noise, dominated by dephasing. To combat these sources of noise, they concatenated the cat code with an outer code that focuses mostly on correcting the dephasing errors, based on the repetition code and the surface code. The assumed error model is critical, since it will affect the fidelities of all required operations (initialization, measurement, gates, etc.) based on which the results are compared to previous works. Therefore, a detailed error analysis of measurements and gates, including CNOT and Toffoli gates is presented according to this realistic noise model.

Fault-tolerant quantum computing requires a universal set of gates, which can be divided into two categories, namely gates that belong to the Clifford group and gates that do not. Clifford gates can be typically achieved easily for a variety of codes, however non-Clifford gates require sophisticated protocols to create and then purify to increase their fidelity, like the magic state preparation/distillation protocol. A novel magic-state distillation protocol for Toffoli states is introduced here (injected via lattice surgery), which in combination with the error correction techniques that were used, result in a lower overhead compared to previous works. Actually, it is estimated that the factory that generates the magic states only accounts for approximately 7% of the total resource overhead requirements, with the other 93% coming from the rotated surface code.

In terms of quantum advantage, the authors find that with around 1,000 superconducting circuit components, one could construct a fault-tolerant quantum computer that can run circuits which are intractable for classical supercomputers.

However, when comparing this work to other related works, one should keep in mind that the assumed gate fidelities and the assumed error model can greatly affect the presented results. The error model in this work assumes Z error rates that are far less optimistic than typically assumed for transmon qubits and due to the cat state encoding there is bit-flip noise suppression that can naturally lead to increased performance. Furthermore, transmon architecture resource estimates are based on a simple depolarizing noise model, whereas this noise model has been derived from first principles modeling of the hardware, making the analysis more realistic.

Moreover, the authors claim to have a comparable or up to 3 times fewer qubits required compared to other superconducting transmon qubit architectures, based on the assumed gate fidelities. Also, similar runtime figures are reported compared to other superconducting transmon qubit architectures, however an important distinction of this protocol is that the magic states are created slightly faster than they can be transported to the main algorithm, whereas in other architectures the main algorithm has to wait for the magic states to be created which is a bottleneck in the runtime.

Although such a protocol shows promise for fault-tolerant quantum computing, the injection of magic states comes with an additional qubit cost for data access and routing. The choice of routing solution leads to a lower bound on runtime execution, so more careful optimization of routing costs and speed of magic state injection is crucial.
In this edition of Active Quantum Research Areas (AQRAs), we discuss recent research on barren plateaus in variational quantum algorithms.

Parametrized quantum circuits (PQCs) are employed in Quantum Neural Networks (QNNs) and Variational Quantum Algorithms (VQAs), which are typical researched algorithms that may allow for near-term quantum advantage. QNNs have been recently proposed as generalizations of classical neural networks that can potentially analyse quantum data more efficiently. VQAs have been proposed as a more resilient to noise circuit for the NISQ era.

Both algorithms may rely on gradient based optimization to iteratively minimize a cost function evaluated by measuring output(s) of a quantum processor. Training PQCs involves a hybrid quantum-classical optimization loop. Typically, the problem is encoded in a cost (or loss) function that is ideally efficient to evaluate on a quantum computer but computationally expensive for a classical one. While the quantum computer evaluates the cost, a classical optimizer updates some (usually continuous) parameters associated with the quantum operations. PQCs with fixed gate structure are often referred to as a variational ansatze.

A `barren plateau' is the phenomenon of exponentially vanishing gradients in sufficiently expressive parametrized quantum circuits and it was first coined in the context of quantum circuits in [McClean2018]. It has been established that the onset of a barren plateau regime depends on the cost function, although the particular behaviour has been demonstrated only for certain classes of cost functions. Barren plateau landscapes correspond to gradients that vanish exponentially in the number of qubits. Therefore, an exponentially large precision is needed to navigate through the landscape. Such landscapes have been demonstrated for variational quantum algorithms and quantum neural networks with either deep circuits or global cost functions. Even if one manages to avoid these barren plateaus, there is an additional difficulty of optimizing in the presence of the hardware noise that defines NISQ devices, which is expected to modify the cost landscape.

The classical optimization step, integral to these algorithms, can be implemented using either gradient-free or gradient-based methods. At first glance, the use of the former seems more effective as the evaluation of the cost function is subject to imperfection(s). However, further research developed methods of evaluating gradients analytically (i.e. without relying on finite differences), and the access to gradients was proven to speed up the convergence to local minima.

In the absence of a barren plateau, the determination of a minimizing direction in the cost function landscape does not require an exponentially large precision, meaning that one can always navigate through the landscape by measuring expectation values with a precision that grows at most polynomially with the system size. Polynomial overhead is the standard goal for quantum algorithms, which aim to achieve a speedup over classical algorithms that often scale exponentially. Hence, the absence or existence of a barren plateau can determine whether or not a quantum speedup is achievable.

Recently, particularly in the past month of November 2020, many interesting papers appeared on the arXiv describing research on these barren plateaus, which we briefly summarize here.

In [Pesah2020], the authors analyse gradient scaling for the parameters of their proposed Quantum Convolutional Neural Network (QCNN). The main result of this research is the proposition of a novel method for analyzing the scaling of the variance and lower bounding the variance for the QCNN architecture.

In [Zhang2020], the authors propose a tree tensor architecture for QNNs. The main result is the proof that tree tensor QNNs have gradients that vanish polynomially with the qubit number, irrespective of which encoding methods are employed.

In [Fontana2020], the authors examine ways to exploit the symmetries and degeneracies that exist in PQCs and investigate how hardware noise affect such symmetries. The main result is that they find an exponentially large symmetry in PQCs, yielding an exponentially large degeneracy of the minima in the cost landscape. Also, they show that noise (specifically non-unital noise) can break these symmetries and lift the degeneracy of minima, making many of them local minima instead of global minima. The main contribution is an novel optimization method, which exploits underlying symmetries in PQCs.

In [Uvarov2020], the authors derive a lower bound on the variance of the gradient, which depends mainly on the width of the circuit causal cone of each term in the Pauli decomposition of the cost function. The main contribution is that this result further clarifies the conditions under which barren plateaus can occur. Given a Hamiltonian, they are able to estimate its susceptibility to the barren plateaus, therefore one can pre-process Hamiltonians in order to make the optimization more viable.

In [Arrasmith2020], the authors answer the question of whether only gradient based optimizers suffer from the vanishing gradient phenomenon. The main result is that gradient-free optimizers do not solve the barren plateau problem and prove that cost function differences are exponentially suppressed in a barren plateau. Hence, without exponential precision, gradient-free optimizers will not make progress in the optimization.

In [Anand2020], the authors explore Natural Evolutionary Strategies (NES) for the optimization of randomly-initialized PQCs in the region of vanishing gradients. The main result is that using the NES gradient estimator the exponential decrease in variance can be alleviated. They compare their two approaches, namely the exponential and separable NES, against standard gradient descent and show that optimization of randomly initialized PQCs can be performed with significantly less circuit evaluations using NES, while achieving comparable accuracy to gradient based methods. They also show that NES methods can be used to amplify gradients and improve parameter initialization for gradient-based approaches.

It is clear from the ongoing research that the vanishing gradient problem is substantial when working with PQCs, which in turn seems to be a crucial aspect for proving quantum advantage in the NISQ era. Although different approaches have been suggested to overcome the limitations arising from barren plateaus, there is not one simple way to do it. Most of the strategies discussed, focus on the near term heuristic solutions that can be obtained from PQCs, with the main goal being a polynomial instead of an exponential overhead with the number of qubits for the precision of the cost function, which is a way one can prove quantum advantage. This is done by employing new optimization strategies, optimizers, re-design of the Hamiltonian/ansatz, exploitation of symmetries and many other ways. Researchers in this field will likely keep inventing new ways to avoid or mitigate barren plateaus for the foreseeable future, and these recent developments have ushered in a new era of understanding of this domain.

References:
[Anand2020] Natural Evolutionary Strategies for Variational Quantum Computation, by Abhinav Anand, Matthias Degroote and Alán Aspuru-Guzik, arXiv:2012.00101
[Arrasmith2020] Effect of barren plateaus on gradient-free optimization, by Andrew Arrasmith, M. Cerezo, Piotr Czarnik, Lukasz Cincio, Patrick J. Coles, arXiv:2011.12245
[Fontanta2020] Optimizing parametrized quantum circuits via noise-induced breaking of symmetries, Enrico Fontana, M. Cerezo, Andrew Arrasmith, Ivan Rungger, Patrick J. Coles, arXiv:2011.08763
[Pesah2020] Absence of Barren Plateaus in Quantum Convolutional Neural Networks, by Arthur Pesah, M. Cerezo, Samson Wang, Tyler Volkoff, Andrew T. Sornborger and Patrick J. Coles, arXiv:2011.02966
[McClean2018] Barren plateaus in quantum neural network training landscapes, by McClean, J.R., Boixo, S., Smelyanskiy, V.N., R. Babbush and H. Neven, Nat Commun 9, 4812 (2018)
[Uvarov2020] On barren plateaus and cost function locality in variational quantum algorithms, by Alexey Uvarov and Jacob Biamonte, arXiv:2011.10530
[Zhang2020] Toward Trainability of Quantum Neural Networks, by Kaining Zhang, Min-Hsiu Hsieh, Liu Liu and Dacheng Tao, arXiv:2011.06258
Qu&Co in collaboration with our academic advisor Oleksandr Kyriienko at the University of Exeter has developed a proprietary quantum algorithm which promises a generic and efficient way to solve nonlinear differential equations. The algorithm is compatible with near-term quantum-processors, with promising extensions for fault-tolerant implementation. Using a quantum feature map encoding, we define functions as expectation values of parametrized quantum circuits. We use automatic differentiation to represent function derivatives in an analytical form as differentiable quantum circuits (DQCs), thus avoiding inaccurate finite difference procedures for calculating gradients. We describe a hybrid quantum-classical workflow where DQCs are trained to satisfy differential equations and specified boundary conditions. As a particular example setting, we show how this approach can implement a spectral method for solving differential equations in a high-dimensional feature space. From a technical perspective, we design a Chebyshev quantum feature map that offers a powerful basis set of fitting polynomials and possesses rich expressivity. We simulate the algorithm to solve an instance of Navier-Stokes equations, and compute density, temperature and velocity profiles for the fluid flow in a convergent-divergent nozzle.
Enhancing machine learning applications with quantum computing is currently being massively investigated, since it might prove quantum advantage during the NISQ era. Enhancement is typically performed through improvement of the training process of existing classical models or enhancing inference in graphical models. Another way is through the construction of quantum models that generate correlations between variables that are inefficient to represent through classical computation (e.g. quantum neural networks). If the model leverages a quantum circuit that is hard to sample results from classically, then there is potential for a quantum advantage.

The main message of this work is to show quantitatively that when classical models are provided with some training data, even if those were obtained from a quantum model that cannot be computed classically, they can reach similar performance as the quantum model. The authors provide rigorous prediction error bounds for training classical and quantum ML methods based on kernel functions in order to learn quantum mechanical models. It has been proven that kernel methods provide provable guarantees, but are also very flexible in the functions they can learn.

The main contribution of the author’s work lies in the development of quantum kernels and the implementation of a guidebook that generates ML problems which give a large separation between quantum and classical models. The use of prediction error bounds quantifies the separation between prediction errors of quantum and classical ML models for a fixed amount of training data. Typically, the comparison is done based on the geometric difference defined by the closest efficient classical ML model, however, in practice, one should consider the geometric difference with respect to a suite of optimized classical ML models. When the geometric difference is small then a classical ML method is guaranteed to provide similar or better performance in prediction on the data set, independent of the function values or labels. When the geometry differs greatly, there exists a data set that exhibits large prediction advantage using the quantum ML model.

The small geometric difference is a consequence of the exponentially large Hilbert space employed by existing quantum models, where all inputs are too far apart. To circumvent the setback, the authors propose an improvement which enlarges the geometric difference by projecting quantum states embedded from classical data back to approximate classical representation. This proposal allows for the construction of a data set that demonstrates large prediction advantage over common classical ML models in numerical experiments up to 30 qubits. In that way, one can use a small quantum computer to generate efficiently verifiable ML problems that could be challenging for classical ML models.
There is currently a big effort in proving the existence of quantum advantage in NISQ devices, with typical applications including machine learning due to its wide applicability. However, an argument has been raised stipulating whether the advantage being proven in such applications arises from the nature of quantum computing or the way the data is provided. Actually, there are recent works which have shown that, when classical algorithms have an analogous sampling access to data, then they can reach similar performance as their quantum counterparts. Therefore, a fair study concerning quantum advantage in such a setting should only assume classical access to data for both the classical and quantum algorithm.

Most quantum algorithms being explored at the moment are of the variational nature, where a circuit is selected from a parametrized circuit family via classical optimization. However, such algorithms are heuristic and there is no formal evidence that they exhibit a genuine quantum advantage. In their work, the authors describe an exponential quantum speedup via the use of a quantum-enhanced feature space, where each data point is mapped non-linearly to a quantum state and then classified by a linear classifier in the high dimensional Hilbert space. The classification is achieved through a support vector machine (SVM), whose kernel matrix is obtained by measuring the pairwise inner product of the feature states, known as quantum kernel estimation (QKE). This kernel matrix is then given to a classical optimizer that efficiently finds the linear classifier that optimally separates the training data in feature space by running a convex quadratic program.

The advantage of such a quantum learner comes from the ability to recognize classically intractable complex patterns, using a feature map. This implementation combines the ideas of the generalization of soft margin classifiers and rigorously bounding the misclassification error of the SVM-QKE algorithm, thereby providing quantum advantage for the quantum classifier with guaranteed high accuracy. Furthermore, it is proven that this SVM classifier can learn the optimal separator in the exponentially large feature space, while also making it robust against additive sampling errors due to the error mitigation techniques that were used.

Finally, the classification problem showing the exponential quantum speed-up was based on the discrete logarithm problem (DLP) and it is proven that no efficient classical algorithm can achieve an accuracy that is inverse-polynomially better than random guessing, assuming the widely-believed classical hardness of DLP.

The results from this paper are relevant to a broader class of quantum machine learning algorithms exploiting feature maps and which aim to avoid the input-output problem. Although this particular problem is not practically motivated, these results set a positive theoretical foundation for the search of practical quantum advantage in machine learning.

Qu&Co comments on this publication:

In this paper, the meta-VQE algorithm is presented, which is an adaptation of the Variational Quantum Eigensolver (VQE). VQE is a variational algorithm that was originally proposed for finding the ground state energy of a given Hamiltonian by variationally minimizing its expectation value with a parametrized quantum circuit. The cost function of VQE is the expected value of the model Hamiltonian. The variational principle states that this value is an upper bound of the ground state energy, so everything reduces to minimize this value by fine-tuning the parameters of the circuit.

The meta-VQE algorithm is inspired by quantum machine learning algorithms (QML) and is able to learn the ground state energy profile of a parametrized Hamiltonian. First, this circuit is trained with a set of Hamiltonian parameters, which are encoded in the gates of an encoding unitary. By designing a cost function with the expected values of all these Hamiltonian training points, the algorithm extracts the optimal values of the variational parameters. Then, one can compute the energy for other Hamiltonian values by just running the meta-VQE circuit with the parameters obtained in the minimization. In addition, one can also use the result of a meta-VQE training as a starting point of a standard VQE algorithm, the opt-meta-VQE, instead of random initialization.

This technique can make the algorithm converge to the correct solution, while previous variants of VQE often suffer from convergence issues due to the exponential increase of the Hilbert space paired with a random initialized ansatz, typically resulting in a limited probability of finding the ground state as the quantum system increases. The characteristic trait of the meta-VQE is that it can be used to first explore the ground state energies of Hamiltonian parameter space with only a few training points and then use the result as an initial state for a precise VQE, resulting in high precision. Furthermore, meta-VQE can easily be adapted to be a part of other VQE strategies to increase performance. Moreover, the meta-VQE is able to capture global correlation with a few training points alleviating the refined optimization of the individual points in a later step. The authors conclude that the meta-VQE can find the general energy shape but not provide an accurate value, in contrast to standard VQE. However, the opt-meta-VQE proves valuable, achieving better results than standard VQE with random initialization.

A prospective application of quantum computing is solving quantum chemistry problems, however, obtaining exact solutions is difficult due to the lack of general method of obtaining such solutions. Typically the solution lies in finding the ground state energy, even though the energy is not descriptive enough to fully characterize all desired properties of a system. In order to find such properties, many measurements of the wavefunction are required. These measurements are expensive, because the wavefunction cannot be copied and must often be re-prepared before a second measurement is performed. Finding the full wavefunction would require exponentially many measurements, so one option would be to encode many solutions into one measurement by using a machine learned (ML) model. Training of the ML model requires finding exact quantities at several different external potentials. Besides that, one can use density functional theory (DFT), to replace the wavefunction with the one-body density, n(r), which has fewer variables. When DFT is used, instead of the Hamiltonian, the universal functional, F[n], must be found. The quantities required for the classical user to find self-consistent solutions are the exact functional (determining the energy) and the functional derivative. So, in addition to finding F[n], one also must find some other quantity such as the density, n(r), or the Kohn-Sham(KS) potential, vs(r). With these components, one can fully characterize a quantum ground state and solve for other measurable quantities.

The authors propose an algorithm that finds the ML model for F[n] on the quantum computer if a ground-state wavefunction is available. The algorithm leaves the wavefunction largely undisturbed so it can be used as the starting state for another system, greatly reducing the pre-factor required to solve other systems through a quantum counting algorithm to extract descriptive quantities such as the density. Most of this algorithm is kept entirely on the quantum computer to motivate future improvements for speed, but the counting algorithm does allow for information to be output classically. Moreover, the authors demonstrate that the exact Kohn-Sham potential can be solved in a faster way with a gradient evaluated on a cost function for the Kohn-Sham system.

The novelty of the proposed algorithm suggested in this work is the limitation of the number of measurements and re-preparations of the wavefunction especially in the case of time-dependent quantities. Since there is no algorithm for the general case that is exponentially better than the proposed algorithm, limiting the number of measurements and re-preparations of the wavefunction is as best as one can achieve. The proposed algorithm is a combination of several known algorithms including quantum phase estimation, quantum amplitude estimation, and quantum gradient methods that are iteratively used to train a machine learned model.
Page 1 of 8

What's Interesting?

How can we help you?

Invalid Input

Invalid Input

Invalid Input

Invalid Input

Invalid Input

Invalid Input

Copyright © Qu & Co BV
close