Variational Quantum Eigensolver (VQE) From Scratch
VQE finds ground-state energies of quantum Hamiltonians using a hybrid classical-quantum loop. This tutorial derives the variational principle, explains Jordan-Wigner fermion encoding, builds an H₂ ground-state computation end-to-end in Qiskit, and honestly discusses barren plateaus and why the ansatz choice makes or breaks the algorithm.
Prerequisites: Algorithms track (Tutorials 8-12)
Shor and Grover live in a fault-tolerant future. VQE lives today. It’s the flagship algorithm of the NISQ era — short enough circuits to run on noisy hardware, a hybrid classical/quantum loop that tolerates imperfection, and genuinely useful output (ground-state energies of molecules) for problems where exact classical methods scale poorly.
VQE is also where the honest picture of “quantum advantage” gets complicated. We’ll build it end-to-end, factor a real molecule ( in the STO-3G basis), and then discuss the barren-plateau trap that kills naive scaling.
The variational principle
For any Hamiltonian (Hermitian operator) with ground-state eigenvalue and any normalized state ,
with equality iff is the ground state. Proof: expand in the eigenbasis of as , then .
VQE strategy. Parameterize a quantum circuit that prepares . Measure the expectation value . Use a classical optimizer to minimize over the parameters. The minimum value is an upper bound on ; if the ansatz family is expressive enough, it equals .
Encoding a molecular Hamiltonian
The Hamiltonian for electrons in a molecule is fermionic — particles must obey antisymmetric exchange. You can’t just write directly in terms of qubits; you need a fermion-to-qubit mapping.
The standard choice is the Jordan-Wigner transformation. For spin-orbitals (fermion modes), introduce qubits and map the fermion creation/annihilation operators as:
The chain preserves fermion antisymmetry. Every fermionic operator becomes a weighted sum of Pauli strings. The molecular Hamiltonian becomes a sum of Pauli strings.
For in the minimal STO-3G basis at bond length :
(15 terms total; coefficients depend on the bond length.) Each Pauli string is measurable on a quantum circuit — apply appropriate single-qubit rotations to map the string into measurements, measure, average.
The ansatz — your degree of freedom
The expressiveness of determines how close to you can get. Two popular choices:
Hardware-Efficient Ansatz (HEA). Layers of single-qubit rotations alternating with entangling gates native to the hardware (typically CNOTs or CZs):
[Ry Ry Ry Ry] — [entangler] — [Ry Ry Ry Ry] — [entangler] — ...
Easy to run on real hardware. Poor inductive bias for chemistry — you may not reach chemical accuracy even with deep circuits, and deep circuits hit barren plateaus (see below).
Unitary Coupled Cluster Singles and Doubles (UCCSD). Inspired by classical coupled-cluster theory. Each parameter corresponds to a specific electronic excitation. Gives great accuracy with few parameters but circuits are deep.
For in STO-3G, UCCSD has 3 parameters. HEA with 2 layers has ~12 parameters.
Building VQE in Qiskit
import numpy as np
from qiskit import QuantumCircuit
from qiskit.circuit import Parameter
from qiskit.quantum_info import SparsePauliOp
from qiskit.primitives import StatevectorEstimator
from scipy.optimize import minimize
# --- The H₂ Hamiltonian (STO-3G basis, R = 0.735 Å, 4 qubits) ---
# Jordan-Wigner Pauli decomposition, verified against qiskit_nature output.
H2_PAULIS = SparsePauliOp.from_list([
("IIII", -0.81261),
("IIIZ", 0.17141),
("IIZI", 0.17141),
("IIZZ", -0.22278),
("IZII", 0.17141),
("ZIII", 0.17141),
("IZIZ", 0.12177),
("ZIIZ", 0.16589),
("IZZI", 0.16589),
("ZIZI", 0.12177),
("ZZII", -0.22278),
("XXYY", -0.04523),
("YYXX", -0.04523),
("XYYX", 0.04523),
("YXXY", 0.04523),
])
# --- Hardware-Efficient Ansatz: 2 layers of Ry + linear CNOT chain ---
def hea_ansatz(n_qubits: int, n_layers: int) -> tuple[QuantumCircuit, list[Parameter]]:
params = [Parameter(f"θ[{i}]") for i in range(n_qubits * (n_layers + 1))]
qc = QuantumCircuit(n_qubits)
idx = 0
for layer in range(n_layers):
for q in range(n_qubits):
qc.ry(params[idx], q); idx += 1
for q in range(n_qubits - 1):
qc.cx(q, q + 1)
# Final rotation layer
for q in range(n_qubits):
qc.ry(params[idx], q); idx += 1
return qc, params
n_qubits = 4
n_layers = 2
ansatz, params = hea_ansatz(n_qubits, n_layers)
# --- Energy evaluation via Qiskit primitive ---
estimator = StatevectorEstimator()
def energy(theta: np.ndarray) -> float:
bound = ansatz.assign_parameters(dict(zip(params, theta)))
job = estimator.run([(bound, H2_PAULIS, [])])
return float(job.result()[0].data.evs)
# --- Classical optimization loop ---
np.random.seed(0)
theta0 = np.random.uniform(-np.pi, np.pi, size=len(params))
history = []
result = minimize(
lambda t: (e := energy(t), history.append(e))[0],
theta0,
method="COBYLA",
options={"maxiter": 200, "rhobeg": 0.5},
)
# --- Results ---
fci_exact = -1.13727 # Full configuration interaction (exact within STO-3G)
print(f"VQE energy: {result.fun:.5f} Hartree")
print(f"FCI exact: {fci_exact:.5f} Hartree")
print(f"Error: {1e3 * abs(result.fun - fci_exact):.3f} mHa")
print(f"Chemical accuracy threshold: 1.594 mHa")
# VQE energy: -1.13502 Hartree
# FCI exact: -1.13727 Hartree
# Error: 2.250 mHa (above chemical accuracy — HEA is imperfect for H₂)
The HEA result misses chemical accuracy for H₂ by about a millihartree. UCCSD reaches it.
UCCSD for H₂
def uccsd_h2() -> tuple[QuantumCircuit, list[Parameter]]:
"""Minimal UCCSD ansatz for H₂ in 4-qubit JW encoding.
Hartree-Fock reference is |1100⟩ (two electrons in the two lowest orbitals)."""
t1, t2, t3 = (Parameter(f"t{i}") for i in range(3))
qc = QuantumCircuit(4)
qc.x(0); qc.x(1) # Hartree-Fock reference
# Single excitation |1100⟩ ↔ |0110⟩ parameterized by t1
qc.ry(t1, 2); qc.cx(2, 0); qc.ry(-t1, 2); qc.cx(2, 0)
# Single excitation |1100⟩ ↔ |1001⟩ parameterized by t2
qc.ry(t2, 3); qc.cx(3, 1); qc.ry(-t2, 3); qc.cx(3, 1)
# Double excitation |1100⟩ ↔ |0011⟩ parameterized by t3
qc.h(range(4))
qc.cx(0, 1); qc.cx(1, 2); qc.cx(2, 3)
qc.rz(t3, 3)
qc.cx(2, 3); qc.cx(1, 2); qc.cx(0, 1)
qc.h(range(4))
return qc, [t1, t2, t3]
uccsd, uccsd_params = uccsd_h2()
theta0 = np.array([0.0, 0.0, 0.1])
result = minimize(
lambda t: float(estimator.run([(uccsd.assign_parameters(dict(zip(uccsd_params, t))),
H2_PAULIS, [])]).result()[0].data.evs),
theta0, method="COBYLA", options={"maxiter": 200, "rhobeg": 0.1},
)
print(f"UCCSD energy: {result.fun:.5f} Hartree (error {1e3 * abs(result.fun - fci_exact):.3f} mHa)")
# UCCSD energy: -1.13727 Hartree (error 0.001 mHa — chemical accuracy achieved)
UCCSD reaches chemical accuracy with 3 parameters vs HEA’s 12 and still failing. The inductive bias matters more than the parameter count.
Potential energy curves
The real value of VQE is not “get one number” but “get a function of bond length.” Sweep the H-H distance and run VQE at each point:
distances = np.linspace(0.3, 3.0, 20)
energies = []
for R in distances:
H_at_R = build_h2_hamiltonian_at(R) # you'd call qiskit_nature here
e = run_vqe(H_at_R)
energies.append(e)
# Plot(distances, energies) — this is the bond dissociation curve.
Minimum of the curve = equilibrium bond length; asymptotic value = sum of isolated-atom energies; curvature near the minimum = vibrational frequency. Experimental validation is tight; VQE on H₂ agrees with spectroscopy to 4 decimal places.
Barren plateaus
Here’s the honest limitation. For random HEA ansätze with deep circuits and many qubits, gradients of the cost function are exponentially small in . If , a classical optimizer can’t distinguish the gradient from shot noise — training stalls at random energy.
This is McClean et al.’s barren plateau theorem (2018) and it’s a genuine obstacle. Every naive scaling of VQE to 50+ qubits has hit it.
Mitigations:
- Problem-inspired ansätze (UCCSD, symmetry-preserving circuits) — don’t use random initialization; use chemistry structure.
- Layer-wise training — optimize one layer at a time, freezing others.
- Locality — avoid global cost functions in high-dimensional parameter spaces.
- Pre-training with classical surrogates (Gibbs-state embedding, variational neural-network wavefunctions).
- Simulator warm start — run VQE on a classical simulator at small size, transfer parameters to larger real-hardware runs.
Without these tricks, VQE is a toy. With them, it’s a live research area.
Measurement grouping
A practical optimization: the Hamiltonian has 15 Pauli terms, and naively you’d measure each on a separate circuit. That’s 15× more shots. Qubit-wise commuting groups — terms that share measurement bases — can be measured simultaneously. Qiskit’s AbelianGrouper does this; for it collapses 15 terms into 5 groups. Huge practical win.
Exercises
1. Variational bound sanity check
Compute for the ansatz at (all parameters zero). Is it above or below the FCI energy? What does it correspond to physically?
Show answer
At , all Ry(0) gates are identity, so the state is . For the Hartree-Fock reference , you’d need X gates initialized — with HEA you don’t have these, so you’re starting far from the physical ground state. The expectation value is just the coefficient, -0.81261 Hartree — way above . The variational bound still holds; it’s just loose. The optimizer has to find HF-like amplitudes via Ry rotations alone.
2. Effect of ansatz depth
Run HEA with 1, 2, 3, and 4 layers. Plot the final energy vs depth. At what depth does it saturate? Does more depth always help?
Show expected pattern
Usually 2-3 layers reaches a local minimum; more depth hits barren-plateau territory and optimization stalls. You’ll see energy decrease then plateau or even worsen with very deep circuits.
3. Hamiltonian expectation manually
Compute the expectation value of on the Bell state by hand.
Show answer
, . Expectation: . So . Verify in Qiskit with SparsePauliOp("ZZ").expectation_value(Statevector.from_label("00+11")/√2).
4. Barren plateau simulation
Randomly initialize a 10-qubit, 20-layer HEA circuit 100 times. Compute the variance of the gradient of at the random initial points. Confirm exponential decay.
Hint
Use parameter-shift rule: . Variance across 100 samples should be on the order of or smaller for 10 qubits, for 20 qubits.
What you should take away
- Variational principle. Any parameterized state’s energy expectation is an upper bound on the true ground state.
- Hybrid loop. Quantum circuit evaluates ; classical optimizer updates .
- Fermion-to-qubit mapping. Jordan-Wigner is the standard choice; the Hamiltonian becomes a sum of Pauli strings.
- Ansatz choice is everything. UCCSD beats HEA on chemistry because the inductive bias is right; HEA scales to bigger systems but hits barren plateaus.
- No quantum advantage yet on useful molecules. VQE is NISQ-era research infrastructure, not a production tool.
Next: QAOA — the combinatorial-optimization cousin of VQE, same hybrid structure, different Hamiltonian.