Quantum Outpost
variational · intermediate · 23 min read

QAOA for Combinatorial Optimization

QAOA encodes a combinatorial problem as a cost Hamiltonian, prepares a variational state by alternating cost and mixer evolutions, and uses a classical optimizer to find approximate solutions. This tutorial derives the MaxCut case from scratch, runs it in Qiskit, and honestly compares QAOA to classical baselines like Goemans-Williamson.

Prerequisites: Tutorial 13: Variational Quantum Eigensolver

QAOA — the Quantum Approximate Optimization Algorithm — is VQE’s cousin, reshaped for combinatorial problems. It was proposed by Farhi, Goldstone, and Gutmann in 2014 as a near-term quantum heuristic for problems like MaxCut, graph coloring, Max-SAT, and portfolio optimization. The promise: use shallow circuits to produce approximately optimal solutions better than classical heuristics.

Ten years later, the reality is nuanced. QAOA with small depth (p = 1, 2) often matches but doesn’t beat strong classical baselines. At deeper p, the circuit gets hard to optimize. This tutorial walks through the math, ships a working implementation, and gives you the honest comparison: what QAOA can and can’t do in 2026.

The problem framing

Combinatorial optimization: find a bit string z \in \{0,1\}^n that maximizes (or minimizes) a cost function C : \{0,1\}^n \to \mathbb{R}. MaxCut is the canonical instance: given a graph G = (V, E), partition V into two sets to maximize the number of edges crossing the cut.

MaxCut cost function:

C(z) = \sum_{(i,j) \in E} \frac{1 - z_i z_j}{2} \qquad \text{with } z_i \in \{-1, +1\}.

Edge (i,j) contributes 1 if z_i \neq z_j (cut) and 0 if they match.
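At this size you can brute-force the cost function directly, which is a useful sanity check before any quantum machinery. A minimal sketch using the 4-cycle that serves as this tutorial's running example:

```python
from itertools import product

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # the 4-cycle used later in this tutorial
n = 4

def cut_value(z, edges):
    # z_i in {-1, +1}; each edge contributes (1 - z_i z_j) / 2
    return sum((1 - z[i] * z[j]) // 2 for i, j in edges)

values = [cut_value(z, edges) for z in product([-1, 1], repeat=n)]
print(max(values))                # 4   (the optimal cut)
print(sum(values) / len(values))  # 2.0 (expected cut of a uniform random string)
```

Each edge is cut with probability 1/2 under a uniform random assignment, which is why the average lands at exactly half the edge count.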

Encode the cost as a Hamiltonian

Map z_i \to Z_i (Pauli Z eigenvalues are \pm 1). The cost Hamiltonian is:

H_C = \sum_{(i,j) \in E} \frac{1 - Z_i Z_j}{2}.

Eigenvalues of H_C on computational-basis states equal the MaxCut values of the corresponding bit strings. Since we’re maximizing, the optimal bit string is the highest-energy eigenstate of H_C (equivalently, the ground state of -H_C), encoded as a basis state.
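You can check this correspondence numerically with plain NumPy: build H_C from Kronecker products and confirm it is diagonal, with the cut values of the 16 bit strings on the diagonal. A sketch for the 4-cycle, using Qiskit's little-endian qubit ordering:

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
n = 4
I2, Z = np.eye(2), np.diag([1.0, -1.0])

def zz_term(i, j, n):
    # Tensor product with Z at qubits i and j (qubit 0 = rightmost factor)
    ops = [Z if q in (i, j) else I2 for q in reversed(range(n))]
    M = ops[0]
    for op in ops[1:]:
        M = np.kron(M, op)
    return M

H_C = sum((np.eye(2 ** n) - zz_term(i, j, n)) / 2 for i, j in edges)

print(np.allclose(H_C, np.diag(np.diag(H_C))))  # True: H_C is diagonal
print(np.diag(H_C).max())                        # 4.0: top eigenvalue = optimal cut
print(int((np.diag(H_C) == 4).sum()))            # 2: the two alternating assignments
```

The two diagonal entries equal to 4 sit at indices 5 (binary 0101) and 10 (binary 1010), matching the two optimal cuts sampled later in the tutorial.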

The QAOA ansatz

Alternate two unitaries p times:

Cost unitary. U_C(\gamma) = e^{-i\gamma H_C}. For MaxCut, each edge contributes a factor e^{-i\gamma (I - Z_iZ_j)/2}, implementable up to global phase as an R_z rotation on qubit j sandwiched between two CNOTs.

Mixer unitary. U_M(\beta) = e^{-i\beta H_M}, where H_M = \sum_i X_i. Just R_x(2\beta) on every qubit.

Full QAOA-p circuit:

|\psi(\vec{\gamma}, \vec{\beta})\rangle = U_M(\beta_p)\,U_C(\gamma_p)\,\cdots\,U_M(\beta_1)\,U_C(\gamma_1)\,|+\rangle^{\otimes n}.

Initial state is the uniform superposition (easy to prepare with Hadamards). A classical optimizer finds \vec{\gamma}, \vec{\beta} that maximize \langle\psi|H_C|\psi\rangle. Then sample the final state to read off good bit strings.

Why this works (intuition)

At p = \infty with appropriately slow-scheduled \gamma and \beta, QAOA becomes the quantum adiabatic algorithm — evolving slowly from the extremal eigenstate of H_M (the uniform superposition) to the extremal eigenstate of H_C (the optimal bit string); since we’re maximizing, “extremal” here means highest-energy. QAOA-p is a truncation: p steps of a discretized adiabatic evolution.

At small p, the circuit is shallow enough to run on NISQ hardware, and the cost landscape in (\vec\gamma, \vec\beta) is often surprisingly well-structured — you can find good solutions by parameter sweep or gradient descent.
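To see that structure concretely, here is a sketch of a brute-force (γ, β) sweep for p = 1 on the 4-cycle, simulated directly in NumPy (no Qiskit needed at this size). The landscape is smooth, and the best grid point reaches an expected cut close to 3:

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
n = 4
# Cost values C(b) for every basis state: the diagonal of H_C
c = np.array([sum(1 for i, j in edges if ((b >> i) & 1) != ((b >> j) & 1))
              for b in range(2 ** n)], dtype=float)

def expected_cut(gamma, beta):
    psi = np.full(2 ** n, 2 ** (-n / 2), dtype=complex)  # |+>^n
    psi = np.exp(-1j * gamma * c) * psi                  # cost layer: diagonal phases
    rx = np.array([[np.cos(beta), -1j * np.sin(beta)],   # e^{-i beta X} on one qubit
                   [-1j * np.sin(beta), np.cos(beta)]])
    U = rx
    for _ in range(n - 1):
        U = np.kron(U, rx)                               # mixer on all qubits
    phi = U @ psi
    return float(np.real(phi.conj() @ (c * phi)))

gammas = np.linspace(0, np.pi, 40)
betas = np.linspace(0, np.pi / 2, 40)
grid = np.array([[expected_cut(g, b) for b in betas] for g in gammas])
print(round(grid.max(), 2))  # ≈ 3.0: QAOA-1's best expected cut on the 4-cycle
```

The single broad maximum (rather than a rugged landscape) is what makes coarse sweeps and local optimizers viable at small p.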

MaxCut on a 4-node graph

Graph: cycle C_4, edges (0,1), (1,2), (2,3), (3,0). Optimal cut: 4 (the alternating color assignment). A uniformly random bit string cuts each edge with probability 1/2, so its expected cut value is 2.

import numpy as np
from qiskit import QuantumCircuit
from qiskit.circuit import Parameter
from qiskit.quantum_info import SparsePauliOp
from qiskit.primitives import StatevectorEstimator
from scipy.optimize import minimize


edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
n = 4


def maxcut_hamiltonian(edges, n) -> SparsePauliOp:
    """H_C = Σ_{(i,j)} (I - Z_i Z_j)/2"""
    terms = []
    for i, j in edges:
        paulis = ["I"] * n
        paulis[i] = "Z"; paulis[j] = "Z"
        terms.append(("".join(reversed(paulis)), -0.5))
        terms.append(("I" * n, 0.5))
    return SparsePauliOp.from_list(terms).simplify()


H_C = maxcut_hamiltonian(edges, n)


def qaoa_circuit(p: int) -> tuple[QuantumCircuit, list[Parameter]]:
    gammas = [Parameter(f"γ_{i}") for i in range(p)]
    betas = [Parameter(f"β_{i}") for i in range(p)]
    qc = QuantumCircuit(n)
    qc.h(range(n))                               # |+⟩^n
    for layer in range(p):
        # Cost unitary: one ZZ rotation per edge (e^{-iγH_C} up to global phase and a fixed rescaling of γ)
        for i, j in edges:
            qc.cx(i, j); qc.rz(2 * gammas[layer], j); qc.cx(i, j)
        # Mixer: e^{-i β H_M}
        for q in range(n):
            qc.rx(2 * betas[layer], q)
    return qc, gammas + betas


p = 2
ansatz, params = qaoa_circuit(p)
estimator = StatevectorEstimator()


def cost(theta: np.ndarray) -> float:
    bound = ansatz.assign_parameters(dict(zip(params, theta)))
    ev = estimator.run([(bound, H_C, [])]).result()[0].data.evs
    return -float(ev)                            # minimize -⟨H_C⟩ to maximize


theta0 = np.array([0.5, 0.5, 0.3, 0.3])           # [γ1, γ2, β1, β2]
result = minimize(cost, theta0, method="COBYLA", options={"maxiter": 200, "rhobeg": 0.2})

print(f"QAOA-{p} expected cut: {-result.fun:.4f}")
print(f"Optimal cut for C₄:   4")
print(f"Random cut (baseline): 2")
# QAOA-2 expected cut: 3.9998 ← basically optimal

At p = 2 on a 4-cycle, QAOA finds the optimal cut essentially exactly. The optimizer converges in ~60 iterations.

Sample the final state for a concrete bit string:

from qiskit.primitives import StatevectorSampler

bound = ansatz.assign_parameters(dict(zip(params, result.x)))
bound_meas = bound.copy(); bound_meas.measure_all()
sampler = StatevectorSampler()
counts = sampler.run([(bound_meas, [])], shots=2048).result()[0].data.meas.get_counts()

for bits, cnt in sorted(counts.items(), key=lambda kv: -kv[1])[:4]:
    z = [1 if b == "0" else -1 for b in reversed(bits)]
    cut_value = sum((1 - z[i] * z[j]) / 2 for i, j in edges)
    print(f"|{bits}⟩: {cnt:>4d} shots, cut = {cut_value}")
# |0101⟩: 1023 shots, cut = 4
# |1010⟩: 1021 shots, cut = 4
# |0110⟩:    2 shots, cut = 2
# ...

The two optimal cuts (0101 and 1010, the two alternating-color assignments) dominate the distribution.

Try a small MaxCut in the playground — paste the cost-unitary QASM and see the diagram.

Benchmarking against Goemans-Williamson

The canonical classical baseline for MaxCut is the Goemans-Williamson (GW) SDP-based approximation algorithm, which achieves a worst-case ratio of \alpha_{GW} \approx 0.878 — the cut it returns has value at least 0.878 \times \text{OPT} in expectation. This is provably optimal under the Unique Games Conjecture; no polynomial-time classical algorithm can do better in the worst case.

How does QAOA compare?

  • QAOA-1 (p = 1): worst-case approximation ratio \approx 0.692 on 3-regular graphs (Farhi et al. 2014). Worse than GW.
  • QAOA-2 (p = 2): average-case ratios around 0.79 on random 3-regular graphs. Still below GW.
  • QAOA-p at large p: ratios approach 1 on many benchmark families, but you need depth that exceeds NISQ capabilities.
  • Warm-started QAOA (Egger et al. 2020): seed the initial state and mixer from a relaxed classical solution such as GW’s; empirically beats GW on specific instances. Still requires running the classical solver first.

Scaling up — Max-k-SAT, portfolio optimization

The QAOA framework is fully general. Any problem expressible as

H_C = \sum_k c_k\,\sigma_k

(where each \sigma_k is a Pauli string) fits. Examples:

  • Max-k-SAT. Each clause C contributes a tensor-product-of-Z-operators term; you build the cost Hamiltonian from the clauses.
  • Portfolio optimization. Markowitz mean-variance H = -\mu^T z + \lambda z^T \Sigma z. The covariance term z^T \Sigma z is quadratic in binary variables, hence expressible as a weighted sum of Z_i Z_j and Z_i terms.
  • Graph coloring. Encode each vertex’s color in a one-hot binary vector; penalize constraint violations.
  • Traveling Salesman. Requires O(n^2) qubits for n cities; large overhead. Not competitive below fault tolerance.
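As a worked instance of the portfolio case, the sketch below (toy numbers, plain NumPy) carries out the binary-to-spin substitution x_i = (1 - z_i)/2 by hand and verifies the resulting constant, linear, and pairwise coefficients against the original cost on all assignments. Replacing each z_i with the Pauli Z_i then gives exactly the promised sum of Z_i and Z_i Z_j terms:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(7)
n = 3
mu = rng.normal(size=n)                        # toy expected returns (made-up numbers)
A = rng.normal(size=(n, n)); Sigma = A @ A.T   # toy covariance matrix (PSD by construction)
lam = 0.5

# Substitute x_i = (1 - z_i)/2 (binary x_i, spin z_i in {-1,+1}) into
# H = -mu^T x + lam * x^T Sigma x and collect an Ising form:
#   const + sum_i h_i z_i + sum_{i<j} J_ij z_i z_j
# (the z_i^2 = 1 diagonal terms fold into the constant).
const = -mu.sum() / 2 + lam * (Sigma.sum() + np.trace(Sigma)) / 4
h = mu / 2 - lam * Sigma.sum(axis=1) / 2
J = lam * Sigma / 2                            # coefficient of z_i z_j for each pair i < j

def ising_energy(z):
    z = np.asarray(z, dtype=float)
    pairs = sum(J[i, j] * z[i] * z[j] for i in range(n) for j in range(i + 1, n))
    return const + h @ z + pairs

# Sanity check: the Ising form reproduces the original cost on every assignment
ok = all(
    np.isclose(ising_energy(z),
               -mu @ ((1 - np.asarray(z)) / 2)
               + lam * ((1 - np.asarray(z)) / 2) @ Sigma @ ((1 - np.asarray(z)) / 2))
    for z in product([-1, 1], repeat=n)
)
print(ok)  # True
```

The same substitution works for any QUBO, which is why QAOA tooling usually accepts problems in QUBO form and performs this mapping automatically.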

When to use QAOA

Honest guidance for indie developers and small teams:

  • Prototyping hybrid quantum-classical workflows. QAOA teaches the full hybrid stack; the lessons transfer.
  • Education. Any optimization tutorial is cleaner when the problem has both a classical solver and a quantum one you can compare.
  • Research. Algorithmic improvements to QAOA (warm starts, adaptive schedules, symmetry exploitation) are active areas.
  • Production: not yet. Classical heuristics (simulated annealing, Gurobi, specialized graph algorithms) beat QAOA at problem sizes big enough to matter. Revisit in 2028-2030 with fault-tolerant hardware.

Parameter landscapes

A practical curiosity: the QAOA landscape is more structured than you’d expect. For MaxCut on random graphs, Farhi et al. showed the optimal (\vec\gamma, \vec\beta) concentrate around typical values that generalize across instances. Transferring parameters from one small instance to a larger one often works — called parameter transferability or instance concentration.

This means you can:

  1. Find good parameters on a small classical simulator (up to \sim 20 qubits).
  2. Transfer to hardware for larger instances (up to \sim 100 qubits).
  3. Skip the expensive optimization on the noisy hardware.

This is the single biggest practical optimization when running QAOA on real hardware.
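A minimal numerical illustration of the transfer trick, in a toy setting with a plain-NumPy p = 1 simulator on rings: sweep angles on the 4-ring, then reuse them unchanged on the 8-ring.

```python
import numpy as np

def ring_cost_diag(n):
    # Diagonal of H_C for the n-cycle: cut value of each basis state
    edges = [(i, (i + 1) % n) for i in range(n)]
    return np.array([sum(1 for i, j in edges if ((b >> i) & 1) != ((b >> j) & 1))
                     for b in range(2 ** n)], dtype=float)

def expected_cut(n, gamma, beta):
    c = ring_cost_diag(n)
    psi = np.full(2 ** n, 2 ** (-n / 2), dtype=complex)  # |+>^n
    psi = np.exp(-1j * gamma * c) * psi                  # cost layer
    rx = np.array([[np.cos(beta), -1j * np.sin(beta)],
                   [-1j * np.sin(beta), np.cos(beta)]])  # e^{-i beta X}
    U = rx
    for _ in range(n - 1):
        U = np.kron(U, rx)
    phi = U @ psi                                        # mixer layer
    return float(np.real(phi.conj() @ (c * phi)))

# Tune (gamma, beta) on the cheap 4-qubit instance...
gs, bs = np.linspace(0, np.pi, 60), np.linspace(0, np.pi / 2, 60)
score, g, b = max((expected_cut(4, g, b), g, b) for g in gs for b in bs)

# ...and reuse them unchanged on the 8-qubit ring
print(round(score / 4, 3), round(expected_cut(8, g, b) / 8, 3))  # ≈ 0.75 for both
```

Rings are the friendliest case (every edge sees the same local neighborhood, so the per-edge expectation is identical across sizes); on irregular graphs the transferred angles are a strong starting point rather than an exact optimum.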

Exercises

1. Cost Hamiltonian by hand

Write the MaxCut cost Hamiltonian for the triangle graph K_3 (three nodes, three edges). Expand and identify the ground state by inspection.

Show answer

H_C = \tfrac{3}{2}I - \tfrac{1}{2}(Z_0 Z_1 + Z_1 Z_2 + Z_0 Z_2). Max cut value = 2 (any bipartite split of 3 nodes cuts 2 edges). Every bit string except |000\rangle and |111\rangle achieves 2. The degenerate ground space of -H_C is the 6-dimensional subspace spanned by the non-constant bit strings.
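A quick NumPy check of this answer, listing the diagonal of H_C for K_3 over the basis states |b_2 b_1 b_0⟩:

```python
import numpy as np

edges = [(0, 1), (1, 2), (0, 2)]   # triangle K_3
n = 3
# Cut value of each basis state = eigenvalue of H_C on that state
diag = np.array([sum(1 for i, j in edges if ((b >> i) & 1) != ((b >> j) & 1))
                 for b in range(2 ** n)])
print(diag)  # [0 2 2 2 2 2 2 0]: only |000> and |111> miss the max cut of 2
```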

2. QAOA-1 approximation

Implement QAOA at p = 1 for a 6-node random 3-regular graph (3-regular graphs need an even node count). Measure the average approximation ratio vs GW over 10 random instances.

Show approach
import numpy as np
import networkx as nx
import cvxpy as cp  # for the GW SDP

def goemans_williamson(G, n_rounds=50):
    n = G.number_of_nodes()
    X = cp.Variable((n, n), symmetric=True)
    constraints = [X >> 0] + [X[i, i] == 1 for i in range(n)]
    obj = 0.5 * sum(1 - X[i, j] for i, j in G.edges())
    cp.Problem(cp.Maximize(obj), constraints).solve()
    # Randomized hyperplane rounding, averaged over n_rounds
    V = np.linalg.cholesky(X.value + 1e-6 * np.eye(n))
    cuts = []
    for _ in range(n_rounds):
        signs = np.sign(V @ np.random.randn(n))
        cuts.append(sum(1 for i, j in G.edges() if signs[i] != signs[j]))
    return float(np.mean(cuts))

Compare QAOA-1 average cut to GW average cut over 10 instances.

3. Mixer alternatives

The default mixer H_M = \sum_i X_i explores the full 2^n bit-string space. For problems with symmetry (e.g., fixed-weight bit strings), a more structured mixer is better. Think: what’s the mixer for problems where you need exactly 3 ones among 8 bits? (Hint: look up “XY mixer” or “ring mixer.”)

Show sketch

Use H_M = \sum_i (X_i X_{i+1} + Y_i Y_{i+1}) — the XY (swap-preserving) mixer. This preserves total Hamming weight, so the state stays in the “exactly 3 ones” subspace throughout the evolution (provided the initial state, e.g. a Dicke state, lies in that subspace). Useful for graph coloring and portfolio selection problems.
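One way to convince yourself of the weight-preservation claim (a NumPy sketch on a 4-qubit ring): the XY mixer commutes with the Hamming-weight operator, so evolution under it never leaves a fixed-weight subspace.

```python
import numpy as np

n = 4
I2 = np.eye(2)
X = np.array([[0., 1.], [1., 0.]])
Y = np.array([[0., -1j], [1j, 0.]])

def op(paulis, n):
    # Tensor product with the given Paulis at the given qubits (little-endian)
    ops = [paulis.get(q, I2) for q in reversed(range(n))]
    M = ops[0]
    for o in ops[1:]:
        M = np.kron(M, o)
    return M

# Ring XY mixer: sum_i (X_i X_{i+1} + Y_i Y_{i+1}), indices mod n
H_M = sum(op({i: X, (i + 1) % n: X}, n) + op({i: Y, (i + 1) % n: Y}, n)
          for i in range(n))

# Hamming-weight operator: counts the 1s in each basis state
N_op = np.diag([bin(b).count("1") for b in range(2 ** n)]).astype(complex)

print(np.allclose(H_M @ N_op, N_op @ H_M))  # True: [H_M, N] = 0, weight is conserved
```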

4. Compare depth

Run QAOA-1, QAOA-2, QAOA-4 on the 4-cycle. Tabulate: (a) number of parameters, (b) circuit depth, (c) best expected cut, (d) wall-clock optimization time. Is more depth always better?

Expected

QAOA-1: 2 params, depth ~8, expected cut ~3.0. QAOA-2: 4 params, depth ~16, expected cut 4.0. QAOA-4: 8 params, depth ~32, expected cut 4.0 but optimizer stalls more often due to local minima. More depth doesn’t always help; the cost landscape gets more complex.

What you should take away

  • QAOA framework. Encode the cost as Hamiltonian H_C; alternate e^{-i\gamma H_C} and e^{-i\beta \sum_i X_i}; optimize (\vec\gamma, \vec\beta) classically.
  • Works for many problems. MaxCut, SAT, portfolio optimization, coloring — anything expressible as a sum of Pauli strings.
  • Classical baseline is Goemans-Williamson (0.878-approximation for MaxCut). QAOA-1 is worse; QAOA-p beats it only at impractical depth.
  • No quantum advantage demonstrated for real-world optimization as of 2026. Parameter transferability and warm starts are the closest wins.
  • Still worth learning for the hybrid-algorithm framework, which generalizes to VQE, QML, and near-term quantum software patterns.

Next: Quantum ML foundations — using parameterized circuits for classification, the encoding problem, and the parameter-shift rule for gradient-based training.


Weekly dispatch

Quantum, for people who already code.

One serious tutorial per week, plus the industry moves that actually matter. No hype, no hand-waving.

Free. Unsubscribe anytime. We will never sell your email.