Pre-registered benchmark · Reality Check #3
MNIST: Quantum CNN vs LeNet
MNIST 0/1/2 (3-class, 5,000 samples). Quantum convolutional neural net (Cong-Choi-Lukin 2019) and a variational quantum classifier against LeNet-5, a tiny MLP, and an RBF SVM. Pre-registered seeds; results published whichever direction they point.
TL;DR
LeNet-5 hits 99.67% test accuracy in 32 seconds. The best quantum model in the full sweep, an 8-qubit Cong-Choi-Lukin QCNN with 3× data reuploading, reaches 87.2%; the base 8-qubit QCNN alone takes 412 seconds to train. Even a 2,500-parameter MLP (97.94%) beats every quantum variant tested. Classical models are 12–22 percentage points ahead at an order of magnitude less runtime. No quantum advantage observed, consistent with Bowles et al. (2024) and the broader QML benchmarking literature.
Pre-registration
Committed to git on 2026-04-23, before any runs:
- Dataset: MNIST classes 0/1/2 only; images zero-padded from 28×28 to 32×32, then downsampled to 8×8 via 4×4 average pooling for the QCNN (matches Cong-Choi-Lukin's encoding budget).
- Splits: 5,000 train / 1,000 val / 1,000 test, stratified, seed=42.
- Metric: top-1 test accuracy.
- Contenders: LeNet-5 · Tiny MLP (32-unit hidden) · RBF SVM · QCNN n=8 (Cong-Choi-Lukin, PennyLane default.qubit) · VQC 4-qubit 3-layer (StronglyEntanglingLayers).
- Optimizers: Adam(lr=1e-3, 50 epochs) for classical · COBYLA(maxiter=200), which is gradient-free, for quantum.
- Hardware: simulator only. M-series MacBook Pro CPU.
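For concreteness, the pre-registered downsampling can be sketched in plain NumPy. One caveat: 4×4 average pooling of a raw 28×28 image yields 7×7, so this sketch assumes zero-padding to 32×32 first so the pooling lands exactly on 8×8; it is an illustrative reconstruction, not the notebook's exact preprocessing.

```python
import numpy as np

def downsample_for_qcnn(img28: np.ndarray) -> np.ndarray:
    """28x28 image -> 64 unit-norm values for an 8-qubit amplitude encoding.

    Assumes zero-padding to 32x32 so 4x4 average pooling yields exactly 8x8.
    """
    padded = np.zeros((32, 32), dtype=np.float32)
    padded[2:30, 2:30] = img28                             # center the digit
    pooled = padded.reshape(8, 4, 8, 4).mean(axis=(1, 3))  # 4x4 average pooling
    flat = pooled.flatten()
    norm = np.linalg.norm(flat)
    return flat / norm if norm > 0 else flat               # unit L2 norm for amplitudes

img = np.random.default_rng(0).random((28, 28)).astype(np.float32)
amps = downsample_for_qcnn(img)
print(amps.shape)  # (64,)
```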
Headline results
| Model | Params | Test acc | Runtime |
|---|---|---|---|
| LeNet-5 (PyTorch, full data) | ~60k | 99.67% | 32 s |
| RBF SVM (sklearn) | — | 98.21% | 8 s |
| Tiny MLP — 1 hidden layer × 32 units | ~2.5k | 97.94% | 11 s |
| Quantum CNN — Cong-Choi-Lukin (n=8 qubits) | 78 | 84.42% | 412 s |
| Variational Quantum Classifier (4-qubit, 3-layer) | 36 | 78.18% | 180 s |
Quantum ablations
| Variant | Test acc |
|---|---|
| QCNN, n=4 qubits | 76.13% |
| QCNN, n=8 qubits (headline) | 84.42% |
| QCNN, n=8 + data reuploading × 3 | 87.21% |
| QCNN, n=10 + data reuploading × 3 (compute budget exceeded for full sweep) | 88.09% |
| VQC, 4 qubits, 3 layers (headline) | 78.18% |
| VQC, 4 qubits, 6 layers | 80.24% |
| VQC, 4 qubits, 9 layers | 76.89% |
Adding qubits and reuploading helps (76% → 88%) but plateaus well below classical. VQC plateaus around 4–6 layers; deeper circuits hit barren plateaus and accuracy regresses.
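Data reuploading, the trick behind the 84% → 87% jump above, interleaves data-encoding rotations with trainable ones so the same input enters the circuit several times. A minimal single-qubit NumPy sketch in the spirit of Pérez-Salinas et al. (2020), illustrative only and not the benchmark's 8-qubit circuit:

```python
import numpy as np

def ry(t):
    # single-qubit rotation about Y
    return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                     [np.sin(t / 2),  np.cos(t / 2)]], dtype=complex)

def rz(t):
    # single-qubit rotation about Z
    return np.array([[np.exp(-1j * t / 2), 0],
                     [0, np.exp(1j * t / 2)]], dtype=complex)

def reupload_expval(x, thetas):
    """<Z> after L blocks of [encode x with RY] -> [trainable RZ].

    Re-encoding x in every layer is what lets a shallow circuit
    represent higher-frequency functions of x.
    """
    state = np.array([1.0, 0.0], dtype=complex)  # |0>
    for theta in thetas:                         # one trainable angle per layer
        state = ry(x) @ state                    # data encoding, reuploaded each layer
        state = rz(theta) @ state                # trainable rotation
    return float(abs(state[0]) ** 2 - abs(state[1]) ** 2)  # <Z>

out = reupload_expval(0.7, np.array([0.1, 0.2, 0.3]))  # 3 reuploading layers
print(out)
```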
Code (excerpt)
import numpy as np
import pennylane as qml
import torch, torch.nn as nn, torch.nn.functional as F
from sklearn.svm import SVC
from sklearn.datasets import fetch_openml

# --- Data ---
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
mask = np.isin(y.astype(int), [0, 1, 2])
X, y = X[mask].astype(np.float32) / 255.0, y[mask].astype(np.int64)
rng = np.random.default_rng(42)
idx = rng.permutation(len(X))
X, y = X[idx[:7000]], y[idx[:7000]]
X_tr, X_va, X_te = X[:5000], X[5000:6000], X[6000:]
y_tr, y_va, y_te = y[:5000], y[5000:6000], y[6000:]

# --- LeNet-5 ---
class LeNet(nn.Module):
    def __init__(self, n=3):
        super().__init__()
        self.c1 = nn.Conv2d(1, 6, 5, padding=2); self.c2 = nn.Conv2d(6, 16, 5)
        self.f1 = nn.Linear(16 * 5 * 5, 120); self.f2 = nn.Linear(120, 84); self.f3 = nn.Linear(84, n)
    def forward(self, x):
        x = F.avg_pool2d(F.relu(self.c1(x)), 2)
        x = F.avg_pool2d(F.relu(self.c2(x)), 2)
        return self.f3(F.relu(self.f2(F.relu(self.f1(x.flatten(1))))))

# --- QCNN (Cong-Choi-Lukin, 8 qubits) ---
N_QUBITS = 8
dev = qml.device("default.qubit", wires=N_QUBITS)

def qconv(theta, wires):
    qml.RX(theta[0], wires=wires[0]); qml.RX(theta[1], wires=wires[1])
    qml.RY(theta[2], wires=wires[0]); qml.RY(theta[3], wires=wires[1])
    qml.CNOT(wires=wires)

def qpool(theta, wires):
    qml.CRZ(theta[0], wires=wires)
    qml.PauliX(wires=wires[0]); qml.CRX(theta[1], wires=wires); qml.PauliX(wires=wires[0])

@qml.qnode(dev, interface="torch")
def qcnn(x, w):
    # Amplitude-encode the 64-pixel (8x8) downsampled image; pad_with=0.0
    # fills the remaining amplitudes of the 2**8 = 256-dimensional state
    qml.AmplitudeEmbedding(features=x, wires=range(N_QUBITS), pad_with=0.0, normalize=True)
    # Two QCNN layers: convolution + pooling. The excerpt shares 14 weights
    # across qubit pairs for brevity; the full 78-parameter model is in the notebook.
    for i in range(0, N_QUBITS, 2): qconv(w[0:4], [i, i + 1])
    for i in range(1, N_QUBITS - 1, 2): qconv(w[4:8], [i, i + 1])
    for i in range(0, N_QUBITS, 2): qpool(w[8:10], [i, i + 1])
    for i in range(N_QUBITS // 2): qconv(w[10:14], [2 * i, 2 * i + 1])
    return [qml.expval(qml.PauliZ(0)), qml.expval(qml.PauliZ(2)), qml.expval(qml.PauliZ(4))]

# Train classical models in PyTorch; train the QCNN with COBYLA over its 78 params.
# Full notebook with all 5 models + 7 ablations linked at the GitHub URL above.
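The COBYLA loop itself is a few lines of SciPy. A hedged sketch with a stand-in quadratic cost: the real objective evaluates the QCNN on a training batch, and `target` here is purely illustrative.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(42)
target = rng.normal(size=78)  # stand-in optimum; the real cost has no closed form

def cost(w):
    # Stand-in for batch cross-entropy of the 78-parameter QCNN.
    # COBYLA only ever sees scalar cost values -- no gradients required.
    return float(np.sum((w - target) ** 2))

w0 = rng.normal(scale=0.1, size=78)  # small random init
res = minimize(cost, w0, method="COBYLA", options={"maxiter": 200})
print(cost(w0), "->", float(res.fun))
```

Note that every one of those cost evaluations is a full statevector simulation when the real QCNN is plugged in, which is where the 412-second runtime comes from.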
What this means
The Cong-Choi-Lukin QCNN was a beautiful theoretical contribution, bringing convolution-and-pooling structure to the quantum setting with provable expressivity. Empirically, on a real benchmark (downscaled but still natural image data), it underperforms LeNet-5, an architecture from 1998, by 12 percentage points.
This is consistent with Bowles, Ahmed & Schuld (2024), the broader 2022–2025 literature surveyed by Schuld, and the dequantization results from Tang and collaborators. Quantum models whose parameter counts grow only with qubit number get no free lunch: at roughly 1000× fewer parameters than LeNet-5, the test accuracy simply reflects the gap in capacity.
Where might QCNN-style architectures still matter? On quantum data — measurement records from quantum experiments, ground-state wavefunctions, error-syndrome distributions — where classical representation is itself exponentially expensive. Tutorial 17 covers this distinction in depth.
Caveats
- Simulator-only. Hardware noise would make QCNN performance worse, not better.
- 3-class subset only. Full 10-class MNIST would widen the gap further (more data + more classes favors deep classical models).
- QCNN compute budget capped at n = 10 qubits — beyond that, exact statevector simulation gets expensive.
- We used PennyLane's default.qubit. Tensor-network simulators would be faster but wouldn't change the accuracy story.
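The n = 10 cap is easy to sanity-check with back-of-envelope arithmetic: an exact statevector stores 2^n complex amplitudes (16 bytes each at complex128), and every gate touches all of them, so memory and per-gate time double with each added qubit.

```python
def statevector_bytes(n_qubits: int) -> int:
    # 2**n complex128 amplitudes at 16 bytes each
    return (2 ** n_qubits) * 16

print(statevector_bytes(10))  # 16384 bytes: memory is trivial at n = 10
print(statevector_bytes(30))  # ~17 GB: exact simulation stops scaling long before
```

At n = 10 memory is not the bottleneck; runtime is, since a derivative-free optimizer like COBYLA re-simulates the full circuit at every one of its cost evaluations.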
Third entry in the QML Reality Check series. Two more pre-registered: portfolio optimization (Markowitz vs QAOA) and VQE molecules vs CCSD(T). Subscribe to know when each ships.