Shor's Algorithm: Factoring, Order-Finding, and the End of RSA
Shor's factoring algorithm reduces integer factorization to the problem of finding the multiplicative order of a random element mod N — and uses quantum phase estimation to solve that in polynomial time. This tutorial derives the full algorithm, runs a small instance in Qiskit, and honestly assesses the real-world resource requirements to break RSA-2048.
Prerequisites: Tutorial 11: QFT and Phase Estimation
Shor’s algorithm is the reason governments take quantum computing seriously. Published in 1994 by Peter Shor at Bell Labs, it factors -bit integers in time polynomial in — exponentially faster than any known classical algorithm. That threat directly undermines RSA, the public-key cryptography protecting essentially all internet encryption, banking, and signed software. NIST’s post-quantum standardization (ML-KEM, ML-DSA, 2024) exists because of Shor.
This tutorial derives the algorithm carefully. It’s genuinely subtle — much more than “apply QFT, get answer.” The quantum part does one specific thing (find the multiplicative order of an element mod ), and the factoring follows from classical number theory. Getting the composition right matters.
The classical problem
Given a composite integer , find a nontrivial factor. The best known classical algorithm is the General Number Field Sieve (GNFS), with complexity
For at 2048 bits (the RSA-2048 standard), GNFS is estimated to require on the order of operations — infeasible on any classical computer humanity will build this century.
Shor’s quantum algorithm runs in time — polynomial in the number of digits. On a sufficiently large fault-tolerant quantum computer, RSA-2048 falls in hours.
The reduction: factoring → order-finding
Shor reduces factoring to a problem in number theory that turns out to be tractable for quantum computers. Here’s the reduction.
Order of an element. For coprime to , the multiplicative order of mod is the smallest positive integer such that . The classical gcd and Euclidean-algorithm operations are cheap. The hard part is computing .
Classical number-theoretic fact. If is even and , then and are both nontrivial factors of . (Because , and under the conditions neither factor is a multiple of .)
Probabilistic reduction. Pick a random with , (if , you already found a factor — stop). Find = order of . With probability at least , is even and . In that case, factorize using the gcd formulas. Else, try another .
Everything but “find ” is classical and polynomial. The quantum speedup is entirely in the order-finding.
Order-finding as phase estimation
Define the unitary acting on qubits:
(We extend to act as identity on for , making it unitary on the full -dim space.)
Two key facts:
- ‘s eigenvalues are -th roots of unity. Its order is : .
- Its eigenvectors are related to the roots of unity. Specifically, for each , define
Then .
QPE on with eigenvector yields , from which can be recovered classically.
The superposition trick
We can’t easily prepare — we don’t know or . But a magic identity:
(Expand the sum; the -sum projects to .) So if you prepare in the eigenvector register and run QPE, you effectively run QPE on a uniform superposition of all . The counting-register measurement returns a random for a uniformly random .
From to : continued fractions. If the measurement returns , apply the continued-fraction expansion to find the rational closest to with . With high probability, (or a divisor of it). If not, try again.
The full algorithm
- Pick random with .
- Use quantum phase estimation on with input state and counting qubits.
- Measure the counting register; get a binary fraction .
- Run continued-fraction expansion to extract .
- Verify classically that . If not, go back to step 2.
- If is odd or , go back to step 1.
- Compute . Return the nontrivial factors.
Total quantum queries expected: repetitions of step 2 (each succeeds with constant probability). Total classical work: polynomial in . Total circuit depth per quantum query: dominated by the controlled modular exponentiations, or so.
Example: factor
. Pick (coprime to 15). Order of 7 mod 15: . So (even). . Not . Great.
. . . Done.
For the quantum part, we need to implement controlled- gates on 4 target qubits (). For this small instance, we can hardcode the multiplication circuits. For general we’d need a proper modular-exponentiation circuit, which is the expensive part.
Qiskit implementation (toy scale)
Because the general modular exponentiation is complex to build from gates, and Qiskit’s didactic Shor class was removed in newer versions, we’ll use the manual QPE approach with a hand-built circuit. This is not scalable — but it’s honest about what the real algorithm does.
import numpy as np
from fractions import Fraction
from math import gcd
from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator
from qiskit.circuit.library import QFT
def c_amod15(a: int, power: int) -> QuantumCircuit:
"""Controlled multiplication by a^power mod 15 on 4 target qubits.
Uses the known permutation structure for a ∈ {2, 4, 7, 8, 11, 13}.
"""
if a not in (2, 4, 7, 8, 11, 13):
raise ValueError("a must be coprime to 15 (one of 2, 4, 7, 8, 11, 13).")
U = QuantumCircuit(4)
for _ in range(power):
if a in (2, 13):
U.swap(2, 3); U.swap(1, 2); U.swap(0, 1)
if a in (7, 8):
U.swap(0, 1); U.swap(1, 2); U.swap(2, 3)
if a in (4, 11):
U.swap(1, 3); U.swap(0, 2)
if a in (7, 11, 13):
for q in range(4): U.x(q)
controlled = U.to_gate(label=f"{a}^{power} mod 15").control(1)
return controlled
def qpe_order_finding(a: int, N: int = 15, n_count: int = 8) -> QuantumCircuit:
qc = QuantumCircuit(n_count + 4, n_count)
# Counting register into superposition
qc.h(range(n_count))
# Target register starts |0001⟩ = |1⟩
qc.x(n_count)
# Controlled powers of U_a
for k in range(n_count):
qc.append(c_amod15(a, 2 ** k), [k, *range(n_count, n_count + 4)])
# Inverse QFT on counting register
qc.append(QFT(n_count, inverse=True, do_swaps=True), range(n_count))
qc.measure(range(n_count), range(n_count))
return qc
def shor_factor(N: int, attempts: int = 10) -> tuple[int, int] | None:
if N % 2 == 0:
return (2, N // 2)
sim = AerSimulator()
for _ in range(attempts):
a = np.random.choice([2, 4, 7, 8, 11, 13])
g = gcd(int(a), N)
if g > 1:
return (int(g), N // int(g))
qc = qpe_order_finding(int(a), N)
tqc = transpile(qc, sim)
counts = sim.run(tqc, shots=1).result().get_counts()
bits = next(iter(counts))
phi = int(bits, 2) / 2 ** len(bits)
frac = Fraction(phi).limit_denominator(N)
r = frac.denominator
if r % 2 != 0 or r == 1: continue
candidate = pow(int(a), r // 2, N)
if candidate == N - 1: continue
f1 = gcd(candidate - 1, N)
f2 = gcd(candidate + 1, N)
if 1 < f1 < N: return (int(f1), N // int(f1))
if 1 < f2 < N: return (int(f2), N // int(f2))
return None
np.random.seed(42)
factors = shor_factor(15)
print(f"15 = {factors[0]} × {factors[1]}")
# 15 = 3 × 5
The circuit has qubits and a few hundred gates after transpilation. On a real IBM Quantum free-tier machine, results are noisy but recognizable — you’ll often need several repetitions for a clean answer. On a simulator, success is near-certain.
What it would take to factor RSA-2048
Three resource estimates from peer-reviewed work (Gidney & Ekerå 2021 is the standard reference):
| Resource | Estimate |
|---|---|
| Logical qubits | ~4,100 |
| Physical qubits (assuming surface code at gate error) | ~20 million |
| Wall-clock time | ~8 hours |
| Total Toffoli-equivalent operations | ~ |
| Total T-gates (post-distillation) | ~ |
As of 2026, the largest known Shor-type factoring demonstration on real hardware is factoring 21 (Wang et al., 2024) using hybrid variational methods on a 5-qubit device; pure Shor demonstrations have been stuck at since Vandersypen et al. 2001. The gap between “15” and “2048-bit” is about 14 orders of magnitude in qubit count.
Google’s Willow (105 physical qubits, Dec 2024) is the first chip to demonstrate below-threshold surface-code error correction. IBM’s Starling roadmap (200 logical qubits by 2029) and Blueprint (2033) are the first credible paths toward thousands of logical qubits. Most experts estimate RSA-2048 falls somewhere between 2030 and 2040.
What post-quantum cryptography actually changes
Shor breaks RSA, DSA, ECDSA, Diffie-Hellman, and all integer-factoring or discrete-log-based public-key cryptography. It does not break:
- Symmetric ciphers like AES-256, ChaCha20 (Grover gives a quadratic, not exponential, speedup — doubling key size restores security).
- Hash functions like SHA-256, SHA-3 (Grover quadratic speedup — use 256-bit hashes for 128-bit post-quantum security).
- Lattice-based schemes like ML-KEM (CRYSTALS-Kyber) and ML-DSA (CRYSTALS-Dilithium) — NIST’s standardized PQC.
- Code-based (Classic McEliece), hash-based (SPHINCS+), isogeny-based (SIKE — though SIKE was classically broken in 2022).
The NIST PQC migration is specifically about replacing the Shor-vulnerable primitives while keeping the Grover-resistant ones at twice the key length. This is tractable for most infrastructure — but non-trivial, because every TLS implementation, every signed software update chain, every hardware security module has to be audited and upgraded.
Exercises
1. Verify the order-finding
For and , compute the order classically. Verify is even and . Then compute the factors via .
Show answer
. So . Odd! Retry with another .
Try : . , even. . Hmm, actually , so — bad case, retry.
Try : . , even. . Good. , . . ✓
2. Continued fractions
Suppose QPE on with 10 counting qubits returns . Use the continued-fraction algorithm to extract candidate denominators bounded by .
Show answer
. The convergents are . The first convergent with denominator ≤ 100 is . So candidate . Verify: is ? If yes, .
In Python:
from fractions import Fraction
Fraction(0.3332).limit_denominator(100)
# Fraction(1, 3)3. Why counting qubits
Why does Shor require counting qubits rather than ?
Show answer
The phase has denominator up to . To distinguish it from neighboring fractions in continued-fraction post-processing, you need precision at least . That requires counting qubits. Shortcuts exist (Kitaev’s 1-qubit iterative QPE) but cost more repetitions.
4. Estimate RSA-1024
If RSA-2048 takes ~20M physical qubits and ~8 hours, roughly what would RSA-1024 take? (Use that factoring -bit numbers is in gates, and that RSA-1024 is already considered broken classically by well-resourced adversaries.)
Show answer
Halving roughly divides gate count by 8 and qubit count by 2. So RSA-1024: ~10M physical qubits, ~1 hour wall-clock. Still infeasible on 2026 hardware but closer — this is why RSA-1024 has been deprecated for classical reasons since ~2013 and is a “canary” target for the first credible Shor demos.
What you should take away
- Shor reduces factoring to order-finding. The quantum speedup is entirely in finding the order of via QPE.
- QPE on the modular-multiplication unitary recovers for random ; continued fractions extract .
- Real RSA-2048 attacks need ~20M physical qubits and ~8 hours wall-clock on fault-tolerant hardware. We’re 14 orders of magnitude away from that on demonstrated hardware.
- Harvest-now-decrypt-later makes this urgent despite the long timeline. Migrate to post-quantum schemes (ML-KEM, ML-DSA) now.
- Grover and Shor together define the post-quantum threat model: double symmetric key sizes, replace all public-key schemes.
This closes the Algorithms track. Next up: Variational algorithms — VQE, QAOA, and the hybrid classical/quantum paradigm that works on today’s noisy hardware, without waiting for fault tolerance.