algorithms advanced · 28 min read · April 22, 2026

Shor's Algorithm: Factoring, Order-Finding, and the End of RSA

Shor's factoring algorithm reduces integer factorization to the problem of finding the multiplicative order of a random element mod N — and uses quantum phase estimation to solve that in polynomial time. This tutorial derives the full algorithm, runs a small instance in Qiskit, and honestly assesses the real-world resource requirements to break RSA-2048.

Prerequisites: Tutorial 11: QFT and Phase Estimation

Shor’s algorithm is the reason governments take quantum computing seriously. Published in 1994 by Peter Shor at Bell Labs, it factors $n$-bit integers in time polynomial in $n$, a superpolynomial speedup over the best known classical algorithm. That threat directly undermines RSA, the public-key scheme behind much of today’s internet encryption, banking, and software signing. NIST’s post-quantum standardization (ML-KEM, ML-DSA, 2024) exists because of Shor.

This tutorial derives the algorithm carefully. It’s genuinely subtle — much more than “apply QFT, get answer.” The quantum part does one specific thing (find the multiplicative order of an element mod $N$ ), and the factoring follows from classical number theory. Getting the composition right matters.

The classical problem

Given a composite integer $N$ , find a nontrivial factor. The best known classical algorithm is the General Number Field Sieve (GNFS), with complexity

\exp\left(\left(\sqrt[3]{\tfrac{64}{9}} + o(1)\right)(\ln N)^{1/3}\,(\ln\ln N)^{2/3}\right).

For $N$ at 2048 bits (the RSA-2048 standard), GNFS is estimated to require on the order of $10^{34}$ operations — infeasible on any classical computer humanity will build this century.
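To get a feel for how the GNFS cost grows, the sketch below plugs a few modulus sizes into the exponent $\sqrt[3]{64/9}\,(\ln N)^{1/3}(\ln\ln N)^{2/3}$. Ignoring lower-order corrections is an assumption made here, so the output is a rough guide that should land within an order of magnitude or so of the estimate above. The helper name `gnfs_log10_ops` is hypothetical.

```python
from math import log


def gnfs_log10_ops(bits: int) -> float:
    """Heuristic log10 of the GNFS operation count for a `bits`-bit modulus.

    Assumption: lower-order corrections to the exponent are ignored,
    so treat the result as an order-of-magnitude guide only.
    """
    ln_n = bits * log(2)                      # ln N for N close to 2**bits
    c = (64 / 9) ** (1 / 3)                   # GNFS constant, about 1.923
    exponent = c * ln_n ** (1 / 3) * log(ln_n) ** (2 / 3)
    return exponent / log(10)                 # natural log -> log10


for bits in (512, 1024, 2048):
    print(bits, round(gnfs_log10_ops(bits), 1))
```

Doubling the bit length raises the exponent by well under a factor of two, which is the sub-exponential behaviour the formula encodes.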

Shor’s quantum algorithm runs in $O((\log N)^2 (\log\log N) (\log\log\log N))$ time when asymptotically fast integer multiplication is used, which is polynomial in the number of digits. On a sufficiently large fault-tolerant quantum computer, RSA-2048 falls in hours.

The reduction: factoring → order-finding

Shor reduces factoring to a problem in number theory that turns out to be tractable for quantum computers. Here’s the reduction.

Order of an element. For $a$ coprime to $N$ , the multiplicative order of $a$ mod $N$ is the smallest positive integer $r$ such that $a^r \equiv 1 \pmod N$ . The classical gcd and Euclidean-algorithm operations are cheap. The hard part is computing $r$ .
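As a concrete reference point, here is a minimal brute-force sketch of the definition. The function name `multiplicative_order` is chosen here for illustration. It takes up to $r$ multiplications, and $r$ can be nearly as large as $N$, so it is only usable for toy moduli; that exponential cost in the bit length is the step Shor replaces.

```python
from math import gcd


def multiplicative_order(a: int, N: int) -> int:
    """Smallest r > 0 with a**r % N == 1, found by repeated multiplication."""
    if N < 2 or gcd(a, N) != 1:
        raise ValueError("need N >= 2 and a coprime to N")
    r, power = 1, a % N
    while power != 1:
        power = (power * a) % N   # power now holds a**(r+1) mod N
        r += 1
    return r


print(multiplicative_order(7, 15))  # 4
print(multiplicative_order(2, 21))  # 6
```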

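Classical number-theoretic fact. If $r$ is even and $a^{r/2} \not\equiv -1 \pmod N$, then $\gcd(a^{r/2} - 1, N)$ and $\gcd(a^{r/2} + 1, N)$ are both nontrivial factors of $N$. (Because $a^r - 1 = (a^{r/2} - 1)(a^{r/2} + 1) \equiv 0 \pmod N$, $N$ divides the product. It divides neither factor alone: $a^{r/2} \not\equiv 1$ because $r$ is the smallest exponent that gives 1, and $a^{r/2} \not\equiv -1$ by assumption. So each gcd is a proper divisor of $N$ greater than 1.)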

Probabilistic reduction. Pick a random $a$ with $1 < a < N$ and compute $\gcd(a, N)$; if it exceeds 1, you already found a factor, so stop. Otherwise find $r$, the order of $a$. For odd $N$ that is not a prime power (both cases are easy to detect classically), with probability at least $1/2$ over the choice of $a$, $r$ is even and $a^{r/2} \not\equiv -1 \pmod N$. In that case, factor $N$ using the gcd formulas. Otherwise, try another $a$.

Everything but “find $r$ ” is classical and polynomial. The quantum speedup is entirely in the order-finding.
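The reduction can be checked exhaustively on a small modulus. The sketch below is a classical illustration only: the order is found by brute force, and the helper names `brute_force_order` and `factors_from_order` are hypothetical. It counts how many bases $a$ coprime to $N = 21$ lead to a factorization.

```python
from math import gcd


def brute_force_order(a: int, N: int) -> int:
    """Classical stand-in for order-finding; only viable for tiny N."""
    r, power = 1, a % N
    while power != 1:
        power = (power * a) % N
        r += 1
    return r


def factors_from_order(a: int, r: int, N: int):
    """Return a factor pair of N from the order r of a, or None in a bad case."""
    if r % 2 == 1:
        return None                 # odd order: a**(r/2) is not defined
    half = pow(a, r // 2, N)
    if half == N - 1:
        return None                 # a**(r/2) = -1 mod N: the gcds are trivial
    p = gcd(half - 1, N)
    return (p, N // p)


N = 21
units = [a for a in range(2, N) if gcd(a, N) == 1]
good = [a for a in units if factors_from_order(a, brute_force_order(a, N), N)]
print(len(good), "of", len(units), "choices of a succeed:", good)
# 6 of 11 choices of a succeed: [2, 8, 10, 11, 13, 19]
```

Here 6 of the 11 usable bases work, consistent with the bound of at least $1/2$.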

Order-finding as phase estimation

Define the unitary $U_a$ acting on $\lceil \log_2 N \rceil$ qubits:

U_a\,|x\rangle = |ax \pmod N\rangle.

(We extend $U_a$ to act as identity on $|x\rangle$ for $x \geq N$ , making it unitary on the full $2^{\lceil \log N \rceil}$ -dim space.)

Two key facts:

$U_a$’s eigenvalues are $r$-th roots of unity, because $U_a^r = I$ (as an operator, $U_a$ has order $r$).
Its eigenvectors are related to the roots of unity. Specifically, for each $k \in \{0, 1, ..., r-1\}$ , define

|u_k\rangle = \frac{1}{\sqrt{r}}\sum_{j=0}^{r-1} e^{-2\pi i k j / r}\,|a^j \bmod N\rangle.

Then $U_a|u_k\rangle = e^{2\pi i k / r}|u_k\rangle$ .

QPE on $U_a$ with eigenvector $|u_k\rangle$ yields $\varphi = k/r$ , from which $r$ can be recovered classically.
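The eigenvalue relation is easy to confirm numerically for the toy case $N = 15$, $a = 7$, $r = 4$. The sketch below uses plain Python lists as state vectors on 4 qubits. The function names are chosen here for illustration, and nothing in it is quantum: it only checks the linear algebra.

```python
import cmath

N, a, r = 15, 7, 4
DIM = 16  # 4 qubits


def apply_U(state):
    """Apply U_a: |x> -> |a*x mod N> for x < N, identity on x >= N."""
    out = [0j] * DIM
    for x, amp in enumerate(state):
        y = (a * x) % N if x < N else x
        out[y] += amp
    return out


def u(k: int):
    """The state |u_k> = r^(-1/2) * sum_j exp(-2*pi*i*k*j/r) |a^j mod N>."""
    state = [0j] * DIM
    for j in range(r):
        state[pow(a, j, N)] += cmath.exp(-2j * cmath.pi * k * j / r) / r ** 0.5
    return state


def eigen_residual(k: int) -> float:
    """Largest entry of U_a|u_k> - exp(2*pi*i*k/r)|u_k> in absolute value."""
    phase = cmath.exp(2j * cmath.pi * k / r)
    return max(abs(x - phase * y) for x, y in zip(apply_U(u(k)), u(k)))


print(all(eigen_residual(k) < 1e-12 for k in range(r)))  # True
```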

The superposition trick

We can’t easily prepare $|u_k\rangle$, because we don’t know $r$ or $k$. But there is a useful identity:

\frac{1}{\sqrt{r}}\sum_{k=0}^{r-1}|u_k\rangle \;=\; |1\rangle.

(Expand the sum; the $k$ -sum projects to $j = 0$ .) So if you prepare $|1\rangle$ in the eigenvector register and run QPE, you effectively run QPE on a uniform superposition of all $|u_k\rangle$ . The counting-register measurement returns a random $\varphi \approx k/r$ for a uniformly random $k \in \{0, ..., r-1\}$ .
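Written out, the expansion goes as follows. The only extra ingredient is the standard geometric-sum identity $\sum_{k=0}^{r-1} e^{-2\pi i k j / r} = r\,\delta_{j,0}$ for $0 \le j < r$.

```latex
\frac{1}{\sqrt{r}}\sum_{k=0}^{r-1}|u_k\rangle
  = \frac{1}{r}\sum_{j=0}^{r-1}\left(\sum_{k=0}^{r-1} e^{-2\pi i k j / r}\right)|a^j \bmod N\rangle
  = \frac{1}{r}\sum_{j=0}^{r-1} r\,\delta_{j,0}\,|a^j \bmod N\rangle
  = |a^0 \bmod N\rangle
  = |1\rangle.
```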

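From $k/r$ to $r$: continued fractions. If the measurement returns $\varphi$, expand it as a continued fraction and take the convergent $p/q$ closest to $\varphi$ with $q < N$. When the estimate is accurate, $p/q$ is $k/r$ in lowest terms, so $q = r/\gcd(k, r)$: it equals $r$ when $k$ happens to be coprime to $r$, and is a proper divisor of $r$ otherwise. If $a^q \not\equiv 1 \pmod N$, try again.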
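A minimal sketch of this post-processing step, assuming the counting register is read out as an integer `measured` out of `2**n_count`. The helper names `convergents` and `recover_order` are hypothetical. It walks the convergents of the measured fraction and returns the first denominator below $N$ that really is an exponent sending $a$ to 1.

```python
from fractions import Fraction


def convergents(x: Fraction):
    """Yield the continued-fraction convergents p/q of a rational x, in order."""
    n, d = x.numerator, x.denominator
    p0, q0, p1, q1 = 0, 1, 1, 0
    while d:
        t = n // d                                   # next partial quotient
        p0, q0, p1, q1 = p1, q1, t * p1 + p0, t * q1 + q0
        yield Fraction(p1, q1)
        n, d = d, n - t * d


def recover_order(measured: int, n_count: int, a: int, N: int):
    """Return a denominator q < N with a**q = 1 (mod N), or None if none fits."""
    phi = Fraction(measured, 2 ** n_count)
    for c in convergents(phi):
        q = c.denominator
        if q < N and pow(a, q, N) == 1:
            return q
    return None


# N = 15, a = 7, 8 counting qubits: a readout of 192 means phi = 192/256 = 3/4.
print(recover_order(192, 8, 7, 15))  # 4
# A readout of 128 means phi = 1/2, i.e. k = 2 shares a factor with r = 4.
print(recover_order(128, 8, 7, 15))  # None
```

The second call shows the divisor case: the convergent has denominator 2, which fails the check, so the run is repeated.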

The full algorithm

1. Pick random $a \in \{2, ..., N-1\}$ with $\gcd(a, N) = 1$.
2. Use quantum phase estimation on $U_a$ with input state $|1\rangle$ and $2 \lceil \log N \rceil$ counting qubits.
3. Measure the counting register; get a binary fraction $\varphi$.
4. Run continued-fraction expansion to extract $r$.
5. Verify classically that $a^r \equiv 1 \pmod N$. If not, go back to step 2.
6. If $r$ is odd or $a^{r/2} \equiv -1 \pmod N$, go back to step 1.
7. Compute $\gcd(a^{r/2} \pm 1, N)$. Return the nontrivial factors.

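Total quantum queries expected: a small number of repetitions of step 2. A single run returns a $k$ coprime to $r$ with probability $\Omega(1/\log\log N)$, and combining the denominators from a couple of runs (for example via their least common multiple) brings the expected count down to $O(1)$. Total classical work: polynomial in $\log N$. Total circuit depth per quantum query: dominated by the controlled modular exponentiations, roughly $O((\log N)^3)$ with schoolbook arithmetic.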
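The control flow of the steps above can be exercised without any quantum hardware by swapping the quantum subroutine for a classical stand-in. In the sketch below, `ideal_qpe_readout` is a hypothetical mock that cheats: it computes the order by brute force and returns the counting-register value an idealized, noise-free QPE would most likely produce for a random $k$. Everything else follows the numbered steps.

```python
import random
from fractions import Fraction
from math import gcd


def brute_force_order(a: int, N: int) -> int:
    """Smallest r > 0 with a**r % N == 1 (exponential time; toy sizes only)."""
    r, power = 1, a % N
    while power != 1:
        power = (power * a) % N
        r += 1
    return r


def ideal_qpe_readout(a: int, N: int, n_count: int, rng: random.Random) -> Fraction:
    """Mock of steps 2-3: the n_count-bit fraction nearest k/r for a random k."""
    r = brute_force_order(a, N)
    k = rng.randrange(r)
    scale = 2 ** n_count
    nearest = (2 * k * scale + r) // (2 * r)      # integer rounding of k*scale/r
    return Fraction(nearest, scale)


def shor_mock(N: int, rng: random.Random, max_tries: int = 200):
    n_count = 2 * (N - 1).bit_length()            # 2 * ceil(log2 N) counting bits
    for _ in range(max_tries):
        a = rng.randrange(2, N)                   # step 1
        g = gcd(a, N)
        if g > 1:                                 # lucky draw: a shares a factor
            return tuple(sorted((g, N // g)))
        phi = ideal_qpe_readout(a, N, n_count, rng)       # steps 2-3 (mocked)
        r = phi.limit_denominator(N - 1).denominator      # step 4
        if pow(a, r, N) != 1:                     # step 5 (this sketch redraws a)
            continue
        half = pow(a, r // 2, N)
        if r % 2 == 1 or half == N - 1:           # step 6
            continue
        p = gcd(half - 1, N)                      # step 7
        return tuple(sorted((p, N // p)))
    return None


print(shor_mock(21, random.Random(0)))  # (3, 7)
```

One simplification to note: when step 5 fails, the sketch draws a fresh $a$ instead of repeating step 2 with the same one, which keeps the loop short at the cost of a few extra iterations.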

Example: factor $N = 15$

$N = 15 = 3 \times 5$ . Pick $a = 7$ (coprime to 15). Order of 7 mod 15: $7^1 = 7, 7^2 = 49 \equiv 4, 7^3 \equiv 28 \equiv 13, 7^4 \equiv 91 \equiv 1$ . So $r = 4$ (even). $a^{r/2} = 7^2 = 49 \equiv 4$ . Not $-1 \pmod{15}$ . Great.

$\gcd(4 - 1, 15) = \gcd(3, 15) = 3$ . $\gcd(4 + 1, 15) = \gcd(5, 15) = 5$ . $15 = 3 \times 5$ . Done.

For the quantum part, we need to implement controlled- $U_a^{2^k}$ gates on 4 target qubits ( $\lceil \log_2 15 \rceil = 4$ ). For this small instance, we can hardcode the multiplication circuits. For general $N$ we’d need a proper modular-exponentiation circuit, which is the expensive part.

Qiskit implementation (toy scale)

Because the general modular exponentiation is complex to build from gates, and Qiskit’s didactic Shor class was removed in newer versions, we’ll use the manual QPE approach with a hand-built $U_7 \bmod 15$ circuit. This is not scalable — but it’s honest about what the real algorithm does.

import numpy as np
from fractions import Fraction
from math import gcd
from qiskit import QuantumCircuit, transpile
from qiskit_aer import AerSimulator
from qiskit.circuit.library import QFT


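def c_amod15(a: int, power: int) -> QuantumCircuit:
    """Controlled multiplication by a^power mod 15 on 4 target qubits.

    Hardcoded for a ∈ {2, 4, 7, 8, 11, 13}. Multiplying by 2, 4, or 8 mod 15
    is a cyclic rotation of the 4 bits, and multiplying by -1 flips every bit.
    The permutation is correct on the states 1..14 that the algorithm visits.
    """
    if a not in (2, 4, 7, 8, 11, 13):
        raise ValueError("a must be one of 2, 4, 7, 8, 11, 13.")

    U = QuantumCircuit(4)
    for _ in range(power):  # apply "multiply by a" `power` times
        if a in (2, 13):  # rotate bits left: x -> 2x mod 15
            U.swap(2, 3); U.swap(1, 2); U.swap(0, 1)
        if a in (7, 8):  # rotate bits right: x -> 8x mod 15
            U.swap(0, 1); U.swap(1, 2); U.swap(2, 3)
        if a in (4, 11):  # rotate by two places: x -> 4x mod 15
            U.swap(1, 3); U.swap(0, 2)
        if a in (7, 11, 13):  # flip all bits: x -> -x mod 15 (7 = -8, 11 = -4, 13 = -2)
            for q in range(4): U.x(q)
    controlled = U.to_gate(label=f"{a}^{power} mod 15").control(1)
    return controlled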


def qpe_order_finding(a: int, N: int = 15, n_count: int = 8) -> QuantumCircuit:
    if N != 15:
        raise ValueError("This toy circuit hardcodes multiplication mod 15.")
    qc = QuantumCircuit(n_count + 4, n_count)
    # Counting register into superposition
    qc.h(range(n_count))
    # Target register starts |0001⟩ = |1⟩
    qc.x(n_count)
    # Controlled powers of U_a
    for k in range(n_count):
        qc.append(c_amod15(a, 2 ** k), [k, *range(n_count, n_count + 4)])
    # Inverse QFT on counting register
    qc.append(QFT(n_count, inverse=True, do_swaps=True), range(n_count))
    qc.measure(range(n_count), range(n_count))
    return qc


def shor_factor(N: int, attempts: int = 10) -> tuple[int, int] | None:
    """Toy driver: only N = 15 is supported, because c_amod15 is hardcoded."""
    if N % 2 == 0:
        return (2, N // 2)
    sim = AerSimulator()
    for _ in range(attempts):
        a = int(np.random.choice([2, 4, 7, 8, 11, 13]))
        g = gcd(a, N)
        if g > 1:
            p, q = sorted((g, N // g))
            return (p, q)

        qc = qpe_order_finding(a, N)
        tqc = transpile(qc, sim)
        # One shot corresponds to one run of the quantum subroutine
        counts = sim.run(tqc, shots=1).result().get_counts()
        bits = next(iter(counts))
        phi = Fraction(int(bits, 2), 2 ** len(bits))  # exact binary fraction
        r = phi.limit_denominator(N - 1).denominator
        # Step 5: reject candidates that are not an order of a
        if pow(a, r, N) != 1: continue
        # Step 6: bad cases (odd r, or a^(r/2) = -1 mod N)
        if r % 2 != 0: continue
        candidate = pow(a, r // 2, N)
        if candidate == N - 1: continue
        # Step 7: the gcd is guaranteed nontrivial at this point
        p, q = sorted((gcd(candidate - 1, N), N // gcd(candidate - 1, N)))
        return (p, q)
    return None


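np.random.seed(42)  # fixes the choice of a; the simulator shot is still random
factors = shor_factor(15)
if factors is None:
    print("No factors found; rerun with more attempts.")
else:
    print(f"15 = {factors[0]} × {factors[1]}")
# Typical output: 15 = 3 × 5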

The circuit has $8 + 4 = 12$ qubits and a few hundred gates after transpilation. On a real IBM Quantum free-tier machine, results are noisy but recognizable — you’ll often need several repetitions for a clean answer. On a simulator, success is near-certain.
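Before trusting the hand-built circuit, it is worth confirming that the swap-and-flip pattern really is multiplication mod 15. The check below is a purely classical sketch that replays the same pattern on bit lists, with qubit 0 as the least significant bit. The function name `multiply_via_swaps` is chosen here for illustration, and no Qiskit is needed.

```python
def multiply_via_swaps(a: int, x: int) -> int:
    """Replay the swap/X pattern of c_amod15 on the 4-bit integer x."""
    bits = [(x >> i) & 1 for i in range(4)]   # bits[i] is the value of qubit i

    def swap(i: int, j: int) -> None:
        bits[i], bits[j] = bits[j], bits[i]

    if a in (2, 13):
        swap(2, 3)
        swap(1, 2)
        swap(0, 1)
    if a in (7, 8):
        swap(0, 1)
        swap(1, 2)
        swap(2, 3)
    if a in (4, 11):
        swap(1, 3)
        swap(0, 2)
    if a in (7, 11, 13):
        for i in range(4):
            bits[i] ^= 1                      # X gate on every qubit
    return sum(b << i for i, b in enumerate(bits))


ok = all(
    multiply_via_swaps(a, x) == (a * x) % 15
    for a in (2, 4, 7, 8, 11, 13)
    for x in range(1, 15)
)
print(ok)  # True
```

The check covers $x = 1, \dots, 14$ only. For the bit-flipping bases the pattern exchanges 0 and 15, which is harmless because the target register starts in $|1\rangle$ and never leaves the orbit of 1.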

What it would take to factor RSA-2048

Resource estimates from peer-reviewed work (Gidney & Ekerå 2021 is the standard reference for the ~20 million qubit, ~8 hour figures):

Resource	Estimate
Logical qubits	~4,100
Physical qubits (assuming surface code at $10^{-3}$ gate error)	~20 million
Wall-clock time	~8 hours
Total Toffoli-equivalent operations	~ $3 \times 10^9$
Total T-gates (post-distillation)	~ $10^{10}$

As of 2026, the largest known Shor-type factoring demonstration on real hardware is factoring 21 (Wang et al., 2024) using hybrid variational methods on a 5-qubit device; pure Shor demonstrations have been stuck at $N = 15$ since Vandersypen et al. 2001. The gap between “15” and “2048-bit” is roughly six orders of magnitude in physical qubit count alone (under ten qubits versus ~20 million), before accounting for the error rates those qubits must reach.

Google’s Willow (105 physical qubits, Dec 2024) is the first chip to demonstrate below-threshold surface-code error correction. IBM’s roadmap (Starling, 200 logical qubits by 2029, and Blue Jay, 2033) is among the first credible paths toward thousands of logical qubits. Expert estimates for when RSA-2048 falls vary widely; a commonly cited range is 2030 to 2040.

What post-quantum cryptography actually changes

Shor breaks RSA, DSA, ECDSA, Diffie-Hellman, and all integer-factoring or discrete-log-based public-key cryptography. It does not break:

Symmetric ciphers like AES-256, ChaCha20 (Grover gives a quadratic, not exponential, speedup — doubling key size restores security).
Hash functions like SHA-256, SHA-3 (Grover quadratic speedup — use 256-bit hashes for 128-bit post-quantum security).
Lattice-based schemes like ML-KEM (CRYSTALS-Kyber) and ML-DSA (CRYSTALS-Dilithium) — NIST’s standardized PQC.
Code-based (Classic McEliece), hash-based (SPHINCS+), isogeny-based (SIKE — though SIKE was classically broken in 2022).

The NIST PQC migration is specifically about replacing the Shor-vulnerable primitives while keeping the Grover-resistant ones at twice the key length. This is tractable for most infrastructure — but non-trivial, because every TLS implementation, every signed software update chain, every hardware security module has to be audited and upgraded.

Exercises

1. Verify the order-finding

For $N = 21$ and $a = 4$, compute the order $r$ classically. Check whether $r$ is even and $a^{r/2} \not\equiv -1$; if either condition fails, try another $a$. Then compute the factors via $\gcd(a^{r/2} \pm 1, N)$.

Show answer

$4^1 = 4, 4^2 = 16, 4^3 = 64 \equiv 1 \pmod{21}$ . So $r = 3$ . Odd! Retry with another $a$ .

Try $a = 5$: $5^1 = 5, 5^2 = 25 \equiv 4, 5^3 \equiv 20, 5^4 \equiv 100 \equiv 16, 5^5 \equiv 80 \equiv 17, 5^6 \equiv 85 \equiv 1$. $r = 6$, even. But $5^3 \equiv 20 \equiv -1 \pmod{21}$, so $a^{r/2} \equiv -1$: bad case, retry.

Try $a = 2$ : $2^1 = 2, 2^2 = 4, 2^3 = 8, 2^4 = 16, 2^5 = 32 \equiv 11, 2^6 = 22 \equiv 1$ . $r = 6$ , even. $2^3 = 8 \neq 20$ . Good. $\gcd(8 - 1, 21) = 7$ , $\gcd(8 + 1, 21) = 3$ . $21 = 3 \times 7$ . ✓

2. Continued fractions

Suppose QPE on $U_a \pmod N$ with 10 counting qubits returns $\varphi = 0.3332$ . Use the continued-fraction algorithm to extract candidate denominators $r$ bounded by $N = 100$ .

Show answer

$0.3332 = 833/2500 = [0; 3, 833]$. The convergents are $0, 1/3, 833/2500$. The last convergent with denominator ≤ 100 is $1/3$. So candidate $r = 3$. Verify: is $a^3 \equiv 1 \pmod N$? If yes, $r = 3$.

In Python:

from fractions import Fraction
Fraction(0.3332).limit_denominator(100)
# Fraction(1, 3)

3. Why $2n$ counting qubits

Why does Shor require $2\lceil \log N \rceil$ counting qubits rather than $\lceil \log N \rceil$ ?

Show answer

The phase $\varphi = k/r$ has denominator $r < N$. For continued-fraction post-processing to single out $k/r$, the estimate must lie within $1/(2r^2)$ of it, and a precision of $1/(2N^2)$ always suffices. That requires $\lceil \log_2(2N^2) \rceil \le 2\lceil \log_2 N \rceil + 1$ counting qubits, which is where the factor of two comes from. Shortcuts exist (Kitaev’s 1-qubit iterative QPE) but cost more repetitions.
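The bound is quick to tabulate. The sketch below (with the hypothetical helper name `counting_qubits_needed`) computes the smallest $t$ with $2^t \ge 2N^2$ using exact integer arithmetic and compares it with $2\lceil \log_2 N \rceil + 1$.

```python
def counting_qubits_needed(N: int) -> int:
    """Smallest t with 2**t >= 2 * N**2, so the phase grid spacing is at most 1/(2 N^2)."""
    return (2 * N * N - 1).bit_length()


def modulus_bits(N: int) -> int:
    """ceil(log2 N) for N >= 2, computed without floating point."""
    return (N - 1).bit_length()


for N in (15, 21):
    print(N, counting_qubits_needed(N), 2 * modulus_bits(N) + 1)
# 15 9 9
# 21 10 11
```

The toy circuit earlier gets away with 8 counting qubits for $N = 15$ because every order there is 2 or 4, so $k/r$ is an exact binary fraction and no rounding error arises.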

4. Estimate RSA-1024

If RSA-2048 takes ~20M physical qubits and ~8 hours, roughly what would RSA-1024 take? (Use that factoring $n$-bit numbers is $O(n^3)$ in gates, and that RSA-1024 is already considered within reach of well-resourced classical adversaries.)

Show answer

Halving $n$ roughly divides gate count by 8 and qubit count by 2. So RSA-1024: ~10M physical qubits, ~1 hour wall-clock. Still infeasible on 2026 hardware but closer — this is why RSA-1024 has been deprecated for classical reasons since ~2013 and is a “canary” target for the first credible Shor demos.

What you should take away

Shor reduces factoring to order-finding. The quantum speedup is entirely in finding the order of $a \pmod N$ via QPE.
QPE on the modular-multiplication unitary recovers $k/r$ for random $k$ ; continued fractions extract $r$ .
Real RSA-2048 attacks need ~20M physical qubits and ~8 hours wall-clock on fault-tolerant hardware. Demonstrated Shor hardware is roughly six orders of magnitude short of that in qubit count alone.
Harvest-now-decrypt-later makes this urgent despite the long timeline. Migrate to post-quantum schemes (ML-KEM, ML-DSA) now.
Grover and Shor together define the post-quantum threat model: double symmetric key sizes, replace all public-key schemes.

This closes the Algorithms track. Next up: Variational algorithms — VQE, QAOA, and the hybrid classical/quantum paradigm that works on today’s noisy hardware, without waiting for fault tolerance.
