Quantum Outpost
Gates and circuits · Intermediate · 24 min read

OpenQASM 3 and Your First Real Hardware Run

Qiskit circuits are a convenience. OpenQASM 3 is the portable assembly language underneath, the form in which circuits travel between tools and vendors. This tutorial walks through the OpenQASM 3 syntax that matters, IBM Quantum's free tier, transpilation, and how to interpret noisy results honestly on your first real-hardware run.

Prerequisites: Tutorial 6: Multi-Qubit Gates

Any Qiskit QuantumCircuit can be serialized to OpenQASM 3 and handed to tooling outside Qiskit. QASM is the portable assembly language for quantum: readable by humans, writable by machines, and vendor-neutral across IBM, IonQ, Quantinuum, and anything else that takes the standard. Knowing QASM 3 is the difference between “I wrote a quantum program” and “I understand what the hardware is actually doing.”

This tutorial teaches you the subset of QASM 3 you’ll actually use, then walks you through running your first circuit on IBM Quantum’s free tier, inspecting calibration data, and interpreting the results without deceiving yourself.

Why QASM 3 (not QASM 2 or anything else)

Historical context in one minute: OpenQASM 2 (2017) was the original standard. It got you far, but it offered only a single-statement if on a whole classical register, with no loops, no typed classical variables, no timing control, and no custom calibrations. (It did already have parameterized gate definitions.) QASM 3 (stabilized 2021, hardware-adopted 2023–2024) fills those gaps. The newer features on IBM’s Runtime primitives (dynamic circuits, mid-circuit measurement with branching, classical registers as first-class values) are expressed in QASM 3.

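IonQ, Rigetti, Quantinuum, and most simulators accept OpenQASM input in some form, though support for the newer version 3 features varies by vendor, so check each platform’s documentation. If you want your circuits to be portable across vendors, OpenQASM 3 is the interchange format.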

A complete QASM 3 program

Here is the Bell-state circuit in OpenQASM 3:

OPENQASM 3.0;
include "stdgates.inc";

qubit[2] q;
bit[2]   c;

h q[0];
cx q[0], q[1];
c = measure q;

Every piece is worth knowing:

  • OPENQASM 3.0; — version declaration; always line 1.
  • include "stdgates.inc"; — pulls in standard gate definitions (H, CX, X, Y, Z, Rx, Ry, Rz, …). Without this you’d have to define h yourself, which is rarely what you want.
  • qubit[2] q; — declare a quantum register named q with 2 qubits.
  • bit[2] c; — classical register of 2 bits, the measurement destination.
  • h q[0]; — Hadamard on qubit 0.
  • cx q[0], q[1]; — CNOT, control = qubit 0, target = qubit 1.
  • c = measure q; — measure the full register; results go into classical bits.
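
Because a QASM 3 program is plain text, it is easy to generate. The following is a minimal sketch using only the Python standard library. The helper name ghz_qasm3 is hypothetical (it is not part of Qiskit or any OpenQASM tooling); it emits the same statements explained above, with a chain of CNOTs, so n = 2 reproduces the Bell circuit.

```python
def ghz_qasm3(n: int) -> str:
    """Return OpenQASM 3 source for an n-qubit GHZ circuit (n = 2 is the Bell state)."""
    if n < 2:
        raise ValueError("need at least 2 qubits")
    lines = [
        "OPENQASM 3.0;",               # version declaration
        'include "stdgates.inc";',     # standard gate definitions (h, cx, ...)
        "",
        f"qubit[{n}] q;",              # quantum register
        f"bit[{n}] c;",                # classical register for the results
        "",
        "h q[0];",                     # superposition on qubit 0
    ]
    # Chain of CNOTs: each qubit is entangled with its neighbour.
    lines += [f"cx q[{i}], q[{i + 1}];" for i in range(n - 1)]
    lines.append("c = measure q;")     # measure the whole register
    return "\n".join(lines) + "\n"


print(ghz_qasm3(2))
```

Calling ghz_qasm3(3) adds one more cx line and widens both registers, which is a quick way to produce test inputs for other tools.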

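Things QASM 2 could not express, or expressed only in a limited form:

  • Real-time control flow. if (c == 1) { x q[0]; } branches on a classical measurement result mid-circuit. QASM 2 had only a single-statement if; QASM 3 adds block bodies, else, and for and while loops.
  • Runtime parameters. QASM 2 already allowed parameterized gate definitions such as gate rx(θ) q { ... }; QASM 3 adds input declarations such as input float[64] theta; so a value can be supplied when the program is run.
  • Classical integer arithmetic. int[32] n; n = 5; declares a typed classical variable, so you can do classical computation in the middle of a program.
  • Timing control. delay[100ns] q[0]; inserts a deterministic delay, useful for decoherence studies.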

Exporting and importing in Qiskit

Round-trip a Qiskit circuit through QASM 3:

from qiskit import QuantumCircuit, qasm3

qc = QuantumCircuit(2, 2)
qc.h(0)
qc.cx(0, 1)
qc.measure([0, 1], [0, 1])

# Qiskit → QASM 3 string
qasm_src = qasm3.dumps(qc)
print(qasm_src)
# OPENQASM 3.0;
# include "stdgates.inc";
# bit[2] c;
# qubit[2] q;
# h q[0];
# cx q[0], q[1];
# c[0] = measure q[0];
# c[1] = measure q[1];

# QASM 3 string → Qiskit (requires the optional qiskit-qasm3-import package)
qc_restored = qasm3.loads(qasm_src)
assert qc_restored.data == qc.data

For vendor interoperability, dump to QASM 3, pipe into a non-Qiskit tool, and load back. This is how benchmarks like MQT Bench and QASMBench stay language-neutral.

Getting an IBM Quantum account

Free tier: 10 minutes of quantum compute per month, which goes a long way for learning: a small circuit executes in microseconds per shot, although reset and readout overhead add to the metered time. Sign-up flow:

  1. Go to quantum.cloud.ibm.com (the current IBM Quantum Platform address; the older quantum.ibm.com site has been retired) and create a free account.
  2. On your dashboard, copy your API token. Store it somewhere safe — it’s the credential you’ll paste below.
  3. Note which open plan backends are available. At the time of writing these include ibm_brisbane, ibm_kyiv, and ibm_sherbrooke (127 qubits each, Eagle r3 processors).

Save credentials once

from qiskit_ibm_runtime import QiskitRuntimeService

QiskitRuntimeService.save_account(
    channel="ibm_quantum_platform",
    token="YOUR_TOKEN_HERE",                # paste once, then delete from source
    instance="YOUR_INSTANCE_CRN",           # CRN of your open-plan instance (shown on the dashboard)
    overwrite=True,
)

After this, the token lives in ~/.qiskit/qiskit-ibm.json and every subsequent QiskitRuntimeService() call loads it automatically — no more pasting.

Pick a backend and inspect it

from qiskit_ibm_runtime import QiskitRuntimeService

service = QiskitRuntimeService()

# List operational backends
for b in service.backends(operational=True, simulator=False):
    print(f"{b.name:24s}  qubits={b.num_qubits}  queue={b.status().pending_jobs}")
# ibm_brisbane             qubits=127  queue=12
# ibm_sherbrooke           qubits=127  queue=41
# ibm_kyiv                 qubits=127  queue=3     ← pick this one

backend = service.backend("ibm_kyiv")

# Calibration snapshot
props = backend.properties()
q0 = props.qubit_property(0)
print(f"Qubit 0: T1 = {q0['T1'][0]*1e6:.1f} µs, T2 = {q0['T2'][0]*1e6:.1f} µs, "
      f"readout error = {q0['readout_error'][0]:.2%}")
# Qubit 0: T1 = 185.3 µs, T2 = 132.5 µs, readout error = 0.73%

Every real-hardware run should start with a calibration check. $T_1$ (energy relaxation time) and $T_2$ (dephasing time) tell you how long the qubit holds quantum information. Readout error tells you how reliably a measurement turns the physical state back into a classical bit. These numbers drift from hour to hour, so don’t assume Tuesday’s calibrations apply on Wednesday.
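
To turn these numbers into intuition, here is a rough sketch that assumes simple exponential decay, exp(-t/T). That is a common first-order picture, not a full noise model. The function name survival and the two durations are illustrative choices; the T1 and T2 values are the ones from the calibration printout above.

```python
import math


def survival(duration_s: float, t_const_s: float) -> float:
    """Fraction remaining after duration_s under exp(-t / T) decay."""
    return math.exp(-duration_s / t_const_s)


T1 = 185.3e-6   # seconds, from the calibration printout above
T2 = 132.5e-6   # seconds

# Compare a short circuit (about 1 µs) with a long idle period (100 µs).
for duration in (1e-6, 100e-6):
    print(f"{duration * 1e6:6.1f} µs: "
          f"T1 survival = {survival(duration, T1):.3f}, "
          f"T2 coherence = {survival(duration, T2):.3f}")
```

Under this model a 1 µs circuit loses well under 1% to relaxation, while 100 µs of idling costs roughly half the coherence. That is why circuit duration, not just gate count, matters.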

Transpile for the hardware

Before you can run on ibm_kyiv, Qiskit has to rewrite your circuit in terms of the hardware’s native gate set (ecr, id, rz, sx, x on Eagle r3) and respect its qubit connectivity. This is transpilation.

from qiskit import QuantumCircuit, transpile

qc = QuantumCircuit(2, 2)
qc.h(0)
qc.cx(0, 1)
qc.measure([0, 1], [0, 1])

# Transpile for the specific backend, at optimization level 3 (the most aggressive)
tqc = transpile(qc, backend=backend, optimization_level=3)
print(f"Original gates: {dict(qc.count_ops())}")
print(f"Transpiled:     {dict(tqc.count_ops())}")
# Original gates: {'h': 1, 'cx': 1, 'measure': 2}
# Transpiled:     {'rz': 2, 'sx': 2, 'ecr': 1, 'measure': 2}

ecr is the Echoed Cross-Resonance gate, IBM’s native two-qubit gate. Every CNOT in your circuit becomes one ecr plus a few single-qubit rotations. On a deeper circuit with limited connectivity, SWAP chains would appear — the transpiler inserts them as needed to route qubits together.
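
To see where rz and sx gates in the transpiled count can come from, one known identity is H = Rz(π/2) · SX · Rz(π/2), up to a global phase. The check below verifies it numerically with plain Python complex arithmetic. It is a sketch of one valid decomposition; the transpiler may pick a different but equivalent sequence.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rz(theta):
    """Rz(theta) = diag(exp(-i*theta/2), exp(+i*theta/2))."""
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]


# sqrt(X) gate and Hadamard as explicit matrices
SX = [[(1 + 1j) / 2, (1 - 1j) / 2],
      [(1 - 1j) / 2, (1 + 1j) / 2]]
H = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]

# Candidate decomposition: Rz(pi/2) . SX . Rz(pi/2)
candidate = matmul(rz(math.pi / 2), matmul(SX, rz(math.pi / 2)))

# Read the global phase off the top-left entry, then compare every entry with H.
phase = candidate[0][0] / H[0][0]
same = all(abs(candidate[i][j] - phase * H[i][j]) < 1e-12
           for i in range(2) for j in range(2))
print("equal up to global phase:", same, "| phase magnitude:", abs(phase))
```

A global phase has no observable effect, so the hardware can run this sequence wherever the circuit asks for H.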

Inspect which physical qubits your logical qubits got mapped to:

print(tqc.layout.initial_layout)
# Layout: mapping from logical qubits in the circuit to physical qubits on chip
# e.g., logical 0 -> physical 14, logical 1 -> physical 13

At optimization levels above 0, the transpiler’s layout stage tries to pick connected qubits with low reported error rates. That’s usually what you want.
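
Layout matters because a two-qubit gate can only run between physically connected qubits; otherwise the router has to insert SWAPs first. The sketch below gives a naive estimate of that cost: shortest-path distance minus one, found by breadth-first search. The helper name swaps_needed and the 5-qubit line are made up for illustration (this is not a real IBM coupling map), and real transpilers use smarter heuristics.

```python
from collections import deque


def swaps_needed(coupling, a, b):
    """Naive SWAP count to bring physical qubits a and b next to each other.

    coupling: iterable of undirected edges (u, v).
    Returns the shortest-path distance minus 1, found by breadth-first search.
    """
    neighbours = {}
    for u, v in coupling:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    dist = {a: 0}
    queue = deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return max(dist[node] - 1, 0)
        for nxt in neighbours.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    raise ValueError("qubits are not connected")


line = [(i, i + 1) for i in range(4)]   # hypothetical chain 0-1-2-3-4
print("adjacent qubits 0,1:", swaps_needed(line, 0, 1))
print("far ends 0,4:      ", swaps_needed(line, 0, 4))
```

Each SWAP decomposes into three CNOTs, so a poor layout on a sparse device can multiply the two-qubit gate count quickly.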

Run the job

from qiskit_ibm_runtime import SamplerV2 as Sampler

sampler = Sampler(mode=backend)
job = sampler.run([tqc], shots=4096)
print("Job ID:", job.job_id())
# Come back in a few minutes when the queue clears
result = job.result()

pub_result = result[0]
counts = pub_result.data.c.get_counts()
print(counts)
# {'00': 1827, '11': 1803, '01': 239, '10': 227}

Two things to notice immediately:

  • 00 and 11 are dominant. That’s the Bell-state signature: perfect correlation between the two qubits.
  • 01 and 10 are not zero. On a perfect quantum computer they would be exactly zero. On real hardware they appear because of (a) gate errors during H and CX, (b) decoherence during the ~1 µs circuit execution, and (c) readout misclassification.

The fraction $(01 + 10)/\text{total}$ on this run is about 11%, higher than you’d like but not unusual for an early free-tier run on a shared machine. For comparison, on Quantinuum H2 you’d see under 1%.
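
A fraction without an error bar is hard to compare between runs. As a small sketch, the off-diagonal fraction and its binomial standard error can be computed from the counts printed above; the variable names are illustrative.

```python
import math

counts = {"00": 1827, "11": 1803, "01": 239, "10": 227}   # from the run above

total = sum(counts.values())
odd = counts.get("01", 0) + counts.get("10", 0)   # shots that broke the correlation
p = odd / total
stderr = math.sqrt(p * (1 - p) / total)           # binomial standard error

print(f"off-diagonal fraction = {p:.2%} ± {stderr:.2%} (1 sigma)")
```

For these counts the fraction is about 11.4% with a standard error of about 0.5%, so the shot noise is far smaller than the effect itself. The nonzero 01 and 10 counts come from the hardware, not from sampling.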

Reading out error rates honestly

How do you distinguish “the algorithm is working but the hardware is noisy” from “the algorithm is wrong”? Three disciplines, in order of effort:

  1. Run on a simulator first. Always. AerSimulator() gives you the ideal answer; if your circuit doesn’t produce Bell-state correlations in simulation, no hardware run will save you.

    from qiskit_aer import AerSimulator
    sim = AerSimulator()
    ideal = sim.run(qc, shots=4096).result().get_counts()  # simulate the logical circuit, not the 127-qubit transpiled one
    print(ideal)
    # {'00': 2053, '11': 2043}   ← exactly 0 on '01' and '10'

  2. Measurement-error mitigation. A post-processing step that inverts the confusion matrix between prepared and reported bit strings. Qiskit Runtime builds this in for expectation values (resilience.measure_mitigation on EstimatorV2); SamplerV2 exposes no resilience options, so for raw counts you apply the correction yourself afterwards. Do it for any quantitative claim.

  3. Dynamical decoupling. Insert identity-preserving pulses during idle time to fight slow drifts. sampler.options.dynamical_decoupling.enable = True. Costs a bit of extra time; usually worth it.

from qiskit_ibm_runtime import SamplerV2 as Sampler

sampler = Sampler(mode=backend)
# SamplerV2 has no resilience options; apply readout mitigation to the counts afterwards
sampler.options.dynamical_decoupling.enable = True
sampler.options.dynamical_decoupling.sequence_type = "XpXm"

job = sampler.run([tqc], shots=4096)

With dynamical decoupling on and readout mitigation applied to the counts, the 01 and 10 fractions typically drop to 3–5% on Eagle r3: noticeably better, still nonzero.
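
The confusion-matrix inversion behind measurement-error mitigation can be sketched in a few lines of plain Python. This is a simplified illustration, not the Runtime implementation. It assumes each bit is misread independently, and the flip probabilities used below are hypothetical, not measured from any device. In practice you would estimate them by preparing known basis states and counting misreads.

```python
from itertools import product


def invert_2x2(m):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mitigate(counts, flip_probs):
    """Undo independent per-bit readout errors.

    counts:     dict such as {"00": 1827, ...}
    flip_probs: one (p_read1_given0, p_read0_given1) pair per character
                position in the bitstring keys, leftmost character first.
    Returns quasi-counts, which may come out slightly negative.
    """
    n = len(flip_probs)
    keys = ["".join(bits) for bits in product("01", repeat=n)]
    vec = {k: float(counts.get(k, 0)) for k in keys}
    for pos, (e01, e10) in enumerate(flip_probs):
        # Confusion matrix: column = true value, row = reported value.
        inv = invert_2x2([[1 - e01, e10], [e01, 1 - e10]])
        new = {}
        for k in keys:
            row = int(k[pos])
            new[k] = sum(inv[row][col] * vec[k[:pos] + str(col) + k[pos + 1:]]
                         for col in (0, 1))
        vec = new
    return vec


raw = {"00": 1827, "11": 1803, "01": 239, "10": 227}   # counts from the run above
assumed = [(0.02, 0.04), (0.02, 0.04)]                 # hypothetical flip probabilities
for key, value in mitigate(raw, assumed).items():
    print(key, round(value, 1))
```

Inverting per bit keeps the cost linear in the number of qubits, but it ignores correlated readout errors. Whatever remains in 01 and 10 after the correction is attributable to gate errors and decoherence rather than readout.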

The mental checklist for every real-hardware run

  1. Does the circuit work on a simulator? If not, stop.
  2. Is the calibration recent (last 24 hours)? Is readout error on the involved qubits under 2%? If not, try another backend or wait.
  3. Did you transpile with optimization_level=3 for the specific backend? The default is lower (1 or 2, depending on your Qiskit version) and usually leaves more gates in the circuit.
  4. Is your two-qubit gate count under ~100 for a meaningful result on today’s machines? If not, consider simulation or VQE-style ansatz truncation.
  5. Do you have at least 4096 shots? That bounds binomial noise on a probability estimate to about ±1.5% (two standard errors at p = 0.5).
  6. Did you apply measurement-error mitigation? It’s nearly free.
  7. Did you sanity-check the raw counts before plotting? Outlier shots can mask real effects.

Every real-hardware workflow follows this checklist. Violate any step and you’ll spend hours debugging an effect that isn’t there.
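
The shot-count item in the checklist follows from the binomial standard error, sqrt(p(1 - p)/N). As a sketch, the two helpers below (hypothetical names) convert between a shot count and the half-width of a two-standard-error interval, using the worst case p = 0.5 by default.

```python
import math


def half_width(shots: int, p: float = 0.5, z: float = 2.0) -> float:
    """Half-width of a z-standard-error interval for a binomial estimate."""
    return z * math.sqrt(p * (1 - p) / shots)


def shots_for(target: float, p: float = 0.5, z: float = 2.0) -> int:
    """Smallest shot count whose z-standard-error half-width is at most target."""
    return math.ceil(z * z * p * (1 - p) / target ** 2)


print(f"4096 shots -> ±{half_width(4096):.2%}")
print(f"±1% needs about {shots_for(0.01)} shots")
```

With 4096 shots the interval is roughly ±1.6%, and tightening it to ±1% takes about 10,000 shots. Precision improves only with the square root of the shot count.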

Full runnable example

End-to-end, from nothing to hardware-measured Bell state:

from qiskit import QuantumCircuit, transpile
from qiskit_ibm_runtime import QiskitRuntimeService, SamplerV2 as Sampler
from qiskit_aer import AerSimulator

# 1. Define the circuit
qc = QuantumCircuit(2, 2)
qc.h(0)
qc.cx(0, 1)
qc.measure([0, 1], [0, 1])

# 2. Simulator sanity check
ideal = AerSimulator().run(qc, shots=4096).result().get_counts()
print("Ideal:", ideal)

# 3. Pick a backend and transpile
service = QiskitRuntimeService()
backend = service.least_busy(operational=True, simulator=False, min_num_qubits=2)
tqc = transpile(qc, backend=backend, optimization_level=3)
print(f"Using backend: {backend.name}, transpiled ops: {dict(tqc.count_ops())}")

# 4. Run with dynamical decoupling (SamplerV2 has no resilience options;
#    apply readout mitigation to the counts afterwards if you need it)
sampler = Sampler(mode=backend)
sampler.options.dynamical_decoupling.enable = True
job = sampler.run([tqc], shots=4096)
print(f"Job submitted: {job.job_id()}")

# 5. Wait, fetch, compare
result = job.result()
counts = result[0].data.c.get_counts()
print("Real:", counts)

# 6. Report a single honest metric
total = sum(counts.values())
error_fraction = (counts.get("01", 0) + counts.get("10", 0)) / total
print(f"Off-diagonal fraction (should be 0): {error_fraction:.2%}")

Exercises

1. Read the QASM

Parse this QASM 3 circuit by hand and say what it does:

OPENQASM 3.0;
include "stdgates.inc";

qubit[3] q;
bit[3]   c;

h q[0];
cx q[0], q[1];
cx q[1], q[2];
c = measure q;

Answer:

Creates a 3-qubit GHZ state: H on qubit 0 produces $|+\rangle \otimes |00\rangle$; cascading CNOTs propagate to $\tfrac{1}{\sqrt{2}}(|000\rangle + |111\rangle)$; measurement collapses to either 000 or 111 with 50/50 probability.

2. Round-trip through QASM

Write a 5-qubit QFT circuit in Qiskit, export it to QASM 3, print the output, then load it back and verify the round-trip produces an equivalent circuit.

Answer:
from qiskit import QuantumCircuit, qasm3
from qiskit.circuit.library import QFT
from qiskit.quantum_info import Operator
import numpy as np

qc = QuantumCircuit(5)
qc.append(QFT(5), range(5))
src = qasm3.dumps(qc.decompose())
qc2 = qasm3.loads(src)
print(np.allclose(Operator(qc.decompose()).data, Operator(qc2).data))
# True

3. Estimate error contribution

You run a 100-CNOT circuit on a machine with 0.8% per-CNOT error, 0.05% per single-qubit-gate error, and 1.5% readout error per measured qubit. Estimate the probability of a “clean” circuit (no gate error), and the probability of a fully clean measurement for a 5-qubit readout.

Answer:

Gate-level: $0.992^{100} \approx 0.448$ (CNOTs) $\times\ 0.9995^{400} \approx 0.819$ (assuming 4 single-qubit gates per CNOT, 400 total) $\approx 0.367$. So about 37% of shots are free of gate errors. Readout: $0.985^5 \approx 0.927$, so about 93% of the time all 5 qubits are read out correctly. Joint probability of a perfect shot: $\sim 0.34$. In 4096 shots, you’d expect only about 1400 to reflect the ideal circuit. This is why shot budgets matter.
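
The same estimate as a short script, so you can plug in other error rates. It is a sketch under the exercise’s own assumptions (independent errors, 4 single-qubit gates per CNOT); the variable names are illustrative.

```python
# Error rates and gate counts from the exercise statement
cnot_error, sq_error, readout_error = 0.008, 0.0005, 0.015
n_cnot, n_sq, n_measured, shots = 100, 400, 5, 4096

# Probability that every gate succeeds, assuming independent errors
p_gates = (1 - cnot_error) ** n_cnot * (1 - sq_error) ** n_sq
# Probability that every measured qubit is read out correctly
p_readout = (1 - readout_error) ** n_measured
p_clean = p_gates * p_readout

print(f"no gate error:    {p_gates:.3f}")
print(f"clean readout:    {p_readout:.3f}")
print(f"fully clean shot: {p_clean:.3f} (about {p_clean * shots:.0f} of {shots} shots)")
```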

4. Write QASM 3 for a Bell measurement in the X basis

Write QASM 3 that prepares $|\Phi^+\rangle$ and measures both qubits in the X-basis (applying H before measurement).

Answer:
OPENQASM 3.0;
include "stdgates.inc";

qubit[2] q;
bit[2]   c;

h q[0];
cx q[0], q[1];
h q[0];
h q[1];
c = measure q;

Expected result on an ideal simulator: 00 or 11 with 50/50 probability (Bell state is correlated in the X basis too).
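
You can check this claim without any quantum library. The sketch below is a deliberately tiny state-vector simulation in plain Python that supports only H and CX. The helper names are hypothetical, and basis states are indexed so that qubit 0 is the rightmost bit, matching Qiskit’s bitstring order.

```python
import math


def apply_h(state, qubit):
    """Apply a Hadamard to `qubit` of a state vector indexed by basis-state integers."""
    s = 1 / math.sqrt(2)
    new = state[:]
    mask = 1 << qubit
    for i in range(len(state)):
        if not i & mask:                       # visit each (0, 1) pair once
            a, b = state[i], state[i | mask]
            new[i] = s * (a + b)
            new[i | mask] = s * (a - b)
    return new


def apply_cx(state, control, target):
    """Apply CNOT: flip `target` on every basis state where `control` is 1."""
    new = state[:]
    for i in range(len(state)):
        if i & (1 << control):
            new[i] = state[i ^ (1 << target)]
    return new


state = [1.0, 0.0, 0.0, 0.0]        # |00>
state = apply_h(state, 0)           # h q[0];
state = apply_cx(state, 0, 1)       # cx q[0], q[1];
state = apply_h(state, 0)           # rotate qubit 0 into the X basis
state = apply_h(state, 1)           # rotate qubit 1 into the X basis

probs = {format(i, "02b"): abs(amp) ** 2 for i, amp in enumerate(state)}
print(probs)
```

The probabilities land on 00 and 11 only, which confirms that the Bell state stays perfectly correlated after both qubits are rotated into the X basis.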

What you should take away

  • OpenQASM 3 is the vendor-neutral assembly language. Read it. Write it when portability matters.
  • Transpilation is non-negotiable. Always transpile against the specific backend at optimization_level=3.
  • Calibration drifts. Check $T_1$, $T_2$, and readout error before every real-hardware run.
  • Simulator first, hardware second, every time. This will save you dozens of hours over your first year.
  • Measurement-error mitigation and dynamical decoupling are nearly free wins; use both.
  • The 7-point pre-run checklist is the difference between useful experiments and noise harvesting.

That closes the Gates & Circuits track. You now have enough infrastructure to implement real algorithms. Next up in the Algorithms track: Deutsch-Jozsa (the original quantum speedup), Bernstein-Vazirani, Grover’s search, and the Quantum Fourier Transform — each derived from scratch with runnable code on both simulators and real hardware.

