The Illusion of Speed
"I'm super quick at math."
"What's 5313 + 63232?"
"19."
"That's wrong."
"Yes — but it was fast."
AI-coding demos sell tokens-per-second. They don't count which tokens were correct.
Speed is one axis. Correctness is another.
quadrantChart
title Speed vs Correctness
x-axis Slow --> Fast
y-axis Wrong --> Correct
quadrant-1 Useful
quadrant-2 Slow but good
quadrant-3 Don't ship
quadrant-4 Looks fast, breaks prod
"Hand-written": [0.25, 0.78]
"Reviewed AI": [0.78, 0.82]
"Vibe-coded": [0.92, 0.20]
The bottom-right is where one-shot, no-review workflows land. Output looks correct — well-named, plausibly-shaped, surface tests pass. Bugs surface in subtle semantics, missing edge cases, plausible-but-wrong APIs, security holes in auth paths.
Time accounting
- Generation: minutes
- Discovery: hours to days
- Diagnosis: hours
- Fix: minutes to hours
Generation is the smallest cell. Optimising it while the others balloon is a local optimisation: one step gets faster while the total gets slower.
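To make the accounting concrete, here is a minimal sketch that puts numbers on the four cells. The hour values are hypothetical picks from inside the rough ranges above, not measurements, and the helper names `share_of_total` and `total_after_speedup` are invented for this illustration.

```python
# Hypothetical hour estimates chosen from the rough ranges in the list above.
# These are assumptions for illustration, not measured data.
phases_hours = {
    "generation": 0.1,  # "minutes"
    "discovery": 24.0,  # "hours to days"
    "diagnosis": 3.0,   # "hours"
    "fix": 1.0,         # "minutes to hours"
}


def share_of_total(phases, name):
    """Return the fraction of total wall-clock spent in one phase."""
    return phases[name] / sum(phases.values())


def total_after_speedup(phases, name, factor):
    """Return total hours if the named phase becomes `factor` times faster."""
    return sum(
        hours / factor if phase == name else hours
        for phase, hours in phases.items()
    )


baseline = sum(phases_hours.values())
gen_share = share_of_total(phases_hours, "generation")
faster_generation = total_after_speedup(phases_hours, "generation", 10)
faster_discovery = total_after_speedup(phases_hours, "discovery", 2)

print(f"baseline total: {baseline:.2f} h")
print(f"generation share: {gen_share:.1%}")
print(f"10x faster generation: {faster_generation:.2f} h")
print(f"2x faster discovery: {faster_discovery:.2f} h")
```

Under these assumed numbers, a tenfold speedup in generation barely moves the total, while merely halving discovery time removes hours. The exact figures will differ per team; the shape of the result is the point.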
What we want
Both at once. Not one. Two:
(1) Output meets the spec; bugs no more frequent than a competent human's.
(2) Wall-clock from idea → reviewed-and-shipped is shorter than doing it by hand.
(1) without (2) is ceremony. (2) without (1) is the joke at the top.
The lever
Structure, not raw speed. The Spec-Driven Workflow adds phases. On paper that's slower. In practice:
- Each phase is small enough to be one-shottable, so each phase is fast.
- Errors surface early as cheap clarifications, not late as expensive bugs.
- The compounding layer makes the next feature cheaper than the last.
Net wall-clock drops.
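A toy expected-cost model shows how more phases can still mean less wall-clock. Everything in it is an assumption made for illustration: the function `expected_hours`, the error probabilities, and the hour figures are hypothetical, and the model ignores the compounding effect entirely.

```python
def expected_hours(build_hours, error_prob, rework_hours):
    """Expected wall-clock: build time plus probability-weighted rework."""
    return build_hours + error_prob * rework_hours


# One-shot (assumed numbers): quick to generate, but an error is likely
# and surfaces late, so the rework is expensive.
one_shot = expected_hours(build_hours=0.5, error_prob=0.6, rework_hours=20.0)

# Phased (assumed numbers): four small phases, each as long as the whole
# one-shot run, so the build time on paper is four times larger. Errors are
# rarer per phase and are caught early as cheap clarifications.
phased = sum(
    expected_hours(build_hours=0.5, error_prob=0.2, rework_hours=0.5)
    for _ in range(4)
)

print(f"one-shot expected: {one_shot:.1f} h")
print(f"phased expected:   {phased:.1f} h")
```

With these inputs the phased path spends more time building and still finishes sooner in expectation, because the rework term dominates the one-shot path. Change the assumed probabilities and the gap changes with them.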
Symptoms you've landed in the bottom-right
- "Working" code nobody read end-to-end.
- Papercut bugs with a vague "the AI just kind of did that" origin.
- Tests pass; edges aren't exercised.
- PR descriptions don't match diffs.
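The "tests pass; edges aren't exercised" symptom is easy to reproduce. The sketch below uses a hypothetical helper, `average`, together with one surface test and one edge check; none of these names come from a real codebase.

```python
def average(values):
    """Plausible-looking helper: fine on the happy path, broken on empty input."""
    return sum(values) / len(values)


def surface_test():
    """The kind of test that passes without exercising any edge."""
    assert average([2, 4, 6]) == 4


def edge_is_handled():
    """Return True only if the empty-list edge case does not blow up."""
    try:
        average([])
    except ZeroDivisionError:
        return False
    return True


surface_test()  # passes, so the suite is green
edge_handled = edge_is_handled()
print(f"surface test passed; empty input handled: {edge_handled}")
```

The suite is green and the function still fails on the first empty input it meets in production. A reviewer reading the code end-to-end would likely ask about that case; a green checkmark will not.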
Cure isn't slower. Cure is smaller pieces and tighter leashes.
The cost of speed is the cost of being wrong fast.