The Illusion of Speed

The Illusion of Speed

"I'm super quick at math."

"What's 5313 + 63232?"

"19."

"That's wrong."

"Yes — but it was fast."

AI-coding demos sell tokens-per-second. They don't count which tokens were correct.

Speed is one axis. Correctness is another.

quadrantChart
    title Speed vs Correctness
    x-axis Slow --> Fast
    y-axis Wrong --> Correct
    quadrant-1 Useful
    quadrant-2 Slow but good
    quadrant-3 Don't ship
    quadrant-4 Looks fast, breaks prod
    "Hand-written": [0.25, 0.78]
    "Reviewed AI": [0.78, 0.82]
    "Vibe-coded": [0.92, 0.20]

The bottom-right is where one-shot, no-review workflows land. Output looks correct — well-named, plausibly-shaped, surface tests pass. Bugs surface in subtle semantics, missing edge cases, plausible-but-wrong APIs, security holes in auth paths.

Time accounting

  • Generation: minutes
  • Discovery: hours to days
  • Diagnosis: hours
  • Fix: minutes to hours

Generation is the smallest cell. Optimising it while the others balloon is a local-maximum mistake.

What we want

Both at once. Not one. Two:

  1. Output meets the spec; bugs no more frequent than a competent human's.
  2. Wall-clock from idea → reviewed-and-shipped is shorter than doing it by hand.

(1) without (2) is ceremony. (2) without (1) is the joke at the top.

The lever

Structure, not raw speed. The Spec-Driven Workflow adds phases. On paper that's slower. In practice:

  • Each phase is small enough to be one-shottable, so each phase is fast.
  • Errors surface early as cheap clarifications, not late as expensive bugs.
  • The compounding layer makes the next feature cheaper than the last.

Net wall-clock drops.

Symptoms you've fallen in

  • "Working" code nobody read end-to-end.
  • Papercut bugs with a vague "the AI just kind of did that" origin.
  • Tests pass; edges aren't exercised.
  • PR descriptions don't match diffs.

Cure isn't slower. Cure is smaller pieces and tighter leashes.

The cost of speed is the cost of being wrong fast.