Product & Mission | June 8, 2026 | 5 min read

The Bug That Was Already Fixed: A Lesson in Deployment Integrity

We opened a sprint to fix ~55 reported bugs and found most were already fixed in the repo — the live site was running behind it. The most important fix wasn't code; it was a process to never re-debug a shipped fix again.

We started a major bug-fix round with a list of ~55 reported failures and four console-log captures dated that day. The plan was root-cause first: fix the shared infrastructure, then each tool. Then we did the unglamorous step the plan demanded, confirming each hypothesis against the actual code before writing a fix, and found something humbling.

Most of the bugs were already fixed

Tool after tool, the fix the report asked for was already in the repository:

The upgraded vision model? Already wired, with a fallback.
The editor canvas not re-rendering on background colour change? The corrected effect was already there.
The video editor "opening to an upload wall"? It already mounted the editor program.
The "tool that just blurs the image"? It already applied a real transform.

The live console errors were real — but they weren't proof those fixes were absent. They were proof the deployed build was running behind the code. Fixes were committed but not fully shipped, or a CDN was serving stale bundles. We were about to re-debug, for the Nth time, code that was already correct.

How we proved it

Two checks settled it:

A live delivery audit. We fetched every model URL and runtime asset from production, sending our real origin header. All returned HTTP 200 with the correct Content-Length and valid CORS headers. The "dead CDN / 404 / 503" theory behind the headline bug was disproved at the network layer: model delivery was healthy.
A code-vs-symptom diff. For each reported symptom, we located the code path and checked whether the guard/fix existed. Overwhelmingly, it did.
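The live delivery audit can be sketched in a few lines of Python. This is a minimal illustration, not our actual script: `AssetExpectation`, `check_response`, and `audit` are hypothetical names, the expected size is assumed to be recorded at build time, and using a HEAD request (rather than a full download) is an assumption.

```python
from dataclasses import dataclass
from urllib.error import HTTPError
from urllib.request import Request, urlopen


@dataclass
class AssetExpectation:
    url: str         # hypothetical production URL of a model or runtime asset
    size_bytes: int  # size we expect, assumed to be recorded at build time


def check_response(expected, status, headers, origin):
    """Return a list of problems; an empty list means the asset passed."""
    # Header names are case-insensitive, so normalise before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    problems = []
    if status != 200:
        problems.append(f"status {status}")
    if h.get("content-length") != str(expected.size_bytes):
        problems.append(f"content-length {h.get('content-length')}")
    if h.get("access-control-allow-origin") not in ("*", origin):
        problems.append("CORS header missing or wrong origin")
    return problems


def audit(expected, origin):
    """Request headers from production with a real Origin header and check them."""
    req = Request(expected.url, method="HEAD", headers={"Origin": origin})
    try:
        with urlopen(req, timeout=10) as resp:
            status, headers = resp.status, dict(resp.headers.items())
    except HTTPError as err:
        # urlopen raises on 4xx/5xx; we still want to report those as findings.
        status, headers = err.code, dict(err.headers.items())
    return check_response(expected, status, headers, origin)
```

Keeping `check_response` free of network calls means the pass/fail rules can be unit-tested on their own, while `audit` stays a thin wrapper you run against production.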

Conclusion: the highest-value action wasn't another round of per-tool patches. It was to redeploy the current code, purge the cache, and then re-test against a build that actually contains the fixes.

The process change that ends the cycle

A bug that's "fixed" in the repo but not live will get re-reported, and re-debugged, indefinitely. So we wrote the rule into our postmortem: after every deploy, before concluding a fix "didn't work," verify it's actually live. Concretely:

Run the live delivery audit (every asset returns 200 with the correct size).
Open the tool with execution-provider telemetry on and confirm the new code path actually ran.
Only then, if the symptom persists on a confirmed-fresh build, treat it as an open bug.
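The three steps above amount to a small decision rule. Here is a minimal sketch of it; `Verdict` and `triage` are hypothetical names for illustration, and the three boolean inputs are assumed to come from the audit, the telemetry, and a manual re-test respectively.

```python
from enum import Enum


class Verdict(Enum):
    DELIVERY_BROKEN = "assets are not being served correctly"
    STALE_BUILD = "fix is in the repo but not live"
    OPEN_BUG = "symptom persists on a confirmed-fresh build"
    RESOLVED = "fix is live and the symptom is gone"


def triage(audit_passed: bool, new_path_ran: bool, symptom_persists: bool) -> Verdict:
    """Classify a report only after checking what is actually live."""
    if not audit_passed:
        # Step 1 failed: fix delivery before looking at application code.
        return Verdict.DELIVERY_BROKEN
    if not new_path_ran:
        # Step 2 failed: the deployed build does not contain the fix yet.
        return Verdict.STALE_BUILD
    # Step 3: only now does a persisting symptom count as an open bug.
    return Verdict.OPEN_BUG if symptom_persists else Verdict.RESOLVED
```

The ordering is the point: a report can only reach `OPEN_BUG` after both liveness checks pass, so nobody re-debugs code that was never deployed.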

We also added the observability to make this fast: per-session execution-provider (EP) telemetry, loud typed errors at failure boundaries, and integrity probes for model assets. The point is to make "is this fixed and live?" a question you can answer in seconds, not a thing you assume.
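As one illustration of an integrity probe paired with a loud typed error, here is a minimal sketch. It assumes the probe compares a SHA-256 digest of the downloaded bytes against a value recorded at build time; the post does not specify how our probes work, and `ModelIntegrityError` and `probe_asset` are hypothetical names.

```python
import hashlib


class ModelIntegrityError(Exception):
    """Raised when a model asset does not match its expected digest."""

    def __init__(self, asset: str, expected: str, actual: str):
        super().__init__(f"{asset}: expected sha256 {expected}, got {actual}")
        self.asset = asset
        self.expected = expected
        self.actual = actual


def probe_asset(asset: str, data: bytes, expected_sha256: str) -> None:
    """Fail loudly if the bytes we were served are not the bytes we built."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise ModelIntegrityError(asset, expected_sha256, actual)
```

A dedicated exception type lets the failure boundary report "stale or corrupt asset" as its own category instead of surfacing later as a vague runtime error.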

The takeaway

The most expensive bugs aren't always in the code — sometimes they're in the gap between the code and what's actually serving. If you find yourself fixing the same thing repeatedly, check that your last fix is deployed before you write the next one. The cheapest fix in this entire round was a redeploy and a habit.

deployment devops cdn process postmortem
