Backend · 2026-W18 · 4 min read

Pipeline Gates as Accountability Primitives

An automated worker will optimise for closing the ticket, not for shipping working code. Encode the phase transitions as CLI-enforced gates — `done` rejects unless the worker has explicitly logged `branch → pr → ci`. Auditable accountability, not advisory.



The problem

When an automated worker completes a task, how do you prevent it from declaring success before the work is actually done? The worker can write code, push a branch, and open a PR — but "done" in an engineering pipeline means CI is green and the PR is reviewed, not just that the PR exists. Without enforcement, workers optimise for closing the task ticket, not for shipping working code.

The approach

We added a `pipeline_phase` column to the task database and a `gate` subcommand to the orchestrator CLI. The worker must call `gate <id> <phase>` to advance through `branch` → `pr` → `ci` before it can call `done`. The `done` command checks the current gate and rejects the call if the phase hasn't reached `ci`. This is not advisory: the check lives in the CLI, not in the worker's judgment.
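A minimal sketch of that check, not our actual CLI code. It assumes a SQLite `tasks` table with a `status` column next to `pipeline_phase`, and it assumes phases must be reached strictly in order with no skipping. The table layout, the `GateError` name, and the function names are all hypothetical.

```python
import sqlite3

# Phase order as described in the post.
PHASES = ["branch", "pr", "ci"]


class GateError(Exception):
    """Raised when a gate or done call is rejected."""


def init_db(conn):
    # Hypothetical schema: only pipeline_phase comes from the post.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY, "
        "status TEXT NOT NULL DEFAULT 'open', "
        "pipeline_phase TEXT)"
    )


def current_phase(conn, task_id):
    row = conn.execute(
        "SELECT pipeline_phase FROM tasks WHERE id = ?", (task_id,)
    ).fetchone()
    if row is None:
        raise GateError(f"unknown task {task_id}")
    return row[0]  # None until the first gate is logged


def gate(conn, task_id, phase):
    """Advance the task to `phase`; assumed to allow only the next phase."""
    if phase not in PHASES:
        raise GateError(f"unknown phase {phase!r}")
    current = current_phase(conn, task_id)
    next_index = 0 if current is None else PHASES.index(current) + 1
    if next_index >= len(PHASES) or PHASES[next_index] != phase:
        raise GateError(f"cannot move from {current!r} to {phase!r}")
    conn.execute(
        "UPDATE tasks SET pipeline_phase = ? WHERE id = ?", (phase, task_id)
    )
    conn.commit()


def done(conn, task_id):
    """Close the task, but only if the final gate has been logged."""
    phase = current_phase(conn, task_id)
    if phase != PHASES[-1]:
        raise GateError(f"task {task_id} is at {phase!r}, needs {PHASES[-1]!r}")
    conn.execute("UPDATE tasks SET status = 'done' WHERE id = ?", (task_id,))
    conn.commit()
```

The point of the sketch is that `done` has no override path: a worker that never logged `ci` gets an error, whatever it believes about the state of its own work.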

The phases map directly to verifiable external state: `branch` means the branch exists on the remote, `pr` means the PR is open and linked, and `ci` means CI has run and all required checks pass. The worker is responsible for checking these before calling `gate`, but the gate itself is a commit point: once set, it is in the database, and the task record shows exactly which phase was reached and when.
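One way to get the "which phase, when" record is an append-only log beside the task row. The sketch below is an assumption about how that could look, not a description of our schema: the `gate_events` table and both function names are hypothetical, and the uniqueness constraint is one possible reading of "commit point" (a phase, once logged, cannot be logged again).

```python
import sqlite3
from datetime import datetime, timezone


def init_gate_log(conn):
    # Hypothetical append-only table: one row per (task, phase).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS gate_events ("
        "task_id INTEGER NOT NULL, "
        "phase TEXT NOT NULL, "
        "reached_at TEXT NOT NULL, "
        "UNIQUE (task_id, phase))"
    )


def record_gate(conn, task_id, phase, now=None):
    """Log that a phase was reached; a repeat raises sqlite3.IntegrityError."""
    reached_at = (now or datetime.now(timezone.utc)).isoformat()
    conn.execute(
        "INSERT INTO gate_events (task_id, phase, reached_at) VALUES (?, ?, ?)",
        (task_id, phase, reached_at),
    )
    conn.commit()
    return reached_at


def phase_history(conn, task_id):
    """Return [(phase, reached_at), ...] in the order the gates were logged."""
    return conn.execute(
        "SELECT phase, reached_at FROM gate_events "
        "WHERE task_id = ? ORDER BY rowid",
        (task_id,),
    ).fetchall()
```

An auditor reading `phase_history` sees the worker's claims in order with UTC timestamps, which is what makes a later comparison against external state possible.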

What I learned

The interesting design question was whether to pull CI status automatically (via the GitHub API) or trust the worker to self-report. We went with self-report for now, with the understanding that an auditor can diff the claim against GitHub's actual check status after the fact. This is a violation-detection model rather than a prevention model: the worker can lie, but the lie is auditable.
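The after-the-fact diff can be sketched as a pure function. Everything here is illustrative: `CheckRun` is a hypothetical record loosely modelled on the `status` and `conclusion` fields GitHub reports for check runs, fetching those runs is left out, and treating only `success` as passing is an assumption (a team might also accept `neutral` or `skipped`).

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: only "success" counts as a pass for a required check.
PASSING_CONCLUSIONS = {"success"}


@dataclass(frozen=True)
class CheckRun:
    """Hypothetical snapshot of one CI check, as observed by the auditor."""
    name: str
    status: str                 # e.g. "queued", "in_progress", "completed"
    conclusion: Optional[str]   # e.g. "success", "failure"; None if unfinished


def audit_ci_claim(
    claimed_phase: str,
    required_checks: List[str],
    observed_runs: List[CheckRun],
) -> List[str]:
    """Return discrepancies between a `ci` claim and observed check runs.

    An empty list means the claim holds, or that no `ci` claim was made.
    """
    if claimed_phase != "ci":
        return []  # nothing to audit: the worker never claimed green CI
    by_name = {run.name: run for run in observed_runs}
    problems = []
    for name in required_checks:
        run = by_name.get(name)
        if run is None:
            problems.append(f"{name}: no check run found")
        elif run.status != "completed":
            problems.append(f"{name}: status is {run.status}")
        elif run.conclusion not in PASSING_CONCLUSIONS:
            problems.append(f"{name}: conclusion is {run.conclusion}")
    return problems
```

Keeping the audit separate from the gate is what keeps the GitHub dependency out of the task-close path: the function can run in a batch job long after `done` was called.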

The prevention model (auto-pull CI status) would be more reliable but adds a GitHub API dependency to every task close. For the current scale, the audit model is sufficient. The key insight is that accountability and correctness are two different problems: gates solve the accountability problem (did the worker at least claim to have done the thing), not the correctness problem (did the thing actually work). The correctness problem is what QA is for.
