A lot of agent workflow advice quietly assumes the hard part is getting the model to do more work.
I don't think that's the hard part.
The hard part is keeping different kinds of work from contaminating each other.
Context bleeds
Planning, implementation, and review need different contexts, different constraints, and different standards of proof. Collapse them all into one long-running session and the boundaries blur. Reviews inherit assumptions from implementation. Plans get dragged down into local detail too early. Things that were never really decided start reading as if they were.
Nothing fails all at once. The workflow just gets sloppier. More suggestible. Less trustworthy.
I notice this in my own cognition too. If I can't sleep at night, it's usually because my context window is full. Too many threads, too many unresolved things competing for attention. So I dump everything out on paper or into a note. Offload what I don't need right now. That lets some part of me let go, and then I'm off to dreamland.
When I'm working and catch myself thinking about something unrelated to the task at hand, same problem, same fix. Compact the context. Externalize what doesn't belong here. Come back to it when it's the thing that matters.
The models have the same problem, and it shows up the same way. I use Opus with a million-token context window, and somewhere around 250k tokens the session starts to drift. Not catastrophically. It just feels different. Less precise. More willing to go along with things. Behaviors creep in that aren't accounted for in my prompt scaffolding, like the model is filling its responses from an increasingly noisy signal.
So I force compaction or reinitialize the session entirely, the same way I clear my own head.
The pattern is the same in both directions: a context window that's full of everything is good at nothing in particular.
"Pre-existing design consideration"
One of the things I watch for in agent-assisted review is language that sounds settled but isn't.
I was reviewing code with Claude recently. The main session keeps its context lean and delegates detailed review to sub-agents. One of the findings came back tagged as a "pre-existing design consideration." The main session was ready to wave it off.
So I pushed back.
"Is this actually a design consideration, or is it an incidental artifact? Have we really considered it? If not, track it in a follow-up issue. We shouldn't dismiss it out of hand."
The session went and checked our PRs, issues, and ADRs. Came back and acknowledged that no, we hadn't actually considered it. It wasn't a decision. It was just a thing that existed and hadn't been questioned.
So it created a follow-up task and updated a persistent memory so future reviews would be less likely to handwave the same class of issue away.
That's the part I care about. Not that the model got something wrong. Of course it did. They all do. I do too, regularly. What matters is whether the workflow turns pushback into verification, and verification into a persistent change in behavior.
That's the gate doing its job.
What the gate actually is
The structure is simple: a lean primary context where decisions get made, and separate contexts where detailed work fans out and comes back.
When work crosses a context boundary, implicit assumptions get exposed. Things that "felt true" inside one thread have to be re-evaluated when they arrive somewhere else.
That alone catches a surprising number of issues. Not because the model is stupid, but because assumptions are invisible until they have to be restated.
And when review happens in a different context from implementation, it's actually review. Not continuation. Not rubber-stamping with extra steps.
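The fan-out shape can be sketched in a few lines. This is purely illustrative: `Context` and `run_phase` are hypothetical stand-ins, not any real agent framework, and the stub just echoes instead of calling a model. The point is the boundary, only a brief crosses it, and only a conclusion comes back.

```python
from dataclasses import dataclass, field

@dataclass
class Context:
    name: str
    messages: list = field(default_factory=list)

def run_phase(name: str, brief: str) -> str:
    """Run one phase of work in a fresh, isolated context.

    Only the brief crosses the boundary; none of the caller's
    conversation history leaks in, so assumptions must be restated.
    """
    ctx = Context(name)
    ctx.messages.append(brief)
    # A real system would call a model here; this stub just echoes.
    return f"[{name}] conclusion for: {brief}"

# The primary context stays lean: it collects conclusions, not transcripts.
primary = Context("primary")
plan = run_phase("plan", "outline the change")
impl = run_phase("implement", plan)
review = run_phase("review", impl)  # review never sees implementation chatter
primary.messages.append(review)     # one summary, not three transcripts
```

Because the review phase starts from a clean context, it can only judge what was explicitly handed to it, which is exactly what makes it review rather than continuation.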
The separation helps, but it doesn't make the calls for you. I still get it wrong sometimes. What matters is that false positives and missed findings both feed back into the practice, so the same mistake is less likely to happen again.
The gate is only as good as the judgment behind it. The system can surface information, separate concerns, verify claims. It can't decide what matters. That part is still mine.
It's why I invest so much energy in clear specifications, explicit decision records, and guidance that makes the right call easier to reach before I'm the one who has to make it.
None of this means the human does less work. It means the work changes shape.
The work changes shape
Less holding every detail in your head. Less reasoning inside an ever-growing context that's slowly poisoning its own conclusions. More deciding what matters. More maintaining coherence across phases that each have their own pressures pulling them in different directions.
I used to tell myself to "slow down to speed up": handle the planning and tracking work, and create space for thinking, rather than diving straight into the details. The detailed, deep dives are what make me feel like a kid playing with her favorite toys again. The rest always felt like being told it was nap time.
Now the agents handle the detailed implementation planning, the tracking, the documentation. The clerical layer that I was never going to be disciplined enough to maintain on my own. What's left is the part I was always good at: holding a system in my head just long enough to see where the fracture is, and then making the call.
That's the gate. It's the job I was already doing, with less friction around it.
But I don't think the gate is permanent.
Where the gate goes
There's real momentum behind fully autonomous development. Throw agents at the problem, never look at the code, ship what comes out. I'm not going to pretend that trajectory doesn't exist, or that it won't work eventually. The tooling is moving fast and the economics point in one direction.
But "eventually" is doing a lot of heavy lifting in that sentence.
Right now, the verification layer isn't there. The models hallucinate. The reviewers hallucinate. The tests that would catch the hallucinations often don't exist yet, and when they do, they may have been written by the same models that wrote the code.
The failure modes are subtle and correlated in ways that make them hard to catch from the outside.
So the interesting question isn't whether the human stays at the gate forever. It's what has to be true before the gate can open wider.
I think the answer is correctness infrastructure. Property-based testing that exercises code paths the author didn't imagine. Fuzzing that finds the edges nobody specified. Verification tooling that makes it hard to ship something broken, not because a human is watching, but because the system itself won't let you.
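A minimal sketch of the property-based idea, using only the standard library. Real tools like Hypothesis do this far better, with shrinking and smarter generation; `normalize` and the generator here are toy examples I've made up, not anything from a real project. The key move is stating a property that must hold for all inputs, then throwing generated inputs at it.

```python
import random

def normalize(path: str) -> str:
    """Toy function under test: collapse repeated slashes."""
    while "//" in path:
        path = path.replace("//", "/")
    return path

def random_path(rng: random.Random, max_len: int = 20) -> str:
    """Generate paths the author never wrote down by hand."""
    return "".join(rng.choice("/ab") for _ in range(rng.randrange(max_len + 1)))

rng = random.Random(0)  # seeded so a failure is reproducible
for _ in range(1000):
    p = random_path(rng)
    out = normalize(p)
    assert "//" not in out        # property: no double slash survives
    assert normalize(out) == out  # property: normalization is idempotent
print("1000 generated cases passed")
```

The properties, "no double slash survives" and "idempotence", say nothing about any specific input, which is what lets the test reach code paths no hand-written case would.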
And underneath that, trust signals. The boring supply-chain hygiene that tells you where things came from and whether they've been tampered with. I've been weaving this into my own projects already, not because I'm building for scale, but because the discipline of getting it right at small scale is how you learn what "right" looks like:
- SBOMs and provenance attestation on builds
- Reproducible builds and signed commits
- Attestation chains you can actually follow
- Dependency auditing that runs on every push, not when someone remembers to check
- Pin-hashed action references so your CI can't be hijacked by a compromised upstream tag
- Linting and formatting so code stays clean and parseable
- A dedicated bot identity for release automation so the provenance chain stays clean
These aren't glamorous. They're the kind of thing that only matters when something goes wrong, which is exactly when it matters most. Each piece exists because I thought about what could go wrong and decided I'd rather not find out the hard way.
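To make one of those items concrete, here is a hypothetical checker for the pin-hashing rule: flag GitHub Actions `uses:` lines that reference a mutable tag instead of a full 40-character commit SHA. The regexes and the sample workflow are illustrative, not taken from any real repository.

```python
import re

# A pinned reference ends in a full 40-hex-character commit SHA.
PINNED = re.compile(r"uses:\s*[\w./-]+@[0-9a-f]{40}\b")
USES = re.compile(r"uses:\s*\S+")

def unpinned_uses(workflow_text: str) -> list:
    """Return `uses:` lines not pinned to a full commit SHA."""
    bad = []
    for line in workflow_text.splitlines():
        if USES.search(line) and not PINNED.search(line):
            bad.append(line.strip())
    return bad

sample = """
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-python@0a5c61591373685505ea898e09a3ea4f39ef2b9c
"""
print(unpinned_uses(sample))  # only the tag-pinned checkout line is flagged
```

A tag like `@v4` can be moved to point at new code by whoever controls the upstream repo; a commit SHA can't, which is the whole point of pinning.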
The sane path forward is to build tools that make it hard to do anything but the right thing.
I don't know exactly what that looks like yet. I know what it looks like to build without it, and I know what the early pieces feel like when they start fitting together. That's enough to keep going.
I want to work on the foundation that safely lowers friction at the gate.