2026-03-20 · Joe Henderson
Your AI Agents Are Drifting and Nobody Is Watching
The most expensive AI failure is not the one that crashes. It is the one that works perfectly for three months and then silently changes.
The agent you deployed in January is not the same agent running today.
You did not change the prompt. You did not change the scope. You did not change anything. But OpenAI updated the model. Or the upstream data shifted. Or someone on the team asked the agent to handle "just one more thing" without updating the documentation.
The output looks fine. It is still producing responses. The volume is still there. Nobody has complained. But the behavior has changed in ways that are invisible without measurement — and you probably do not have the measurement in place to catch it.
This is what Waypost calls a Veer. Not a crash. Not an error. Undetected drift from intended behavior. And it is the most expensive failure mode in AI-native operations because it compounds silently. Every output produced since the drift began is suspect. You just do not know it yet.
Why Veers Are Invisible
An error is visible. The agent throws an exception, produces garbled output, or fails to respond. Someone notices. You fix it. The incident has a clear start and end.
A Veer has no clear start. The agent does not know it is drifting. Language models do not have uncertainty indicators that reliably fire when their behavior shifts. They produce confident output whether they are on track or three degrees off course. The confidence is the camouflage.
The three most common causes are model updates (the vendor changed something underneath you), upstream data shifts (the inputs changed character and the agent did not adapt), and scope erosion (the team gradually expanded what the agent handles without updating any documentation). All three are normal. All three are catchable. None of them will be caught without a governance system designed to catch them.
What Detection Actually Looks Like
Detection is not "someone notices the output seems off." That is how Veers get caught after three months of compounding damage. Detection is a system that compares current behavior against documented expected behavior on a defined cadence.
In Waypost, this system is The Grid — a measurement framework that tracks specific Coordinates (metrics) against expected ranges. When a Coordinate drifts outside its expected range, a Position Check is triggered. The Position Check compares current behavior against The Groundwork — the four governing documents that define what the agent is supposed to do.
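The range check itself is mechanically simple. Here is a minimal sketch of what tracking a Coordinate against an expected range might look like; the `Coordinate` class, field names, and the example metric are illustrative assumptions, not part of the Waypost specification.

```python
from dataclasses import dataclass

@dataclass
class Coordinate:
    """A tracked metric and the range The Groundwork says is normal.

    Field names are illustrative assumptions, not a Waypost schema.
    """
    name: str
    expected_min: float
    expected_max: float

def position_check_needed(coord: Coordinate, observed: float) -> bool:
    """True when the observed value has drifted outside the expected range."""
    return not (coord.expected_min <= observed <= coord.expected_max)

# Hypothetical example: a refund agent's approval rate,
# expected to sit between 8% and 14%.
approval_rate = Coordinate("refund_approval_rate", 0.08, 0.14)

position_check_needed(approval_rate, 0.11)  # within range: no action
position_check_needed(approval_rate, 0.22)  # drifted: trigger a Position Check
```

The point of the sketch is the shape of the system, not the code: each Coordinate carries its own documented range, and crossing that range is what triggers the Position Check, rather than someone's gut feeling that the numbers look odd.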
The practical implementation is simpler than it sounds. testing.md contains golden vectors — known input-output pairs. Run those vectors on a schedule (weekly for high-consequence Posts, monthly for lower-stakes ones). If the output no longer matches, the Post has Veered. You caught it in a week instead of three months.
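A golden-vector run can be sketched in a few lines. The JSON schema below is an assumption (testing.md's vectors exported as name/input/expected records), and the exact-string comparison is a deliberate simplification: real checks on model output often need normalization, similarity thresholds, or rubric grading instead of strict equality.

```python
import json

def run_golden_vectors(vectors_path: str, agent) -> list[str]:
    """Replay known input-output pairs; return the names of vectors that no longer match.

    Assumes vectors_path points at a JSON export shaped like
    [{"name": ..., "input": ..., "expected": ...}, ...].
    `agent` is any callable that maps an input to an output.
    """
    with open(vectors_path) as f:
        vectors = json.load(f)

    failures = []
    for v in vectors:
        actual = agent(v["input"])
        if actual != v["expected"]:  # simplification: exact match
            failures.append(v["name"])

    # A non-empty list means the Post has Veered on those vectors.
    return failures
```

Wire this into whatever scheduler you already have (CI cron, a weekly job) at the cadence the Post's consequence level demands, and alert on a non-empty return value.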
The second detection layer is the weekly Bearing meeting. The first agenda item is Relays — agent escalations since the last meeting. But the Bearing also includes a Waypoint status check. If a Waypoint that depends on an agent is suddenly off-track, the first question should be: has the Post Veered?
The Question to Ask Yourself Right Now
For every AI agent running in your operation: when was the last time someone verified that its current behavior matches its intended behavior? Not checked that it was running. Verified that what it is producing today matches what it was supposed to produce when you deployed it.
If the answer is "never" or "I am not sure," you likely have Veers in your system right now. The question is not whether they exist. The question is how much damage they have accumulated while nobody was measuring.
The fix is not complicated. Define what correct looks like (scope.md and testing.md). Measure against it on a cadence. Review at the Bearing. That is the system. Everything else is running blind and hoping for the best.
The Groundwork Compiler generates scope.md and testing.md in ten minutes at [waypost.run/framework/groundwork](/framework/groundwork). Start with your highest-consequence agent.