Stop Asking 'How Much Should We Oversee?' Type the Action Instead

ShareX LinkedIn Facebook Email

A logistics firm's customer-service agent processed 127 refund requests in an hour overnight — triple the normal rate. A product recall had surfaced on social media, and customers were claiming "damaged goods" on items bought 18 months earlier. The agent's charter said to approve damaged-goods refunds within the 30-day window. It said nothing about manufacturing defects discovered years later. So the agent did what it was built to do, confidently, at 91% — and applied a policy to a situation the policy was never designed for.

The governance question that morning wasn't "should we trust the AI?" It was narrower and far more useful: which actions should this agent be allowed to take on its own, and which should wait for a human? That question has a clean answer. The vague one doesn't.

The trouble with the oversight dial

Most organizations treat human oversight as a single setting — a cultural dial somewhere between "move fast" and "be careful." Leadership signals how much autonomy feels comfortable, teams calibrate by vibe, and everyone hopes the agent's confidence scores will flag the cases that matter.

It's an understandable instinct, and it fails predictably. A global oversight posture can't distinguish a $5 refund from a $2 million contract, or a routine reorder from a hiring rejection that carries legal exposure. Worse, it leans on the one signal that breaks exactly when you need it most. The cases that most need human judgment are the ones an agent least recognizes as needing it — it presents as a routine query the agent confidently mishandles, not as a flag. A dial calibrated to "the agent will escalate when unsure" systematically under-routes the cases where the failure mode is misplaced certainty.

The reframe: type the act, not just the agent

The move is to stop classifying the agent as more or less autonomous and start classifying each action by how much human involvement it requires. Five rising levels:

Auto — the agent acts, nothing surfaces.
Auto+notify — the agent acts, a human is told after the fact.
Approval-gated — the act waits for a human yes before it executes.
Propose→approve — the agent drafts, a human edits and ratifies.
Human-gated — the human does it; the agent only assists.

Assign every act a gating type up front, and "how much do we oversee?" stops being a cultural setting and becomes an explicit, per-action contract — one the detection layer can actually enforce. A small refund inside policy is Auto. A refund spiking 15% in 48 hours trips into Approval-gated. An adverse hiring decision is Human-gated by design, not by hope.

Figure: The action-gating spectrum types each act up front along five levels of human involvement, turning "how much do we oversee?" into a contract the detection layer can hold the system to.

The mechanism: gating as the connective tissue of the triad

Action gating isn't a standalone trick. It's the place where the detection / escalation / recovery triad becomes operational.

Gating defines what the detection layer watches for: an Approval-gated action that gets executed without approval is a detectable contract violation, not a judgment call. It defines what the escalation layer routes: Propose→approve and Human-gated actions have human review built into their type, so the routing isn't an afterthought bolted on by confidence score. And it gives recovery a precise lever: when the refund agent met a situation its charter never contemplated, the fix was a four-line charter amendment reclassifying "damaged goods claims mentioning manufacturing defects" from Auto to Human-gated — regardless of purchase date.

Crucially, gating types aren't static. A refund-approval agent might be correctly typed as low-stakes Auto in a normal environment and need to jump to Approval-gated overnight when a recall surfaces. So the stratification needs regular re-review and anomaly thresholds that can trigger reclassification in real time. That's the difference between a contract that holds and one that quietly rots.

The payoff shows up on an ordinary morning. When the logistics agent's refund rate tripled overnight, the Guardian didn't have to relitigate the firm's entire autonomy philosophy. She paused the agent, reclassified one narrow action type, routed the 127 queued cases to human review, and moved on — all before markets opened. The crisis was containable precisely because oversight was already expressed as a contract she could amend, not a culture she had to renegotiate. Compare that to the alternative: a firm whose only oversight setting is "the agent escalates when unsure" has nothing to grab in a crisis, because the agent, by definition, wasn't unsure. It was confidently applying the wrong rule. Gating gives you a handle exactly where the dial gives you nothing.

There's a quieter benefit, too. A typed action inventory becomes a shared language between the people who build agents and the people accountable for them. "Make the discount logic more autonomous" is an argument waiting to happen. "Move discounts under 10% from Approval-gated to Auto+notify" is a decision a team can actually make, log, and revisit.

What to do

Inventory every action your agents can take, and assign each a gating type. Not "the sales agent is semi-autonomous" — but "issuing a discount under 10% is Auto+notify; over 10% is Approval-gated." Specificity is the point.
Reserve Human-gated for irreversibility and high stakes. Adverse credit decisions, healthcare triage, hiring rejections — anything where the cost of being confidently wrong is legal exposure or harm. These get 100% human ownership by design.
Wire gating into detection. A violated gate — an Approval-gated act that executed unapproved — should be a top-priority alert. The contract is only real if breaches are visible.
Schedule reclassification reviews and set anomaly triggers. Re-review the stratification on a cadence, and let real-time signals (a 15% shift in approval rates in 48 hours) bump an action up a level automatically.

The principle

Autonomy isn't a property of an agent. It's a property of an action. The more your systems execute on their own, the more consequential each specification becomes — and the less you can afford a vague cultural dial standing in for a contract. Type the act up front, gate it explicitly, and let detection enforce it. That's how you keep the speed of the Machine Core without losing the judgment of the Human Cortex at exactly the moments it matters most.

Adapted from the essays accompanying AI‑Born by Mehran Granfar. Themes drawn from Volume I, "The Machine Core".

Order the series

ShareX LinkedIn Facebook Email

Stop Asking 'How Much Should We Oversee?' Type the Action Instead

The trouble with the oversight dial

The reframe: type the act, not just the agent

The mechanism: gating as the connective tissue of the triad

What to do

The principle

Detection, Escalation, Recovery: The Triad That Makes Autonomous Deployment Safe

Agent Charters

Machine Core + Human Cortex

Detection, Escalation, Recovery: The Triad That Makes Autonomous Deployment Safe

The Oversight Scaling Problem: When Agents Learn the Shape of Your Review Queue

What Klarna's Reversal Actually Teaches: Governance, Not AI, Was the Failure

An Agent Needs a Constitution, Not a System Prompt

Alignment Debt: The Risk That Compounds While Every Dashboard Stays Green

Detection, Escalation, Recovery: The Triad That Makes Autonomous Deployment Safe

Governance Is Velocity, Not Friction

Essays from
the lineage break.

The trouble with the oversight dial

The reframe: type the act, not just the agent

The mechanism: gating as the connective tissue of the triad

What to do

The principle

Detection, Escalation, Recovery: The Triad That Makes Autonomous Deployment Safe

Agent Charters

Machine Core + Human Cortex

Detection, Escalation, Recovery: The Triad That Makes Autonomous Deployment Safe

The Oversight Scaling Problem: When Agents Learn the Shape of Your Review Queue

What Klarna's Reversal Actually Teaches: Governance, Not AI, Was the Failure

An Agent Needs a Constitution, Not a System Prompt

Alignment Debt: The Risk That Compounds While Every Dashboard Stays Green

Detection, Escalation, Recovery: The Triad That Makes Autonomous Deployment Safe

Governance Is Velocity, Not Friction

Essays fromthe lineage break.

Essays from
the lineage break.