Architecture · Vol I · Ch 5

The PRAL Loop

Perceive, Reason, Act, Learn — the four-stage cycle that defines what makes an agent an agent rather than a tool, and the unit of work inside the Machine Core.

Definition

The PRAL Loop is the four-stage cycle by which an autonomous agent operates: Perceive (enrich context from the environment), Reason (analyze scenarios and evaluate options against goals and constraints), Act (execute via APIs and tools within boundaries you define), and Learn (refine from outcomes so the next cycle improves). The cycle runs continuously. PRAL is the taxonomic line between a tool and an agent: a tool executes instructions; an agent perceives, reasons, acts, and learns on its own.

Why it exists / the problem it solves

The word "agent" gets applied to everything from a chatbot to a macro, which makes it nearly useless as an engineering category. PRAL exists to draw the line precisely. The distinction in Chapter 5 is operational, not rhetorical: a tool waits for a command and runs it; an agent closes its own loop. It senses that something has changed, decides what to do about it within its mandate, takes the action, and folds the result back into how it will behave next time. Without that closed loop you have automation. With it, you have a worker — which is exactly why the rest of the AI-Born operating system (charters, escalation, governance) is needed at all.

PRAL also gives the [[machine-core-human-cortex|Machine Core]] a common unit of work. Every specialist agent, however different its domain, runs the same four-beat cycle. That uniformity is what lets thousands of agents be coordinated, monitored, and held to account through one shared mental model.

There is a lineage here worth naming, because it explains why the loop has the shape it does. PRAL is a descendant of John Boyd's OODA loop — Observe, Orient, Decide, Act — developed for fighter-pilot decision-making in the 1960s through 1980s and since absorbed into wide swaths of decision theory. Boyd's insight was that an actor who cycles through the loop faster than an adversary operates inside the adversary's decision loop, acting on a reality the opponent hasn't yet registered. PRAL inherits that intuition and adds the stage Boyd's pilots could not formalize: Learn. A pilot improves across a career; an agent folds the outcome of each cycle back into its behavior within the cycle itself. That fourth beat is what turns a fast decision-maker into a compounding one.

Anatomy

The loop has four stages, each with a specific job:

  • Perceive — context enrichment. The agent gathers the inputs it needs to act: customer history, the current policy version, known issue patterns, live signals from the environment.
  • Reason — scenario analysis. The agent evaluates resolution options against its goals and the constraints in its [[agent-charters|charter]].
  • Act — execution. The agent takes action through APIs and tools, but only within its chartered authority (issue a credit if justified; escalate if the situation requires authority it doesn't hold).
  • Learn — refinement. The outcome feeds back, improving the agent's future performance.

Chapter 5's diagram (Figure 5.3) adds two structural details that matter in production: a nested retry sub-loop for handling action failure, and two escalation paths to the Human Cortex — one triggered by repeated action failure, the other by output that drifts from chartered intent. The loop, in other words, is not a tidy circle that always completes. It knows how to retry, and it knows when to stop and hand the decision to a human.

Figure: The four-beat agent cycle in production. The loop is not a tidy circle: a retry sub-loop handles action failure, and two escalation paths hand the decision to a human when actions keep failing or when output drifts from chartered intent.
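
The control flow described above (four stages, a retry sub-loop, and two escalation paths) can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: `PralAgent`, `CycleResult`, `ActionFailed`, `within_intent`, and the `max_attempts` threshold are all hypothetical names chosen for the example, and the Learn stage is reduced to appending the outcome to an in-memory list.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


class ActionFailed(Exception):
    """Raised by a (hypothetical) tool call when an action does not go through."""


@dataclass
class CycleResult:
    status: str                   # "completed" or "escalated"
    reason: Optional[str] = None  # why the cycle was handed to a human, if it was
    attempts: int = 0


@dataclass
class PralAgent:
    perceive: Callable[[dict], dict]        # signal -> enriched context
    reason: Callable[[dict], dict]          # context -> plan
    act: Callable[[dict], dict]             # plan -> outcome, may raise ActionFailed
    within_intent: Callable[[dict], bool]   # assumed check of outcome against charter
    max_attempts: int = 3                   # assumed threshold for "repeated failure"
    memory: List[dict] = field(default_factory=list)

    def run_cycle(self, signal: dict) -> CycleResult:
        context = self.perceive(signal)
        plan = self.reason(context)
        # Retry sub-loop: re-attempt the action until it succeeds or attempts run out.
        for attempt in range(1, self.max_attempts + 1):
            try:
                outcome = self.act(plan)
            except ActionFailed:
                continue
            # Escalation path 2: the action worked but drifts from chartered intent.
            if not self.within_intent(outcome):
                return CycleResult("escalated", "intent_drift", attempt)
            # Learn: keep the outcome so later cycles can draw on it.
            self.memory.append({"plan": plan, "outcome": outcome})
            return CycleResult("completed", None, attempt)
        # Escalation path 1: every attempt failed.
        return CycleResult("escalated", "repeated_action_failure", self.max_attempts)


# Usage: an action that fails twice, then succeeds on the third attempt.
calls = {"n": 0}

def flaky_act(plan: dict) -> dict:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ActionFailed("tool timeout")
    return {"credit": plan["credit"]}

agent = PralAgent(
    perceive=lambda s: {"customer": s["customer"]},
    reason=lambda ctx: {"credit": 20},
    act=flaky_act,
    within_intent=lambda outcome: outcome["credit"] <= 50,
)
result = agent.run_cycle({"customer": "c-1"})
print(result.status, result.attempts)
```

Keeping the two escalation reasons distinct in the result is a deliberate choice in this sketch: a human receiving the hand-off needs to know whether the tools are failing or the agent's output has drifted.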

How it works in practice

Chapter 5 traces a single inquiry through Sierra's operation. A customer inquiry arrives. A customer-service [[vp-agent-architecture|VP-Agent]] receives the intake signal and routes the inquiry, based on complexity, customer tier, and issue type, to an appropriate specialist agent. The specialist then runs PRAL in full: it perceives context (customer history, current policy version, known issue patterns), reasons about resolution options, acts within its chartered authority (issue a credit if justified, escalate if the situation needs authority it lacks), and learns from the outcome. The resolution then feeds back through the Evaluate stage of the [[ipre-pipeline|IPRE Pipeline]]. If outcomes match intent, the cycle completes normally. If the VP-Agent's chartered objectives are producing unintended consequences (for example, resolution speed is up but repeat contacts are also up, suggesting superficial fixes), the evaluator flags the tension for the Human Cortex.
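
The evaluator's check in this example, faster resolutions paired with more repeat contacts, can be expressed as a simple comparison between two reporting periods. The sketch below is illustrative only: `PeriodMetrics`, `flags_tension`, the metric names, and the `tolerance` parameter are assumptions made for the example, not part of the IPRE Pipeline as the book defines it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodMetrics:
    median_resolution_minutes: float  # lower is faster
    repeat_contact_rate: float        # fraction of resolved inquiries that return


def flags_tension(before: PeriodMetrics, after: PeriodMetrics,
                  tolerance: float = 0.0) -> bool:
    """Return True when speed improved while repeat contacts rose.

    `tolerance` (hypothetical) is how much of a rise in repeat contacts
    is treated as noise before the tension is flagged for a human.
    """
    faster = after.median_resolution_minutes < before.median_resolution_minutes
    more_repeats = after.repeat_contact_rate > before.repeat_contact_rate + tolerance
    return faster and more_repeats


# Usage: resolutions got faster, but more customers came back.
last_month = PeriodMetrics(median_resolution_minutes=12.0, repeat_contact_rate=0.08)
this_month = PeriodMetrics(median_resolution_minutes=7.0, repeat_contact_rate=0.15)
print(flags_tension(last_month, this_month))
```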

The architecture earlier in Chapter 5 grounds the same loop in a 3.2-second decision cycle for an individual service agent: perception of the situation, decision among options, action through tools, and a memory of the outcome that sharpens the next encounter. The point of the worked example is that PRAL is not an abstraction layered on top of the work — it is the work, repeated thousands of times a day across the Machine Core.

The taxonomy this loop draws becomes clearer at the boundaries. A spreadsheet macro perceives nothing and learns nothing; it executes. A dashboard alert perceives and acts, but reasons trivially and never learns. A recommendation engine reasons and even learns in aggregate, but does not act autonomously in the world. Only when all four stages are present — and present for the same agent, on the same task — does the system cross from tool to agent. That crossing is not a marketing distinction. It changes what governance you owe the system: a tool needs testing, but an agent needs a [[agent-charters|charter]], because an entity that decides and acts on its own can do harm no test anticipated, and learn its way deeper into that harm before anyone notices.
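
The four-stage test can be written down as a small checklist. The following is a sketch under stated assumptions: `StageAudit` and its methods are hypothetical names, and the boolean values for the three boundary cases encode the paragraph above, with "reasons trivially" counted as not reasoning and the recommendation engine assumed to perceive its inputs (the text does not say either way).

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class StageAudit:
    perceives: bool
    reasons: bool
    acts: bool
    learns: bool

    def missing(self) -> List[str]:
        """Names of the stages this system lacks."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def classify(self) -> str:
        """'agent' only when all four stages are present, otherwise 'tool'."""
        return "agent" if not self.missing() else "tool"


# The boundary cases from the text, encoded as audits.
spreadsheet_macro = StageAudit(perceives=False, reasons=False, acts=True, learns=False)
dashboard_alert = StageAudit(perceives=True, reasons=False, acts=True, learns=False)
# perceives=True is an assumption; the text only says it reasons, learns, and does not act.
recommendation_engine = StageAudit(perceives=True, reasons=True, acts=False, learns=True)
specialist_agent = StageAudit(perceives=True, reasons=True, acts=True, learns=True)

for name, audit in [
    ("spreadsheet macro", spreadsheet_macro),
    ("dashboard alert", dashboard_alert),
    ("recommendation engine", recommendation_engine),
    ("specialist agent", specialist_agent),
]:
    print(name, audit.classify(), audit.missing())
```

Reporting the missing stages, rather than only the verdict, makes the audit actionable: it shows which part of the loop would have to close before the system earns a charter.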

How to apply it

  1. Audit your "agents" against the four stages. If a system cannot perceive change, reason about options, act on its own, and learn from the result, it is a tool. Calling it an agent will lead you to over-trust it.
  2. Design the Act stage against the charter, not against capability. What an agent can do and what it is authorized to do are different. Bind Act to the decision-rights ladder in its [[agent-charters|charter]].
  3. Build the two escalation paths in from the start. Decide in advance what counts as repeated action failure and what counts as drift from intent — and route each to a human deliberately, not by accident.
  4. Wire the Learn stage into the Evaluate stage. A loop that acts but never learns is automation wearing an agent's name. Connect outcomes back through the [[ipre-pipeline|IPRE Pipeline]] so the organization, not just the agent, learns.
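
Step 2 above, binding Act to authorization rather than capability, might look like the following in code. This is a minimal sketch with assumed structure: the `Charter` class, its per-action `limits` mapping, and the `bounded_act` helper are hypothetical, and a real decision-rights ladder would likely carry more than a single numeric ceiling per action.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Charter:
    # Hypothetical shape: action name -> largest amount the agent may commit alone.
    limits: Dict[str, float]


def bounded_act(charter: Charter, action: str, amount: float,
                execute: Callable[[str, float], None]) -> str:
    """Run the action only if the charter authorizes it; otherwise escalate.

    The check is against what the charter permits, not against what
    `execute` is technically able to do.
    """
    limit = charter.limits.get(action)
    if limit is None or amount > limit:
        return "escalate"
    execute(action, amount)
    return "executed"


# Usage: the agent may issue credits up to 50 and has no refund authority.
ledger = []
charter = Charter(limits={"issue_credit": 50.0})

def record(action: str, amount: float) -> None:
    ledger.append((action, amount))

print(bounded_act(charter, "issue_credit", 20.0, record))
print(bounded_act(charter, "issue_credit", 500.0, record))
print(bounded_act(charter, "issue_refund", 10.0, record))
```

Unknown actions escalate by default in this sketch, so adding a new tool to the agent does not silently widen its authority.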

Failure modes / misuse

  • Open loops mislabeled as agents. A system that perceives and acts but never learns will repeat the same mistakes confidently. The Learn stage is not optional.
  • Unbounded Act. Letting the Act stage do whatever the model is capable of, rather than what the charter permits, is how an agent freelances past its authority.
  • Missing escalation paths. Without explicit triggers for repeated failure and for intent-drift, an agent either escalates everything or nothing — both of which defeat the purpose.
  • Treating PRAL as proprietary. It is not original to this book, and over-claiming it weakens the genuinely original frameworks built around it (see Origin note).

Relationship to other frameworks

PRAL is the smallest unit of work inside the [[machine-core-human-cortex|Machine Core]]. A [[vp-agent-architecture|VP-Agent]] runs PRAL at the orchestration scale (planning horizons of days and weeks) while specialist agents run it at the task scale (seconds). The Act stage is bounded by [[agent-charters|Agent Charters]]; the Learn stage feeds the Evaluate stage of the [[ipre-pipeline|IPRE Pipeline]]; and the aggregate speed at which these loops close across the organization is measured as [[iteration-half-life|Iteration Half-Life]]. PRAL is the cellular metabolism; the other frameworks are the physiology built on top of it.

Origin note

EXISTING FRAMEWORK — properly cited, not original to this manuscript. The PRAL (Perceive–Reason–Act–Learn) cycle appears in contemporary agentic-AI literature (documented in KPMG's 2025 work on agentic AI, among others) and builds on earlier decision frameworks, most directly John Boyd's OODA loop (Observe–Orient–Decide–Act) from the 1960s–1980s. This book uses PRAL to illustrate the taxonomic shift from tools to agents; it does not claim authorship of it. It should not be presented as an original AI-Born framework.

One of the frameworks running through AI‑Born by Mehran Granfar. Developed across Volume I, "The Machine Core".

Further reading
From the books
  • Book 1, Chapter 5 — "What the OS Looks Like Running: The Integration Loop" (Figure 5.3, the PRAL Loop with retry sub-loop and escalation paths) and "The Architecture of Artificial Workers."
  • Prior art: KPMG, *Agentic AI: From Automation to Agency* (2025); John Boyd's OODA loop.