Autonomous AI agents increasingly act on real resources (capital, data, APIs) on a principal's behalf. An LLM agent is uniquely susceptible to becoming a confused deputy: prompt injection, hallucination, and goal-misgeneralization make it trivial to trick the agent into misusing whatever authority it holds. Prevailing practice, which hands agents raw credentials (API keys, wallets), grants ambient authority with unbounded blast radius. We present the Agent Proxy Pattern: rather than give an agent the wallet, give it a proxy that holds a scoped, capped, time-bounded, revocable mandate; every action is authorized against the mandate, settled per-action, and recorded for verification. We model the mandate as an attenuated capability, state a Bounded-Loss Theorem with a proof sketch, and show that the pattern unifies single sign-on, RBAC, financial mandates, content licensing, and skill/tool authorization as instances of one framework. A distinctive consequence: because loss is provably bounded, autonomous-agent actions become insurable. We give two reference implementations on stablecoin-native chains, MAP and OAP, and a threat-mitigation analysis.
In 1988, Norm Hardy described the confused deputy: a privileged program induced by a less-privileged caller to misuse its authority [1]. The deputy is not malicious — it is merely confused about whose intentions it is serving. Four decades later, the large-language-model (LLM) agent has emerged as the confused deputy's apotheosis. An LLM agent is a program whose control flow is determined by natural-language input it cannot reliably distinguish from instruction. Prompt injection [12], hallucination, and goal-misgeneralization make it not merely possible but trivial to trick an agent into exercising whatever authority it holds against its principal's interest. The agent is a deputy that can be confused by a sentence.
This would be a manageable problem if agents held little authority. They do not. Agents are being handed wallets, API keys, database credentials, and trading accounts so they can act on capital, data, and services on a principal's behalf. The dominant integration pattern is to give the agent the raw credential directly. We call this ambient authority: the agent wields the principal's full power, all the time, for any action its reasoning can reach. The blast radius of a confused agent equals the full extent of the credential — the entire wallet, the entire API surface, the entire account.
We argue that ambient authority is the bug. The defense is not better prompts or stronger guardrails on an inherently manipulable reasoner; it is to never give the reasoner ambient authority in the first place. The principle is old — least privilege [2], object capabilities [3] — but its application to autonomous agents acting on real-world resources has not been synthesized into a single, implementable pattern with provable guarantees.
We present that pattern. Rather than give the agent the wallet, give it a proxy that holds a mandate: a scoped, capped, time-bounded, revocable grant of authority. The agent reasons and proposes actions; the proxy authorizes each action against the mandate, settles per-action, and records the action for independent verification. The agent never holds ambient authority — only the right to request actions within an envelope the principal defined and can collapse at will.
We are explicit about what is and is not novel. We do not invent the primitives. Capabilities [2,3], session keys and account abstraction [9], escrow [11], agent identity [10], and agentic-payment rails such as x402 [13] and AP2 [15] already exist. Our contribution is the pattern: the synthesis of these primitives into a single authority model, the security properties it yields, the insurability consequence, and the generalization showing that a large class of access-control and delegation problems are instances of one framework.
Dennis and Van Horn introduced capabilities as unforgeable tokens that simultaneously designate a resource and authorize access to it [2]. Object-capability security [3] makes authority follow reference: you can only act on what you hold a reference to, and you can pass attenuated references onward. Hardy's confused deputy [1] is precisely the failure mode that capabilities prevent and that ambient authority (identity-based access where a process acts with all its privileges by default) invites. The Agent Proxy is, at its core, a capability discipline applied to an LLM reasoner that cannot be trusted to police its own authority.
Role-Based Access Control [4] binds permissions to roles and roles to principals; Attribute-Based Access Control generalizes this to predicates over attributes. Delegation logics formalize "speaks-for" relationships. OAuth 2.0 [5] introduced scopes — coarse, bearer-token restrictions on delegated access — but OAuth scopes are typically not attenuable by the holder, not per-action metered, and not independently auditable. Macaroons [6] are the closest prior art at the credential layer: bearer tokens with caveats that any holder can further restrict, giving monotonic attenuation. The Agent Proxy adopts the macaroon insight (attenuation), adds enforced caps, time bounds, and one-transaction revocation, and binds the whole to per-action settlement and an on-chain audit trail.
ERC-4337 account abstraction [9] enables smart-contract accounts with programmable validation, including session keys: short-lived keys constrained to a policy. ERC-8004 [10] standardizes on-chain agent identity; ERC-8183 [11] specifies escrow primitives for agentic commerce. x402 [13] revives HTTP 402 for agent micropayments. Google's A2A (Agent-to-Agent) [14] standardizes inter-agent communication and discovery; AP2 (Agent Payments Protocol) [15] standardizes payment authorization via signed "mandates." We position A2A and AP2 as complementary, not competing: A2A is a communication layer and AP2 is a settlement-authorization layer, whereas the Agent Proxy is an authority layer that sits above both. AP2's mandate is, in our model, a special case — a capital-scoped mandate for a single payment topology — that our framework generalizes across resources and topologies.
A parallel authorization problem is emerging for data. RSL (Really Simple Licensing) [16] proposes machine-readable, pay-per-inference licensing terms. Microsoft's Publisher Content Marketplace [17] and Story Protocol's on-chain Programmable IP License [18] bind content access to enforceable terms. We distinguish these sharply from C2PA / Content Credentials [19], which establish provenance — what a thing is and where it came from — but not authorization to use it. In our framework, content-licensing is the Data resource lens; C2PA is explicitly excluded as it answers a different question.
The economics of delegation under misaligned incentives and asymmetric information is the classical principal-agent problem [7]. The Agent Proxy is, in a sense, a mechanism that makes the principal-agent contract enforceable in code rather than merely in law. Finally, we situate the pattern within Business-as-Code [20]: "SOPs are source code, AI agents are the runtime." Its COPE framework's "Perspective" dimension is RBAC; the Agent Proxy supplies the missing authority layer for that runtime — the bounded, revocable grant under which a BaC agent is permitted to act.
We model four entities. A Principal P owns the authority and the resource at risk and wishes to delegate bounded authority. An Agent A is an untrusted reasoner — typically an LLM with tools — that proposes actions. A Resource R is the thing of value being acted upon (capital, data, an API, a skill). Authority is the right to cause state changes on R.
Classical delegation assumes the deputy faithfully executes the principal's intent and can be trusted to refuse out-of-scope requests. An LLM agent breaks this assumption: its behavior is a function of untrusted input it cannot reliably separate from instruction. The agent-authority problem is therefore: grant A enough authority to be useful on R while guaranteeing that no behavior of A — however induced — can cause P a loss exceeding a bound P chose in advance. Note the quantifier: the guarantee must hold over all behaviors of A, including adversarially induced ones, because we cannot enumerate or predict them.
We consider an adversary who can achieve any of the following, matching the threats evaluated in the threat-mitigation table: leak the agent's credential or session key; hijack the agent; inject prompts into the agent's input; drive the agent into a runaway loop; act as a malicious counterparty; or corrupt an oracle or data feed.
The proxy and the chain are trusted to enforce the authorization predicate, settle correctly, honor revocation, and produce a tamper-evident audit trail; this is a small, auditable trusted computing base (TCB). The agent is trusted for nothing regarding authority: we assume it may behave arbitrarily within the action interface the proxy exposes. The mandate's authority bounds must hold even if A and its session key are fully controlled by the adversary. We do not assume the oracle or counterparty is honest; we instead bound the damage they can cause to the mandate cap (§5–6). We do not defend against compromise of the chain or proxy contract itself; those are the explicit trust roots.
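Definition 1 (Mandate). A mandate M is the capability the principal issues to the proxy on the agent's behalf. It carries at least the fields used throughout this paper: an identifier M.id, a scope M.scope (the permitted action types and targets), a cap M.cap (the maximum cumulative value), an expiry M.ttl, and a revoker M.revoker who can collapse it.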
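Definition 2 (Authorization predicate). Every proposed action must satisfy a single predicate the proxy evaluates against the mandate and current state before it is allowed to settle. Written out from the conjuncts that §5 relies on: Auth(action, M, state) ≡ action ∈ M.scope ∧ state.spent + action.value ≤ M.cap ∧ now() < M.ttl ∧ ¬ revoked(M.id).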
An action settles iff Auth holds. The predicate is total, monotone in the "spent" accumulator, and cheap to evaluate on-chain. Critically, it is evaluated by the proxy — part of the trusted base — not by the agent.
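The mandate and its predicate can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the on-chain implementation: the field types, the `Action` and `State` records, the `settle` helper, and the explicit non-negative-value check (added so the accumulator stays monotone) are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mandate:
    """Hard-layer fields named in the text; the concrete types are assumptions."""
    id: str
    scope: frozenset   # permitted (action_type, target) pairs
    cap: int           # maximum cumulative value, in smallest currency units
    ttl: int           # expiry as a Unix timestamp (seconds)
    revoker: str       # identity allowed to revoke the mandate


@dataclass(frozen=True)
class Action:
    kind: str
    target: str
    value: int


@dataclass
class State:
    spent: int = 0


def auth(action: Action, m: Mandate, state: State, now: int, revoked: set) -> bool:
    """Conjunction of scope, cap, ttl, and revocation checks, run by the proxy."""
    return (
        (action.kind, action.target) in m.scope
        and action.value >= 0                       # assumption: keeps spent monotone
        and state.spent + action.value <= m.cap
        and now < m.ttl
        and m.id not in revoked
    )


def settle(action: Action, m: Mandate, state: State, now: int, revoked: set) -> bool:
    """Advance the spent accumulator only when auth holds."""
    if not auth(action, m, state, now, revoked):
        return False
    state.spent += action.value
    return True


m = Mandate("m-1", frozenset({("trade", "venue-x")}), cap=100, ttl=1_000, revoker="principal")
state, revoked = State(), set()
```

Because `settle` is the only code path that mutates `state.spent`, and it runs on the proxy side, nothing the agent submits can move the accumulator past `m.cap`.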
We distinguish two layers that are commonly conflated. The hard authorization layer — scope, cap, ttl, revoker — is enforced and provable: it bounds what the agent can do regardless of its reasoning. The soft objective layer — risk preferences, objectives, an Investment Policy Statement (IPS) in the capital case — is read by the agent's reasoning to shape behavior, not bound it. The soft layer tells the agent what it should want; the hard layer enforces what it can do. A confused agent may ignore the soft layer entirely; it can never exceed the hard layer. Conflating the two is the central error of "guardrail" approaches that try to make the soft layer load-bearing for security.
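Definition 3 (Attenuation). An agent may delegate to a sub-agent by issuing a sub-mandate. Following object-capability discipline [3] and macaroon caveats [6], sub-delegation is monotonically attenuating: a sub-mandate M' derived from M must satisfy M' ⊑ M, meaning its scope, cap, and ttl may each be narrowed but never widened relative to M.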
No sequence of sub-delegations can escalate authority above the root mandate; revoking a parent revokes the whole subtree.
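A minimal sketch of the attenuation check and subtree revocation follows. It assumes the ordering is component-wise (scope by subset, cap and ttl by numeric comparison); the `parent` field, the `registry` lookup, and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mandate:
    id: str
    scope: frozenset
    cap: int
    ttl: int
    parent: Optional[str] = None   # id of the mandate this one was derived from


def attenuates(child: Mandate, parent: Mandate) -> bool:
    """True iff no component of child is wider than the parent's."""
    return (
        child.scope <= parent.scope
        and child.cap <= parent.cap
        and child.ttl <= parent.ttl
    )


def sub_delegate(parent: Mandate, new_id: str, scope, cap: int, ttl: int) -> Mandate:
    """Issue a sub-mandate, refusing any request that would widen authority."""
    child = Mandate(new_id, frozenset(scope), cap, ttl, parent=parent.id)
    if not attenuates(child, parent):
        raise ValueError("sub-mandate would exceed parent authority")
    return child


def is_revoked(m: Mandate, registry: dict, revoked: set) -> bool:
    """A mandate is dead if it or any of its ancestors has been revoked."""
    cur = m
    while cur is not None:
        if cur.id in revoked:
            return True
        cur = registry.get(cur.parent) if cur.parent else None
    return False


root = Mandate("root", frozenset({"trade", "quote"}), cap=100, ttl=1_000)
child = sub_delegate(root, "c1", {"trade"}, cap=40, ttl=500)
grandchild = sub_delegate(child, "g1", {"trade"}, cap=10, ttl=400)
registry = {x.id: x for x in (root, child, grandchild)}
```

The sketch checks only the static relation between one child and its parent. Keeping the aggregate spend of several siblings within the root cap would additionally require charging each settled action to every ancestor's accumulator, which is an assumption about enforcement rather than something the definition spells out.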
The mandate is authority as code — a machine-checked, on-chain artifact whose semantics are the authorization predicate. It is the authorization-layer complement to Business-as-Code's policy-as-code [20]: where BaC encodes what the business does as executable SOPs, the mandate encodes what the agent is permitted to do as an executable predicate.
The principal issues a mandate M; the untrusted agent signs proposed actions with the session key; the proxy evaluates the authorization predicate (identity, mandate check, escrow, settlement, audit) and lets an action reach the resource only if Auth holds. The principal's revoker can collapse the mandate in a single transaction.
Each property below ties to a specific mechanism in §4. We aim for rigor, but full machine-checked proofs are future work; here we give proof sketches.
Theorem 1 (Bounded Loss). Under mandate M and threat model T, the realizable loss attributable to agent A is at most M.cap, and it can be incurred only before M.ttl or revocation, whichever is earlier.
Proof sketch. Any state change on R caused via the proxy must pass Auth (Def. 2). The conjunct state.spent + action.value ≤ M.cap is enforced against a monotone accumulator, so the sum of all settled action values never exceeds M.cap. The conjuncts now() < M.ttl and ¬ revoked(M.id) ensure no action settles after expiry or revocation. Since A holds no authority outside the proxy path (non-custody, below), and every settling action is bounded by the accumulator, total loss attributable to A is ≤ M.cap. This holds for all behaviors of A, including a fully adversary-controlled agent with the leaked session key, because the bound is enforced by the trusted proxy, not by A. □
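The quantifier "for all behaviors of A" can be illustrated, though of course not proved, by a small simulation. The sketch below models a fully compromised agent as an arbitrary stream of requests, including negative and oversized values, and checks that the proxy-side accumulator never passes the cap. The `Proxy` class and `adversarial_run` helper are hypothetical, and scope is omitted to keep the focus on the cap and ttl conjuncts.

```python
import random


class Proxy:
    """Trusted enforcement point: the proxy, not the agent, owns the accumulator."""

    def __init__(self, cap: int, ttl: int):
        self.cap, self.ttl = cap, ttl
        self.spent = 0
        self.revoked = False

    def submit(self, value: int, now: int) -> bool:
        ok = (
            value >= 0                          # assumption: reject negative values
            and self.spent + value <= self.cap
            and now < self.ttl
            and not self.revoked
        )
        if ok:
            self.spent += value
        return ok


def adversarial_run(proxy: Proxy, steps: int, seed: int) -> int:
    """Feed the proxy an arbitrary request stream and return the total settled."""
    rng = random.Random(seed)
    for now in range(steps):
        proxy.submit(rng.randint(-50, 500), now)
    return proxy.spent


# Fifty independent adversarial runs against the same mandate parameters.
losses = [
    adversarial_run(Proxy(cap=1_000, ttl=200), steps=10_000, seed=s)
    for s in range(50)
]
```

Every entry of `losses` stays within the cap regardless of seed, because the check and the update both happen inside `submit`. A simulation only samples behaviors; the proof sketch is what covers all of them.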
The scope conjunct restricts A to a declared set of action types and targets [2]. An agent issued a "trade ETH-perps on venue X" mandate cannot transfer tokens, call arbitrary contracts, or touch another venue — those actions fail Auth regardless of what the agent is convinced to attempt.
The revoker can set revoked(M.id) in a single transaction, after which Auth is false for all future actions. This bounds the exposure window: the time between detecting misbehavior and stopping it is one block, not a credential-rotation scramble across every system the leaked key reached.
The proxy never holds P's funds. Value is pulled at settlement from the principal's account or escrow only at the moment an authorized action settles. There is no pooled balance for an attacker to drain; compromising the proxy yields no standing custody, only the (still-bounded) ability to authorize actions the predicate already permits.
Every settled action is recorded on-chain with its mandate id, value, and timestamp. Any party — the principal, an auditor, an insurer — can independently reconstruct exactly what the agent did and verify that the cap and ttl were respected, without trusting the agent or the proxy operator's word.
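Independent verification amounts to replaying the log. The sketch below assumes each record carries exactly the three fields named above (mandate id, value, timestamp) and that the verifier knows each mandate's cap and ttl; the `Record` type and `verify_log` function are hypothetical. Revocation is not checked here because it would need the revocation timestamp as an extra input.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Record(NamedTuple):
    mandate_id: str
    value: int
    timestamp: int


def verify_log(log: Iterable[Record], caps: dict, ttls: dict) -> list:
    """Replay the log and return a list of violations (empty means compliant)."""
    spent = defaultdict(int)
    violations = []
    for i, rec in enumerate(log):
        if rec.mandate_id not in caps or rec.mandate_id not in ttls:
            violations.append((i, "unknown mandate"))
            continue
        if rec.timestamp >= ttls[rec.mandate_id]:
            violations.append((i, "settled after ttl"))
        spent[rec.mandate_id] += rec.value
        if spent[rec.mandate_id] > caps[rec.mandate_id]:
            violations.append((i, "cap exceeded"))
    return violations


compliant = [Record("m1", 60, 10), Record("m1", 40, 20)]
tampered = compliant + [Record("m1", 1, 30), Record("m1", 0, 1_000)]
```

An auditor or insurer would run the same replay over the on-chain records; an empty result means the cap and ttl were respected for every mandate in the log.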
By Def. 3, any sub-mandate satisfies M' ⊑ M; composition of attenuations is itself an attenuation, so no delegation chain can escalate authority. Revoking a parent revokes the subtree. Sub-delegation is therefore safe by construction, which is what makes agent-swarm architectures tractable.
Theorem 1 yields a consequence that, to our knowledge, has not been drawn out for autonomous agents: agent actions become insurable. Insurability requires a quantifiable, bounded maximum exposure with verifiable claims. Ambient-authority agents fail this on every count — exposure is unbounded (the whole wallet), duration is open-ended, and there is no canonical audit. An underwriter cannot price a risk with no maximum.
The mandate supplies exactly the missing structure. The cap is a known maximum exposure per mandate. The ttl bounds the time at risk. The on-chain audit trail enables claims verification without trusting the insured. Revocation caps tail risk by bounding the exposure window after an anomaly is detected.
A first-cut risk model: for a fleet of mandates indexed by i, expected annual loss is E[L] = Σ_i p_i · min(cap_i, L_i), where p_i is the failure rate of mandate i and L_i is its loss given failure.
Because Theorem 1 guarantees L_i ≤ cap_i, each term min(cap_i, L_i) = L_i is hard-bounded by cap_i: the worst-case per-mandate loss is finite and known at underwriting time, and the portfolio aggregates as a sum of bounded, approximately independent exposures amenable to standard actuarial treatment. The free parameters are the empirical failure rates p_i, which the audit trail lets an insurer measure over time. Estimating these rates from production agent fleets, and validating the independence assumption across correlated agent failures, is left to future empirical work.
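The risk model can be evaluated numerically. The sketch below assumes the per-mandate term is the failure rate times min(cap, loss), summed over the fleet, using the symbols from the paragraph above. The fleet figures are invented for illustration and are not measurements.

```python
def expected_annual_loss(fleet):
    """fleet: iterable of (p_i, cap_i, loss_i), with loss_i the loss given failure."""
    return sum(p * min(cap, loss) for p, cap, loss in fleet)


def worst_case_loss(fleet):
    """Upper bound known at underwriting time: every mandate loses its full cap."""
    return sum(cap for _, cap, _ in fleet)


# Illustrative numbers only.
fleet = [
    (0.02, 1_000, 5_000),    # a counterfactual uncapped loss of 5000 is clipped to the cap
    (0.10, 500, 200),
    (0.01, 10_000, 10_000),
]

expected = expected_annual_loss(fleet)
ceiling = worst_case_loss(fleet)
```

For this fleet the expected loss works out to 140 and the ceiling to 11,500. The ceiling depends only on the caps, which is why it is available before any failure-rate data exists.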
This is a genuine risk-transfer mechanism for autonomous agents: a principal can buy coverage against a confused agent's bounded loss, and an insurer can write the policy because the maximum is provable and the history verifiable. Ambient-authority agents are, by contrast, structurally uninsurable. We regard this as the pattern's most consequential cross-disciplinary result: bounded authority is the precondition for an insurance market for agent fleets.
The Agent Proxy is not specific to capital. It generalizes along two axes: the resource lens (what kind of authority is being bounded) and the topology (who delegates to whom). Many existing access-control and delegation systems are instances of one cell.
What varies across lenses is only the type of scope and cap; the mandate, the predicate, and the security properties are identical. SSO and RBAC are the identity and role instances long studied [4,5]; Capital is OAP; Data/Content-licensing is RSL/Story/PCM [16,17,18]; Skill/Tool bounds which actions an LLM may invoke. C2PA [19] is provenance, not authorization, and is excluded.
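The claim that only the type of scope and cap varies can be made concrete with a lens-parametric check. In the sketch below, which is an illustration under assumptions rather than either reference implementation, the same `authorize` method serves a Capital mandate and a Skill/Tool mandate; only the scope test and the cost function differ. The class and field names are hypothetical, and ttl and revocation are left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

A = TypeVar("A")  # the action type differs per lens


@dataclass
class LensMandate(Generic[A]):
    in_scope: Callable[[A], bool]   # lens-specific scope test
    cost: Callable[[A], int]        # lens-specific units charged against the cap
    cap: int
    spent: int = 0

    def authorize(self, action: A) -> bool:
        """Same check for every lens; only in_scope and cost vary."""
        c = self.cost(action)
        if not self.in_scope(action) or c < 0 or self.spent + c > self.cap:
            return False
        self.spent += c
        return True


# Capital lens: scope is a venue, cap is value at risk.
capital = LensMandate(
    in_scope=lambda a: a["venue"] == "venue-x",
    cost=lambda a: a["value"],
    cap=1_000,
)

# Skill/Tool lens: scope is a tool allow-list, cap is a call budget.
tools = LensMandate(
    in_scope=lambda name: name in {"search", "calculator"},
    cost=lambda name: 1,
    cap=3,
)
```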
The same mandate spans delegation shapes: Agent↔Platform (an agent acting on a venue), Agent↔Agent commerce (one agent hires and pays another — composing with AP2/x402 [13,15]), Agent↔Agent P2P, Human↔Agent (oversight and delegation), and Agent↔Sub-agent (attenuated delegation, Def. 3). The Agent Proxy is protocol-agnostic and sits above A2A (discovery/communication) and AP2/x402 (settlement): it composes with them rather than competing.
| Lens | What scope bounds | What cap bounds | Prior art / instance |
|---|---|---|---|
| SSO | identity assertions, federated audiences | session lifetime | OAuth/OIDC scopes [5] |
| RBAC | roles → permitted operations | privilege ceiling | Sandhu et al. [4]; BaC "Perspective" [20] |
| Capital | instruments, venues, action types | value at risk (USDC) | OAP; AP2 mandate [15] |
| Data / licensing | corpora, usage terms, inference rights | pay-per-inference budget | RSL [16], Story IP [18], PCM [17] |
| Skill / Tool | which tools/functions an LLM may call | call budget / rate | tool-call allow-lists |
A recurring pattern for long-running agentic work is the loop: an agent that runs unattended over hours or days, iterating perceive–decide–act many thousands of times. This is the confused deputy in its most dangerous form, because confusion and goal-drift compound across iterations and no human is in the loop per action. The Agent Proxy is suited to exactly this case, because it bounds a loop in all three dimensions along which it can diverge. The cap bounds cumulative loss across the entire run — a loop may iterate without limit yet never exceed it; the ttl bounds time at risk, forcing re-authorization to continue; and rate limits within scope bound velocity (actions or value per unit time), the circuit-breaker against a runaway iteration. Revocation halts a misbehaving loop instantly mid-run (§5); per-action settlement leaves a continuously auditable trace of the job; and the soft objective layer (§4) carries the run's goals and risk preferences across iterations, countering drift. A loop without a proxy is an unbounded liability; under one it is bounded in exposure, duration, and rate, revocable, and audited — the conditions under which a persistent agent can safely run unattended on real resources.
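The three bounds on a loop can be sketched as a single guard. The example below assumes the rate limit is a sliding window over settled actions (at most `max_actions` per `window` time units); that choice, and the `LoopGuard` name, are hypothetical, since the text only says that rate limits live within scope.

```python
from collections import deque


class LoopGuard:
    """Bounds a long-running loop in exposure (cap), duration (ttl), and rate."""

    def __init__(self, cap: int, ttl: int, max_actions: int, window: int):
        self.cap, self.ttl = cap, ttl
        self.max_actions, self.window = max_actions, window
        self.spent = 0
        self.revoked = False
        self.recent = deque()   # timestamps of settled actions inside the window

    def allow(self, value: int, now: int) -> bool:
        # Drop timestamps that have left the sliding window.
        while self.recent and self.recent[0] <= now - self.window:
            self.recent.popleft()
        ok = (
            not self.revoked
            and now < self.ttl
            and value >= 0
            and self.spent + value <= self.cap
            and len(self.recent) < self.max_actions
        )
        if ok:
            self.spent += value
            self.recent.append(now)
        return ok


# An unattended loop that tries to act on every tick for 200 ticks.
guard = LoopGuard(cap=100, ttl=60, max_actions=2, window=10)
settled = [t for t in range(200) if guard.allow(value=5, now=t)]
```

In this run the rate limit is the binding constraint until the ttl ends the run; with a looser rate limit the cap would bind first. Setting `guard.revoked = True` at any point stops all further settlement.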
One-line intuition: the Agent Proxy is to an agent what Stripe is to a merchant — you never hand over the card; you hand over a bounded ability to charge.
MAP composes ERC-8004 agent identity [10], an EIP-712 scoped mandate realized as an ERC-4337 session key [9], ERC-8183 escrow [11], and per-action USDC settlement, with a one-transaction kill switch as the revoker. It is deployed on Arc, a USDC-native chain where gas is denominated in USDC. MAP targets the general case: a principal issues a mandate; the agent signs actions with the session key; the proxy checks identity and predicate, settles through escrow, and logs the action; and the principal can collapse the mandate in one transaction. Status: testnet.
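To make "EIP-712 scoped mandate" concrete, the sketch below lays out what the typed data for such a mandate might look like. The outer structure (`types`, `primaryType`, `domain`, `message`, and the `EIP712Domain` fields) follows EIP-712; every field of the `Mandate` type, the domain name, and all values are hypothetical and are not MAP's published schema. The chain id is the Arc testnet id quoted in the evaluation.

```python
# All Mandate field names and values are hypothetical illustrations.
MANDATE_TYPED_DATA = {
    "types": {
        "EIP712Domain": [
            {"name": "name", "type": "string"},
            {"name": "version", "type": "string"},
            {"name": "chainId", "type": "uint256"},
            {"name": "verifyingContract", "type": "address"},
        ],
        "Mandate": [
            {"name": "id", "type": "bytes32"},
            {"name": "sessionKey", "type": "address"},
            {"name": "scopeHash", "type": "bytes32"},
            {"name": "cap", "type": "uint256"},
            {"name": "ttl", "type": "uint64"},
            {"name": "revoker", "type": "address"},
        ],
    },
    "primaryType": "Mandate",
    "domain": {
        "name": "AgentProxy",                     # placeholder name
        "version": "1",
        "chainId": 5042002,                       # Arc testnet
        "verifyingContract": "0x" + "00" * 20,    # placeholder address
    },
    "message": {
        "id": "0x" + "11" * 32,
        "sessionKey": "0x" + "22" * 20,
        "scopeHash": "0x" + "33" * 32,            # hash of the scope description (assumed)
        "cap": 1_000_000,                         # in the token's smallest unit (assumed)
        "ttl": 1_781_049_600,                     # arbitrary Unix expiry
        "revoker": "0x" + "44" * 20,
    },
}


def fields_match(typed_data: dict) -> bool:
    """Check that domain and message carry exactly the fields their types declare."""
    types = typed_data["types"]
    declared_msg = {f["name"] for f in types[typed_data["primaryType"]]}
    declared_dom = {f["name"] for f in types["EIP712Domain"]}
    return (
        set(typed_data["message"]) == declared_msg
        and set(typed_data["domain"]) == declared_dom
    )
```

A principal's wallet would sign a structure of this shape once; the proxy contract would then recover the signer and evaluate the predicate against the signed fields.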
OAP is the capital instance for event-trading. An agent trades under a capped, scoped, revocable mandate on a venue using single-shot merged-EIP-712 signing: there is no pre-approval and no deposit. The proxy pulls funds from the principal at settlement, so they are never custodied by the proxy or the agent (the non-custody property of §5). The mandate's soft layer is an Investment Policy Statement; the hard layer is the cap/scope/ttl/revoker the principal sets. Status: testnet.
We evaluate qualitatively against the threat model and prior approaches; empirical measurements on Arc testnet are pending and are marked as such.
| Threat | Neutralizing property |
|---|---|
| Leaked credential | Bounded loss + revocability (loss ≤ cap; kill in 1 tx) |
| Hijacked agent | Least privilege + bounded loss (scope + cap hold for any behavior) |
| Prompt injection | Hard layer is non-bypassable; soft layer not load-bearing |
| Runaway loop | Cap (cumulative ceiling) + ttl (auto-expiry) |
| Malicious counterparty | Non-custody + scope; per-action settlement limits exposure to cap |
| Faulty oracle / feed | Bounded loss (cap) — does not prevent bad inputs, bounds their damage |
| Property | Raw key | OAuth | Macaroon | AP2 | Agent Proxy |
|---|---|---|---|---|---|
| Bounded loss (cap) | – | – | – | ● | ● |
| Revocable (fast) | – | ~ | ~ | ● | ● |
| Attenuable | – | – | ● | ~ | ● |
| Per-action settlement | – | – | – | ● | ● |
| Verifiable audit | – | ~ | ~ | ● | ● |
| Insurable | – | – | – | ~ | ● |
● = yes; ~ = partial/protocol-dependent; – = no. AP2 satisfies several properties for the single payment topology; the Agent Proxy generalizes them across resources and topologies and adds the insurability result.
| Metric | Value |
|---|---|
| Block interval (observed) | ≈ 1.4 s |
| Finality (BFT, Malachite consensus) | sub-second (documented) |
| Gas denomination | native USDC |
| Revocation | single transaction (≈ 1 block) |
The figures above reflect Arc testnet network characteristics observed on June 10, 2026 (RPC rpc.testnet.arc.network, chain 5042002); a full MAP/OAP micro-benchmark (per-call USDC cost and end-to-end latency under load) is forthcoming. The qualitative guarantees (Tables 2–3) follow from §5 and do not depend on measurement.
The pattern bounds authority, not judgment. It cannot prove an agent makes good decisions — only that bad ones stay within constraints and are recorded. Garbage-in yields a valid proof-of-garbage-out: a mandate-compliant action can still be unwise. Several trust roots remain: the oracle/data feed is trusted for correctness (we bound its damage, not its honesty); if a TEE is used for the reasoner, side-channels are out of scope. The mandate language is currently limited in expressiveness — rich conditional scopes need a verified semantics. Aligning the soft objective layer is the open AI-alignment problem and we make no claim to solve it. Finally, on-chain mandate privacy is a real tension: verifiability and confidentiality pull against each other, and selective-disclosure mechanisms are future work.
The confused deputy is not a bug to be patched out of LLM agents; it is intrinsic to a reasoner whose control flow is driven by untrusted language. The remedy is to stop giving such reasoners ambient authority. The Agent Proxy Pattern gives the agent a mandate, not a wallet: a scoped, capped, time-bounded, revocable capability whose every exercise is authorized, settled, and recorded. From this follow a Bounded-Loss Theorem, least privilege, fast revocation, non-custody, verifiability, and monotonic attenuation — and, as a cross-disciplinary consequence, insurability. The pattern unifies SSO, RBAC, capital, content-licensing, and skill authorization as instances of one model, and composes with A2A and AP2/x402 rather than competing. Future work: insurance markets for agent fleets; content-licensing-as-capability (RSL/Story over Arc micropayments); attenuated agent-swarm delegation at scale; and a formal mandate language with verified semantics and machine-checked proofs of the properties sketched here.