Threat Model: Runaway Agent Spend

A runaway agent is an agent that keeps spending — on tokens, on tool calls, on downstream API requests — well past any reasonable limit, because nothing stopped it. The failure mode is not exotic: it emerges from ordinary agentic behavior (retry loops, broad task scope, tool chaining) colliding with absent or unenforced budget controls. The blast radius ranges from an unexpected invoice line to a full suspension of service while your finance team investigates.

The good news is that runaway spend is one of the most containable agent risks. The controls are well understood, the enforcement points are well defined, and you can layer them without adding much latency. This post walks through the threat from failure mode to control class, covering both the accidental and the adversarial variants. For a broader view of cost governance, see FinOps for AI Agents and Budgets vs Rate Limits.

The Core Failure Mode: Loop and Burn

Most runaway spend incidents are not caused by an adversary. They are caused by a feedback loop: an agent retries a failing subtask, each retry consumes tokens, the tool the agent is calling returns an error or an ambiguous result, and the agent interprets that as a signal to try again. A single long-running workflow can accumulate hundreds of LLM calls in minutes.

The five conditions that typically combine to produce loop-and-burn:

No per-agent or per-workflow spend cap. The agent has permission to call the LLM and tools without a ceiling. Every call succeeds from a permissions perspective, so nothing interrupts the loop.
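Broad retry logic without backoff or attempt limits. Retry-on-error is useful, but without exponential backoff and a maximum attempt count, every failure triggers another paid call and spend grows without bound. When retries are nested across layers (agent, tool client, HTTP library), the attempt counts multiply.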
Ambiguous task termination. The agent has no reliable signal for "done." It keeps generating because it cannot confidently stop.
No real-time spend visibility. The engineering team discovers the overrun on the next billing cycle, not during the run.
Shared budget, no attribution. Spend is pooled across agents or teams, so an individual agent's overrun is invisible until the aggregate threshold is breached — too late to act cheaply.
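
The retry condition is the easiest of the five to close in code. Here is a minimal sketch of a bounded retry wrapper, assuming a hypothetical `operation` callable that stands in for an LLM or tool call. The names, the default attempt count, and the delay values are illustrative, not taken from any particular framework.

```python
import time
from typing import Callable, Optional


class RetryBudgetExceeded(Exception):
    """Raised when the attempt limit is reached without a successful call."""


def call_with_bounded_retries(
    operation: Callable[[], str],
    max_attempts: int = 4,
    base_delay_s: float = 0.5,
    max_delay_s: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Run `operation`, retrying on failure with capped exponential backoff."""
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as exc:  # a real agent would catch narrower error types
            last_error = exc
            if attempt < max_attempts - 1:
                # Delay doubles on each failure and is capped at max_delay_s.
                sleep(min(base_delay_s * (2 ** attempt), max_delay_s))
    raise RetryBudgetExceeded(f"gave up after {max_attempts} attempts") from last_error
```

The `sleep` parameter is injectable so the wrapper can be exercised in tests without real waiting. The point is the hard stop: after `max_attempts` the loop ends with an error instead of another paid call.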

An adversarial variant exists: a prompt injection or malicious instruction that tells the agent to call expensive tools repeatedly. The failure mode is the same; the origin differs.

Blast Radius

Runaway spend causes harm in at least three directions.

Financial. The immediate harm is cost. LLM inference and tool calls are priced by consumption. A single uncontrolled workflow can generate a spend spike orders of magnitude above its expected cost. Multiply by the number of concurrent agents, and a morning's work can produce a five-figure cloud invoice.

Service continuity. If your AI provider imposes account-level rate limits or spend caps, a runaway agent can consume quota that other agents and users depend on, causing latency or outright failures across your entire deployment — not just in the offending workflow.

Audit and attribution. Without per-agent spend tracking, you cannot answer "what caused this?" during an incident or produce credible cost allocation for finance. That gap slows remediation and complicates chargebacks in multi-team or multi-tenant deployments.

Control Classes

Effective containment layers multiple controls at different points in the agent lifecycle. No single control is sufficient on its own.

Pre-Dispatch Reservation

Before a task is dispatched to an agent, estimate its expected cost and reserve that amount against the agent's budget. If the reservation would push the agent over its cap, block dispatch immediately rather than letting the task start and overrun.

This is the most important control class. It stops the problem before any tokens are consumed. The tradeoff is that cost estimation is inherently approximate — you are predicting LLM usage before the model has run. A good reservation strategy over-estimates conservatively, so the budget acts as a firm ceiling rather than a soft guideline.
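
A reservation check can be sketched in a few lines. The version below is an illustration under stated assumptions: amounts are integer cents, the estimate assumes the model uses its full output allowance, and the `safety_factor` padding and all names (`AgentBudget`, `estimate_cost_cents`, `try_reserve`) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class AgentBudget:
    cap_cents: int            # hard ceiling for the period
    spent_cents: int = 0      # reconciled actual spend
    reserved_cents: int = 0   # held for tasks that are still in flight

    def headroom_cents(self) -> int:
        return self.cap_cents - self.spent_cents - self.reserved_cents


def estimate_cost_cents(prompt_tokens: int, max_output_tokens: int,
                        cents_per_1k_tokens: int,
                        safety_factor: float = 1.5) -> int:
    """Deliberately high estimate: full output allowance, padded, rounded up."""
    raw = (prompt_tokens + max_output_tokens) * cents_per_1k_tokens / 1000
    return math.ceil(raw * safety_factor)


def try_reserve(budget: AgentBudget, estimate_cents: int) -> bool:
    """Reserve the estimate if it fits; otherwise leave the budget untouched."""
    if estimate_cents > budget.headroom_cents():
        return False
    budget.reserved_cents += estimate_cents
    return True
```

Because reservations count against headroom alongside actual spend, several in-flight tasks cannot collectively overshoot the cap even before any of them has reported real usage.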

Per-Scope Spend Caps

Caps should be configurable at multiple scopes independently: per agent, per workflow, per team, and per organization. Each scope serves a different purpose:

Scope	What it limits
Agent	A single agent's total consumption regardless of who triggered it
Workflow	A specific automation's cost per run or per period
Team	Aggregate spend for a group, enabling chargebacks
Organization	An absolute ceiling that protects the account

Caps at the organization scope are the last line of defense. Caps at the agent or workflow scope give you surgical control. You want both.
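
Checking several scopes at once can be as simple as looping over every scope a task belongs to. The sketch below assumes caps and running totals are held in plain dictionaries keyed by a hypothetical `(kind, identifier)` pair. A real system would read these from a policy store.

```python
from typing import Dict, List, Tuple

ScopeKey = Tuple[str, str]  # (scope kind, identifier), e.g. ("agent", "researcher")


def breached_scopes(caps_cents: Dict[ScopeKey, int],
                    spent_cents: Dict[ScopeKey, int],
                    task_scopes: List[ScopeKey],
                    estimate_cents: int) -> List[ScopeKey]:
    """Return every scope whose cap the task's estimate would exceed."""
    breached = []
    for key in task_scopes:
        cap = caps_cents.get(key)
        if cap is None:
            continue  # no cap configured at this scope
        if spent_cents.get(key, 0) + estimate_cents > cap:
            breached.append(key)
    return breached
```

Dispatch proceeds only when the returned list is empty. Returning the breached scopes, rather than a bare boolean, tells the operator whether the agent, the workflow, the team, or the organization cap was the one that stopped the task.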

Threshold-Based Actions, Not Just Alerts

A cap that only sends an alert when breached is not a cap — it is a notification. Effective budget controls define what happens when a threshold is crossed, not just who gets notified. The useful action classes are:

Alert: Notify operators at a pre-warning level (e.g., 80% of budget used) while the agent continues running. Gives time to investigate before reaching the hard limit.
Throttle: Reduce the agent's request rate when approaching the cap. Slows the burn rate without halting work.
Pause: Suspend new task dispatch from a workflow when its budget is exhausted, but allow in-flight tasks to complete cleanly.
Block: Reject dispatch entirely. Used for hard ceilings where continued spend is not acceptable regardless of task state.

The sequence matters. Triggering an alert at 80%, a throttle at 90%, and a block at 100% gives you three intervention windows with progressively stronger enforcement, rather than a binary allow/deny.
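
The 80/90/100 ladder maps naturally onto a small lookup. This is a sketch, not a prescribed design: the `Action` enum and `THRESHOLDS` table are hypothetical names, and the pause action is left out to keep the example short.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    ALERT = "alert"
    THROTTLE = "throttle"
    BLOCK = "block"


# (percent of budget used, action), ordered from strongest to weakest.
THRESHOLDS = [(100, Action.BLOCK), (90, Action.THROTTLE), (80, Action.ALERT)]


def action_for(spent_cents: int, cap_cents: int) -> Action:
    """Return the strongest action whose threshold has been reached."""
    if cap_cents <= 0:
        return Action.BLOCK  # a zero or negative cap allows no spend at all
    for percent, action in THRESHOLDS:
        # Integer arithmetic avoids floating-point edge cases at the boundary.
        if spent_cents * 100 >= percent * cap_cents:
            return action
    return Action.ALLOW
```

For a cap of 100 cents, `action_for(85, 100)` returns `Action.ALERT` and `action_for(100, 100)` returns `Action.BLOCK`.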

Period Resets and Spend Attribution

Budgets must be time-bounded. A monthly cap is only useful if spend resets at the start of each period and accumulates correctly throughout. Equally important is that every unit of spend — each token, each tool call, each API request — is attributed to a specific agent, workflow, and triggering entity at the moment it is incurred. Retroactive attribution from aggregate logs is unreliable and too slow for incident response.

Real-time spend counters, updated as each model call and tool call completes rather than only when the whole task finishes, give the enforcement layer the data it needs to make accurate block/allow decisions.
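
One way to get period resets and attribution from the same structure is to make the period part of the counter key. The sketch below assumes calendar-month periods in UTC and an in-memory store. `SpendLedger` and its method names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Tuple


def period_key(ts: datetime) -> str:
    """Calendar-month period in UTC, e.g. '2024-02'. Expects an aware datetime."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m")


class SpendLedger:
    def __init__(self) -> None:
        self._totals: Dict[Tuple[str, str, str], int] = defaultdict(int)

    def record(self, ts: datetime, agent: str, workflow: str,
               triggered_by: str, cents: int) -> None:
        """Attribute one spend event to every dimension at the moment it occurs."""
        period = period_key(ts)
        for kind, name in (("agent", agent), ("workflow", workflow),
                           ("trigger", triggered_by)):
            self._totals[(period, kind, name)] += cents

    def total(self, ts: datetime, kind: str, name: str) -> int:
        """Spend for one dimension within the period that contains `ts`."""
        return self._totals.get((period_key(ts), kind, name), 0)
```

Because the period is part of the key, a new month starts from zero without a reset job, and earlier periods remain available for chargeback reports.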

Human-in-the-Loop Escalation

For workflows above a configurable cost threshold, require explicit human approval before dispatch. This is not practical for every agent invocation, but it is appropriate for tasks with broad scope, long expected duration, or access to expensive external APIs. Pairing spend-based approval gates with workflow orchestration keeps high-cost operations under deliberate human control. For how approval gates work in practice, see Human-in-the-Loop Approvals for High-Risk Agent Actions.
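
A spend-based approval gate reduces to a comparison against the threshold plus a lookup of recorded approvals. The sketch below is illustrative only: the function name, the return strings, and the set of approved task IDs are hypothetical stand-ins for whatever approval system you use.

```python
from typing import Set


def dispatch_decision(task_id: str, estimate_cents: int,
                      approval_threshold_cents: int,
                      approved_task_ids: Set[str]) -> str:
    """Decide whether a task may run now or must wait for a human."""
    if estimate_cents < approval_threshold_cents:
        return "dispatch"            # cheap enough to run unattended
    if task_id in approved_task_ids:
        return "dispatch"            # a human has already signed off
    return "hold_for_approval"       # park the task until someone approves it
```

The gate runs on the estimate, before any tokens are consumed, so a held task costs nothing while it waits.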

What Attackers Exploit

In the adversarial variant of this threat, the attacker's goal is to cause financial harm or service disruption by inflating the target's AI consumption. Common vectors:

Prompt injection through external content. An agent that reads web pages, documents, or user-supplied text can be instructed by that content to perform additional, expensive operations in a loop. See Threat Model: Indirect Prompt Injection for a detailed breakdown of that vector.
Tool amplification. A tool that itself triggers downstream agent calls (common in multi-agent workflows) can be weaponized to create a fan-out where one malicious instruction spawns dozens of expensive subtasks.
Credential abuse. A stolen agent credential used to submit high-volume tasks against an organization's allocation. Without per-agent attribution and anomaly detection, this is difficult to detect before significant spend accumulates.

The mitigations are the same control classes described above, applied with attacker behavior in mind: pre-dispatch caps block fan-out before it starts, attribution surfaces anomalous patterns, and short-lived scoped credentials limit the window of abuse for stolen keys.

Applying Controls Across the Agent Lifecycle

A practical enforcement sequence looks like this:

At policy creation time: Define caps at each scope (agent, workflow, team, org) with explicit threshold actions and notification targets.
At task submission: Estimate cost, verify budget headroom against all applicable policies, and reject the task if the reservation would push any cap past its hard limit.
During execution: Track spend in real time as the model generates tokens and tools execute, so enforcement reflects actual consumption rather than estimates.
At task completion: Reconcile actual spend against the reservation and update period counters.
On threshold crossing: Fire the configured action (alert, throttle, pause, block) without waiting for the next billing cycle.
On budget raise or period reset: Resume paused workflows automatically so legitimate work is not blocked longer than necessary.

This sequence keeps enforcement close to the actual spend event rather than deferring it to a reconciliation job.
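
The submission, execution, and completion steps can be sketched as one function. This is a simplified illustration: `Budget` and `run_task` are hypothetical names, and `step_costs_cents` stands in for metered usage. In practice you rarely know a step's exact cost before it runs, so the pre-step check would use a per-call upper bound.

```python
from typing import Iterable


class Budget:
    def __init__(self, cap_cents: int) -> None:
        self.cap_cents = cap_cents
        self.spent_cents = 0
        self.reserved_cents = 0


def run_task(budget: Budget, estimate_cents: int,
             step_costs_cents: Iterable[int]) -> str:
    # Submission: refuse the task if the reservation does not fit under the cap.
    committed = budget.spent_cents + budget.reserved_cents
    if committed + estimate_cents > budget.cap_cents:
        return "rejected"
    budget.reserved_cents += estimate_cents

    # Execution: meter each step and stop before one that would
    # push actual spend past the reservation.
    actual_cents = 0
    status = "completed"
    for cost in step_costs_cents:
        if actual_cents + cost > estimate_cents:
            status = "halted"
            break
        actual_cents += cost

    # Completion: release the reservation and record what was really spent.
    budget.reserved_cents -= estimate_cents
    budget.spent_cents += actual_cents
    return status
```

Reconciling on every exit path, including the halted one, keeps the reserved total from drifting upward and starving later tasks of headroom.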

Common questions

What is the difference between a rate limit and a spend cap? Rate limits control how many requests an agent can make per unit of time — they bound velocity. Spend caps control the total cost an agent can incur over a period — they bound magnitude. Both are useful, and they are complementary rather than redundant. A rate limit prevents a single burst; a spend cap prevents sustained accumulation. See the related post on budgets vs rate limits for a deeper comparison.
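
A quick calculation shows why a rate limit alone does not bound cost. The figures below are made-up inputs, not real pricing, and the function names are hypothetical.

```python
def worst_case_spend_cents(requests_per_minute: int, cents_per_request: int,
                           minutes: int) -> int:
    """Spend of an agent that stays exactly at its rate limit the whole time."""
    return requests_per_minute * cents_per_request * minutes


def minutes_until_cap(requests_per_minute: int, cents_per_request: int,
                      cap_cents: int) -> int:
    """Whole minutes a rate-limited agent can run before reaching a spend cap."""
    per_minute_cents = requests_per_minute * cents_per_request
    if per_minute_cents <= 0:
        raise ValueError("rate and per-request cost must be positive")
    return cap_cents // per_minute_cents
```

With these illustrative numbers, an agent held to 10 requests per minute at 5 cents each never exceeds its rate limit, yet `worst_case_spend_cents(10, 5, 480)` comes to 24,000 cents ($240) over an eight-hour run. A $50 cap would stop the same agent after `minutes_until_cap(10, 5, 5000)`, which is 100 minutes.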

Should I set caps at the agent level or the workflow level? Both, for different reasons. Workflow-level caps let you bound the cost of a specific automation and are easy to reason about (this pipeline should never cost more than X per run). Agent-level caps protect against the case where the same agent is used across many workflows and accumulates spend in aggregate. In a multi-tenant deployment, organization-level caps are also essential to ensure one tenant's runaway workflow cannot affect others.

How do I set an initial budget without historical data? Start with a broad cap well above your expected spend — generous enough to not interrupt legitimate work — and observe actual consumption for two to four weeks. Use that data to set tighter, more accurate caps. Alert-only thresholds at 50% and 80% during the calibration period give you signal without blocking operations prematurely.
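
Once the calibration period has produced data, one possible way to turn it into a tighter cap is to take a high percentile of observed per-run cost and add headroom. This is an assumption about method, not a recommendation from any standard. The nearest-rank percentile, the 95th-percentile default, and the 150% headroom are all illustrative choices.

```python
from typing import List


def suggest_cap_cents(observed_cents: List[int], percentile: int = 95,
                      headroom_percent: int = 150) -> int:
    """Nearest-rank percentile of observed costs, scaled up by a headroom factor."""
    if not observed_cents:
        raise ValueError("need at least one observation")
    if not 1 <= percentile <= 100:
        raise ValueError("percentile must be between 1 and 100")
    ordered = sorted(observed_cents)
    # Nearest-rank method: rank = ceil(percentile / 100 * n), using integer math.
    rank = (percentile * len(ordered) + 99) // 100
    baseline = ordered[rank - 1]
    # Round the scaled cap up to a whole cent.
    return (baseline * headroom_percent + 99) // 100
```

Choosing a percentile below 100 keeps a single outlier run from setting the cap. For example, `suggest_cap_cents([100, 120, 90, 110, 400], percentile=80)` ignores the 400-cent outlier and returns 180.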

How does runaway spend interact with multi-agent workflows? Multi-agent systems amplify the risk. When an orchestrator agent spawns subagents, each subagent's spend can be attributed to the originating workflow or separately to each agent — both views are useful. Tool amplification is the specific danger: a single malicious or looping instruction at the orchestrator level can fan out into dozens of concurrent subagent tasks, each consuming tokens and tool credits. Per-scope caps at both the orchestrator and subagent level contain this fan-out. See orchestration patterns for multi-agent systems for how to structure these topologies defensively.
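
Fan-out can be bounded before any subagent is spawned by clamping the requested number of subtasks to what the workflow budget can afford. The sketch below assumes a uniform per-subtask estimate and a hypothetical `max_fanout` policy setting.

```python
def plan_fanout(workflow_headroom_cents: int, per_subtask_estimate_cents: int,
                requested_subtasks: int, max_fanout: int) -> int:
    """Number of subtasks the orchestrator may spawn right now."""
    if per_subtask_estimate_cents <= 0:
        raise ValueError("per-subtask estimate must be positive")
    # How many subtasks the remaining workflow budget can pay for.
    affordable = workflow_headroom_cents // per_subtask_estimate_cents
    # Never spawn more than requested, allowed by policy, or affordable.
    return max(0, min(requested_subtasks, max_fanout, affordable))
```

If an injected instruction asks the orchestrator for 50 subtasks, `plan_fanout(500, 40, 50, 8)` allows 8. With only 100 cents of headroom left, the same request is cut to 2.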

Spend control is most effective when it operates as a first-class concern in the agent lifecycle — evaluated before dispatch, tracked in real time, and enforced at multiple scopes — rather than as a billing afterthought. For a practical implementation guide, how to set budgets for AI agents walks through scoping and threshold configuration step by step. For how credits and cost monitoring work at the infrastructure level, see credits and cost monitoring for agent spend. For how per-connection rate limits complement spend caps as a second enforcement layer, see rate limiting and abuse prevention for AI APIs. For how budget alerts flow to operators in real time, see Slack and multi-channel alerting.