Budgets
A budget is a ceiling that AgentKavach enforces in memory before every call. When the ceiling is reached, the agent is terminated and subsequent calls are rejected. AgentKavach enforces three budget dimensions:
- Cost. A spending limit in US dollars, available over three periods: daily (resets at midnight UTC), monthly (resets on the first of the month), and total (never resets). Cost can also be pooled across every agent in an organization with Budget.org_budget(...).
- Token count. A limit on the total tokens an agent may consume, set with max_tokens_per_run.
- Duration. A limit on how long an agent may run, set with max_runtime_seconds.
Each dimension is enforced independently. The first ceiling an agent reaches terminates it. The sections below document each dimension in turn.
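To make "the first ceiling an agent reaches terminates it" concrete, here is a minimal sketch of an independent, per-dimension check. The `Limits` and `Usage` classes and the `first_breached` helper are hypothetical names invented for illustration; they are not part of the AgentKavach SDK, and the real enforcement logic may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Limits:
    """Hypothetical container for the three ceilings (None disables one)."""
    cost_usd: Optional[float] = None
    tokens: Optional[int] = None
    runtime_seconds: Optional[float] = None


@dataclass
class Usage:
    """Hypothetical running totals for one agent."""
    cost_usd: float = 0.0
    tokens: int = 0
    runtime_seconds: float = 0.0


def first_breached(limits: Limits, usage: Usage) -> Optional[str]:
    """Return the first dimension whose ceiling is reached, or None."""
    checks = [
        ("cost", usage.cost_usd, limits.cost_usd),
        ("tokens_total", usage.tokens, limits.tokens),
        ("duration", usage.runtime_seconds, limits.runtime_seconds),
    ]
    for name, used, ceiling in checks:
        # Each dimension is compared only against its own ceiling.
        if ceiling is not None and used >= ceiling:
            return name
    return None


limits = Limits(cost_usd=50.0, tokens=100_000, runtime_seconds=120.0)
print(first_breached(limits, Usage(cost_usd=3.2, tokens=40_000, runtime_seconds=15.0)))   # None
print(first_breached(limits, Usage(cost_usd=3.2, tokens=100_000, runtime_seconds=15.0)))  # tokens_total
```

In the second call the agent has spent only $3.20 of its $50, yet it would still be stopped because the token ceiling was reached first.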
Dimensions
Threshold alerts evaluate each dimension independently.
| Dimension | Unit | Configured via |
|---|---|---|
| cost | USD | budget=Budget.daily / .monthly / .total |
| tokens_total | tokens | max_tokens_per_run=... |
| duration | milliseconds | max_runtime_seconds=... |
ℹ️ Per-dimension alerts
Bind a channel to a dimension by setting budget_type on the ChannelConfig. The channel fires only when usage on that dimension crosses the threshold.
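The sketch below illustrates the idea of a channel that fires only for its own dimension. `AlertRule` is a hypothetical stand-in, not the SDK's ChannelConfig, and expressing the threshold as a fraction of the limit is an assumption made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    """Hypothetical stand-in for a channel bound to one dimension."""
    name: str
    budget_type: str   # "cost", "tokens_total", or "duration"
    threshold: float   # assumed to be a fraction of the limit, e.g. 0.8


def rules_to_fire(rules, utilization):
    """Return the names of rules whose own dimension reached its threshold."""
    fired = []
    for rule in rules:
        # Only the rule's own dimension is consulted; others are ignored.
        used = utilization.get(rule.budget_type, 0.0)
        if used >= rule.threshold:
            fired.append(rule.name)
    return fired


rules = [
    AlertRule("finance-slack", budget_type="cost", threshold=0.8),
    AlertRule("oncall-pager", budget_type="duration", threshold=0.9),
]
# Cost is at 85% of its limit, tokens at 95%, duration at only 10%.
print(rules_to_fire(rules, {"cost": 0.85, "tokens_total": 0.95, "duration": 0.10}))
# ['finance-slack']
```

Token usage is at 95% here, but no rule is bound to tokens_total, so nothing fires for it.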
Budget.daily(limit)
from agentkavach import AgentKavach, Budget

guard = AgentKavach(
    provider="openai",
    api_key="ak_prod_...",
    llm_key="sk-...",
    budget=Budget.daily(50),  # $50/day
)
response = guard.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(f"Spent: ${guard.spent:.4f}")
print(f"Remaining: ${guard.remaining:.4f}")
print(f"Utilization: {guard.engine.utilization:.1%}")

A daily budget resets at midnight UTC. Use it to cap per-day spend.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| limit | float | Yes | — | USD limit per day. Must be > 0. |
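Because the reset happens at midnight UTC rather than local midnight, it can help to compute when the next reset falls. The helper below is a minimal sketch using only the standard library; `next_daily_reset` is a hypothetical name and not an SDK function. It expects a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def next_daily_reset(now: datetime) -> datetime:
    """Return the next midnight UTC strictly after `now` (timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    midnight_today = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight_today + timedelta(days=1)


now = datetime(2024, 3, 15, 22, 30, tzinfo=timezone.utc)
print(next_daily_reset(now))  # 2024-03-16 00:00:00+00:00

# 20:00 at UTC-5 is already 01:00 on the 16th in UTC,
# so the next reset is midnight UTC on the 17th.
evening = datetime(2024, 3, 15, 20, 0, tzinfo=timezone(timedelta(hours=-5)))
print(next_daily_reset(evening))  # 2024-03-17 00:00:00+00:00
```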
Budget.monthly(limit)
guard = AgentKavach(
    provider="anthropic",
    api_key="ak_prod_...",
    llm_key="sk-ant-...",
    budget=Budget.monthly(500),  # $500/month
)

A monthly budget resets on the 1st at midnight UTC. Match it to your cloud billing cycle.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| limit | float | Yes | — | USD limit per month. Must be > 0. |
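The monthly boundary has one edge case worth seeing in code: December rolls over into January of the next year. The helper below is a standard-library sketch; `next_monthly_reset` is a hypothetical name, not an SDK function, and it expects a timezone-aware datetime.

```python
from datetime import datetime, timezone


def next_monthly_reset(now: datetime) -> datetime:
    """Return midnight UTC on the 1st of the month after `now`."""
    now_utc = now.astimezone(timezone.utc)
    year, month = now_utc.year, now_utc.month + 1
    if month == 13:
        # December rolls over into January of the following year.
        year, month = year + 1, 1
    return datetime(year, month, 1, tzinfo=timezone.utc)


print(next_monthly_reset(datetime(2024, 2, 10, 9, 0, tzinfo=timezone.utc)))
# 2024-03-01 00:00:00+00:00
print(next_monthly_reset(datetime(2024, 12, 31, 23, 59, tzinfo=timezone.utc)))
# 2025-01-01 00:00:00+00:00
```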
Budget.total(limit)
guard = AgentKavach(
    provider="google",
    api_key="ak_prod_...",
    llm_key="AIza...",
    budget=Budget.total(1000),  # $1,000 lifetime cap
)

A total budget never resets. Once the limit is reached, the agent stays blocked until you raise the limit.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| limit | float | Yes | — | USD lifetime limit. Must be > 0. Never resets. |
Budget.org_budget(limit, period)
from agentkavach import AgentKavach, Budget

# Per-agent cap: $10/day
# Org-wide cap: $50/day across every agent
org_pool = Budget.org_budget(limit=50, period="daily")
guard = AgentKavach(
    provider="openai",
    api_key="ak_prod_...",
    llm_key="sk-...",
    agent_name="research-bot",
    budget=Budget.daily(10),  # per-agent
    org_budget=org_pool,      # shared
)
response = guard.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(f"Agent spent: ${guard.spent:.4f}")
print(f"Agent remaining: ${guard.remaining:.4f}")

An org budget applies across every agent in your org. Each call counts toward the shared pool. When the org limit is reached, every agent stops.
Org budgets coexist with per-agent budgets. The agent has its own limit and shares the pool. The most restrictive limit wins.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| limit | float | Yes | — | USD limit for the entire organization. Must be > 0. |
| period | str \| Period | No | "daily" | Budget period: "daily", "monthly", or "total". Defaults to "daily". |
ℹ️ Most restrictive wins
When both budgets are set, the agent stops as soon as either limit is reached. An agent with a $10/day budget stops at $10 even if the org pool has room. If the org pool hits $50, every agent stops even if their individual budgets have room.
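The rule reduces to taking the smaller of the two remaining amounts. The function below is a worked sketch of that arithmetic using the $10/day and $50/day figures from this section; `remaining_headroom` is a hypothetical helper, not an SDK call.

```python
def remaining_headroom(agent_limit, agent_spent, org_limit, org_spent):
    """Dollars this agent may still spend before either limit is reached."""
    agent_left = max(agent_limit - agent_spent, 0.0)
    org_left = max(org_limit - org_spent, 0.0)
    # The most restrictive limit wins.
    return min(agent_left, org_left)


# The agent has $6 left, but other agents have used most of the org pool.
print(remaining_headroom(10.0, 4.0, 50.0, 48.0))   # 2.0
# The org pool has $28 left, but the agent's own budget is exhausted.
print(remaining_headroom(10.0, 10.0, 50.0, 22.0))  # 0.0
```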
Individual vs. org budgets
Two levels of enforcement work independently or together.
| Feature | Individual budget | Org budget |
|---|---|---|
| Scope | Single agent | All agents in the org |
| Set via SDK | budget=Budget.daily(10) | org_budget=Budget.org_budget(50) |
| Set via YAML | agents.<name>.budget.daily | org_budget.limit / org_budget.period |
| Set via API | POST /v1/agents/<name>/budgets | POST /v1/org/budgets |
| Enforcement | Blocks this agent | Blocks every agent in the org |
| Ingest rejection | 429 budget_exceeded | 429 org_budget_exceeded |
Example: Both budgets together
from agentkavach import AgentKavach, Budget

# Org-wide cap: $50/day across all agents
org = Budget.org_budget(limit=50, period="daily")
research = AgentKavach(
    provider="openai",
    api_key="ak_prod_...",
    llm_key="sk-...",
    agent_name="research-bot",
    budget=Budget.daily(20),  # individual
    org_budget=org,           # shared
)
support = AgentKavach(
    provider="anthropic",
    api_key="ak_prod_...",
    llm_key="sk-ant-...",
    agent_name="support-bot",
    budget=Budget.daily(15),  # individual
    org_budget=org,           # shared
)
# research-bot stops at $20
# support-bot stops at $15
# Both stop if combined org spend hits $50

Token cap (per run)
A cost budget caps dollars over a period; a token cap limits the total tokens a single run may consume. Set max_tokens_per_run on the client and AgentKavach sums input and output tokens across every call in the run. When the running total would exceed the cap, the SDK raises TokenLimitError before the next call goes out. It is the safety net for a prompt that balloons or a loop that keeps appending context.
from agentkavach import AgentKavach, Budget
from agentkavach.exceptions import TokenLimitError

guard = AgentKavach(
    provider="openai",
    api_key="ak_prod_...",
    llm_key="sk-...",
    budget=Budget.daily(50),
    max_tokens_per_run=100_000,  # stop the run at 100k total tokens
)
try:
    response = guard.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Summarize this very long document..."}],
    )
except TokenLimitError as e:
    print(f"Token cap hit: {e.spent}/{e.limit} tokens")

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| max_tokens_per_run | int | No | None | Total input + output tokens allowed in a single run. None disables the cap. |
ℹ️ Per run, not per period
The token cap counts tokens for the life of one AgentKavach instance, not on a daily or monthly clock. It is a guardrail against a single runaway run, complementary to the cost budget rather than a replacement for it.
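The bookkeeping behind a per-run cap can be sketched with a small counter that is checked before each request and updated after each response. `TokenMeter` and `TokenCapReached` are hypothetical names for this illustration only. The sketch assumes token usage is known only after a response arrives, which is why its total can overshoot the cap by one call; the SDK's actual accounting may differ.

```python
class TokenCapReached(Exception):
    """Hypothetical stand-in for the SDK's TokenLimitError."""


class TokenMeter:
    def __init__(self, max_tokens_per_run=None):
        self.cap = max_tokens_per_run  # None disables the cap
        self.used = 0

    def check(self):
        """Call before a request; refuses once the cap is reached."""
        if self.cap is not None and self.used >= self.cap:
            raise TokenCapReached(f"{self.used}/{self.cap} tokens")

    def record(self, input_tokens, output_tokens):
        """Call after a response; input and output tokens both count."""
        self.used += input_tokens + output_tokens


meter = TokenMeter(max_tokens_per_run=1_000)
for step in range(5):
    try:
        meter.check()
    except TokenCapReached as exc:
        print(f"step {step} blocked: {exc}")  # step 3 blocked: 1350/1000 tokens
        break
    meter.record(input_tokens=300, output_tokens=150)  # pretend usage per call
```

The counter lives on the instance, which mirrors the "per run, not per period" behavior: a new meter starts again from zero.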
Duration cap (per run)
A duration cap limits how long a run may keep making calls. Set max_runtime_seconds on the client; AgentKavach measures wall-clock time from the first call, and once the elapsed time crosses the cap it raises RuntimeLimitError instead of starting another call. Use it to stop an agent that is stuck retrying or working far longer than a task should take.
from agentkavach import AgentKavach, Budget
from agentkavach.exceptions import RuntimeLimitError

guard = AgentKavach(
    provider="openai",
    api_key="ak_prod_...",
    llm_key="sk-...",
    budget=Budget.daily(50),
    max_runtime_seconds=120,  # stop the run after 2 minutes of calls
)
try:
    while True:
        guard.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": "next step"}],
        )
except RuntimeLimitError as e:
    print(f"Runtime cap hit: {e.elapsed:.1f}s / {e.limit:.0f}s")

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| max_runtime_seconds | float | No | None | Wall-clock seconds, measured from the first call, before the run is stopped. None disables the cap. |
ℹ️ Checked between calls
The duration cap is evaluated before each call, so it stops the next call rather than interrupting one already in flight. A single very long call still completes; the cap applies to the run as a whole.
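A between-calls check can be sketched as a timer that starts on the first call and is consulted before every later one. `RunTimer` and `RuntimeCapReached` are hypothetical names, not SDK classes. The clock is injectable so the example runs instantly with fake readings; by default it would use `time.monotonic`.

```python
import time


class RuntimeCapReached(Exception):
    """Hypothetical stand-in for the SDK's RuntimeLimitError."""


class RunTimer:
    def __init__(self, max_runtime_seconds=None, clock=time.monotonic):
        self.cap = max_runtime_seconds  # None disables the cap
        self.clock = clock
        self.started_at = None          # set on the first call

    def check(self):
        """Call before each request; never interrupts one in flight."""
        now = self.clock()
        if self.started_at is None:
            self.started_at = now
            return
        elapsed = now - self.started_at
        if self.cap is not None and elapsed >= self.cap:
            raise RuntimeCapReached(f"{elapsed:.1f}s / {self.cap:.0f}s")


# Fake clock: each reading is 50 s later than the previous one.
ticks = iter([0.0, 50.0, 100.0, 150.0])
timer = RunTimer(max_runtime_seconds=120, clock=lambda: next(ticks))

calls_started = 0
try:
    while True:
        timer.check()
        calls_started += 1  # the real LLM call would happen here
except RuntimeCapReached as exc:
    print(f"stopped after {calls_started} calls: {exc}")
    # stopped after 3 calls: 150.0s / 120s
```

The third call starts at 100 s and is allowed to proceed; the cap only takes effect when the fourth call is about to start at 150 s.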
Checking budget state
# Current spend in the active period
guard.spent  # e.g. 12.34
# Remaining budget
guard.remaining  # e.g. 37.66
# Utilization as a fraction (0.0 to 1.0)
guard.engine.utilization  # e.g. 0.2468

Read the current budget state at any time through these properties.
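The example values above are consistent with a $50 limit. Assuming remaining is the limit minus spend and utilization is spend divided by the limit (an inference from those numbers, not a statement about the SDK's internals), the arithmetic works out as follows:

```python
limit = 50.00
spent = 12.34

remaining = limit - spent
utilization = spent / limit

print(f"Remaining: ${remaining:.2f}")      # Remaining: $37.66
print(f"Utilization: {utilization:.4f}")   # Utilization: 0.2468
print(f"Utilization: {utilization:.1%}")   # Utilization: 24.7%
```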
What happens when the budget runs out
When the budget is exhausted, the SDK follows a fixed sequence.
- The SDK raises BudgetExceededError before the LLM call goes out.
- You are never billed for a blocked call.
- The on_kill callback fires if you configured one.
- Every later guard.create() call raises immediately.
- Internal errors never block LLM calls. Only BudgetExceededError propagates.
from agentkavach.exceptions import BudgetExceededError

try:
    response = guard.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except BudgetExceededError as e:
    print(f"Budget exhausted: {e}")
    # e.spent, e.limit, e.period available

ℹ️ In-memory checks
Budget checks run in memory before each call, with no network round trip.
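The exhaustion sequence described in this section can be modeled with a toy guard that checks its budget before handing off to the provider. `TinyGuard` and `BudgetExhausted` are hypothetical names for illustration, and the `on_kill(spent, limit)` signature is an assumption; none of this is the SDK's implementation.

```python
class BudgetExhausted(Exception):
    """Hypothetical stand-in for the SDK's BudgetExceededError."""


class TinyGuard:
    def __init__(self, limit, on_kill=None):
        self.limit = limit
        self.spent = 0.0
        self.killed = False
        self.on_kill = on_kill

    def create(self, send, cost):
        """Run `send()` only if the budget allows it, then record `cost`."""
        if self.killed:
            # Later calls are rejected immediately.
            raise BudgetExhausted("agent already stopped")
        if self.spent >= self.limit:
            self.killed = True
            if self.on_kill is not None:
                self.on_kill(self.spent, self.limit)
            raise BudgetExhausted(f"spent {self.spent:.2f} of {self.limit:.2f}")
        result = send()  # the provider is contacted only past the checks
        self.spent += cost
        return result


sent, kills = [], []
guard = TinyGuard(limit=1.00, on_kill=lambda spent, limit: kills.append((spent, limit)))
guard.create(lambda: sent.append("call 1"), cost=1.00)
for label in ("call 2", "call 3"):
    try:
        guard.create(lambda: sent.append(label), cost=1.00)
    except BudgetExhausted as exc:
        print(f"{label} blocked: {exc}")
print(sent)   # ['call 1']
print(kills)  # [(1.0, 1.0)]
```

Only the first call reaches the provider, the callback fires once, and both later calls fail without any outbound request, which is why a blocked call incurs no provider charge.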