Budgets

A budget is a ceiling that AgentKavach enforces in memory before every call. When the ceiling is reached, the agent is terminated and subsequent calls are rejected. AgentKavach enforces three budget dimensions:

  • Cost. A spending limit in US dollars, available over three periods: daily (resets at midnight UTC), monthly (resets on the first of the month), and total (never resets). Cost can also be pooled across every agent in an organization with Budget.org_budget(...).
  • Token count. A limit on the total tokens an agent may consume, set with max_tokens_per_run.
  • Duration. A limit on how long an agent may run, set with max_runtime_seconds.

Each dimension is enforced independently. The first ceiling an agent reaches terminates it. The sections below document each dimension in turn.

Dimensions

Threshold alerts evaluate each dimension independently.

| Dimension | Unit | Configured via |
| --- | --- | --- |
| cost | USD | budget=Budget.daily / .monthly / .total |
| tokens_total | tokens | max_tokens_per_run=... |
| duration | milliseconds | max_runtime_seconds=... |

ℹ️ Per-dimension alerts

Bind a channel to a dimension by setting budget_type on the ChannelConfig. The channel fires only when usage on that dimension crosses the threshold.

Budget.daily(limit)

python
from agentkavach import AgentKavach, Budget

guard = AgentKavach(
    provider="openai",
    api_key="ak_prod_...",
    llm_key="sk-...",
    budget=Budget.daily(50),       # $50/day
)

response = guard.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(f"Spent: ${guard.spent:.4f}")
print(f"Remaining: ${guard.remaining:.4f}")
print(f"Utilization: {guard.engine.utilization:.1%}")

A daily budget resets at midnight UTC. Use it to cap per-day spend.
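To make the reset boundary concrete, the sketch below computes the next midnight UTC for a given moment using only the standard library. The helper name next_daily_reset is hypothetical and is not part of AgentKavach; it only illustrates where a daily window ends for callers in other time zones.

```python
from datetime import datetime, timedelta, timezone


def next_daily_reset(now: datetime) -> datetime:
    """Return the first midnight UTC strictly after `now`.

    Hypothetical helper for illustration; `now` must be timezone-aware.
    """
    now_utc = now.astimezone(timezone.utc)
    midnight_today = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight_today + timedelta(days=1)


# 23:30 on 1 March at UTC-5 is 04:30 UTC on 2 March, so the window
# ends at 00:00 UTC on 3 March rather than at local midnight.
local = timezone(timedelta(hours=-5))
print(next_daily_reset(datetime(2024, 3, 1, 23, 30, tzinfo=local)))
```

If your team works far from UTC, the daily window may roll over in the middle of the local working day, which is worth keeping in mind when choosing the limit.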

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| limit | float | Yes | | USD limit per day. Must be > 0. |

Budget.monthly(limit)

python
guard = AgentKavach(
    provider="anthropic",
    api_key="ak_prod_...",
    llm_key="sk-ant-...",
    budget=Budget.monthly(500),     # $500/month
)

A monthly budget resets on the 1st at midnight UTC. Match it to your cloud billing cycle.
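As a rough illustration of the monthly boundary, the following sketch finds the next first-of-month at midnight UTC. The function next_monthly_reset is a hypothetical name used here for illustration, not an AgentKavach API.

```python
from datetime import datetime, timezone


def next_monthly_reset(now: datetime) -> datetime:
    """Return 00:00 UTC on the 1st of the month after `now`.

    Hypothetical helper for illustration; `now` must be timezone-aware.
    """
    now_utc = now.astimezone(timezone.utc)
    if now_utc.month == 12:
        # December rolls over into January of the following year.
        return datetime(now_utc.year + 1, 1, 1, tzinfo=timezone.utc)
    return datetime(now_utc.year, now_utc.month + 1, 1, tzinfo=timezone.utc)


print(next_monthly_reset(datetime(2024, 12, 15, 9, 0, tzinfo=timezone.utc)))
```

The window follows calendar months in UTC, so a billing cycle that starts on another day of the month will not line up exactly with it.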

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| limit | float | Yes | | USD limit per month. Must be > 0. |

Budget.total(limit)

python
guard = AgentKavach(
    provider="google",
    api_key="ak_prod_...",
    llm_key="AIza...",
    budget=Budget.total(1000),      # $1,000 lifetime cap
)

A total budget never resets. Once the limit is reached, the agent stays blocked until you raise the limit.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| limit | float | Yes | | USD lifetime limit. Must be > 0. Never resets. |

Budget.org_budget(limit, period)

python
from agentkavach import AgentKavach, Budget

# Per-agent cap: $10/day
# Org-wide cap: $50/day across every agent
org_pool = Budget.org_budget(limit=50, period="daily")

guard = AgentKavach(
    provider="openai",
    api_key="ak_prod_...",
    llm_key="sk-...",
    agent_name="research-bot",
    budget=Budget.daily(10),        # per-agent
    org_budget=org_pool,            # shared
)

response = guard.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(f"Agent spent: ${guard.spent:.4f}")
print(f"Agent remaining: ${guard.remaining:.4f}")

An org budget applies across every agent in your org. Each call counts toward the shared pool. When the org limit is reached, every agent stops.

Org budgets coexist with per-agent budgets. Each agent keeps its own limit and also draws from the shared pool, and the most restrictive limit wins.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| limit | float | Yes | | USD limit for the entire organization. Must be > 0. |
| period | str \| Period | No | "daily" | Budget period: "daily", "monthly", or "total". Defaults to "daily". |

ℹ️ Most restrictive wins

When both budgets are set, the agent stops as soon as either limit is reached. An agent with a $10/day budget stops at $10 even if the org pool has room. If the org pool hits $50, every agent stops even if their individual budgets have room.
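The rule can be modeled as taking the smaller of the two remaining amounts. The sketch below is a standalone illustration of that arithmetic, not the SDK's implementation; effective_remaining and is_blocked are hypothetical names.

```python
def effective_remaining(agent_limit: float, agent_spent: float,
                        org_limit: float, org_spent: float) -> float:
    """Headroom left for one agent when both budgets apply (illustrative)."""
    agent_left = max(agent_limit - agent_spent, 0.0)
    org_left = max(org_limit - org_spent, 0.0)
    # Most restrictive wins: the agent can spend only the smaller amount.
    return min(agent_left, org_left)


def is_blocked(agent_limit: float, agent_spent: float,
               org_limit: float, org_spent: float) -> bool:
    """True once either the agent limit or the org pool is used up."""
    return effective_remaining(agent_limit, agent_spent, org_limit, org_spent) <= 0


# Agent at its own $10 cap while the org pool still has room: blocked.
print(is_blocked(10, 10, 50, 20))
# Agent has $6 of its own left, but the pool has only $3: headroom is $3.
print(effective_remaining(10, 4, 50, 47))
```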

Individual vs. org budgets

Two levels of enforcement work independently or together.

| Feature | Individual budget | Org budget |
| --- | --- | --- |
| Scope | Single agent | All agents in the org |
| Set via SDK | budget=Budget.daily(10) | org_budget=Budget.org_budget(50) |
| Set via YAML | agents.<name>.budget.daily | org_budget.limit / org_budget.period |
| Set via API | POST /v1/agents/<name>/budgets | POST /v1/org/budgets |
| Enforcement | Blocks this agent | Blocks every agent in the org |
| Ingest rejection | 429 budget_exceeded | 429 org_budget_exceeded |

Example: Both budgets together

python
from agentkavach import AgentKavach, Budget

# Org-wide cap: $50/day across all agents
org = Budget.org_budget(limit=50, period="daily")

research = AgentKavach(
    provider="openai",
    api_key="ak_prod_...",
    llm_key="sk-...",
    agent_name="research-bot",
    budget=Budget.daily(20),    # individual
    org_budget=org,             # shared
)

support = AgentKavach(
    provider="anthropic",
    api_key="ak_prod_...",
    llm_key="sk-ant-...",
    agent_name="support-bot",
    budget=Budget.daily(15),    # individual
    org_budget=org,             # shared
)

# research-bot stops at $20
# support-bot stops at $15
# Both stop if combined org spend hits $50

Token cap (per run)

A cost budget caps dollars over a period; a token cap limits the total tokens a single run may consume. Set max_tokens_per_run on the client and AgentKavach sums input and output tokens across every call in the run. When the running total would exceed the cap, the SDK raises TokenLimitError before the next call goes out. It is the safety net for a prompt that balloons or a loop that keeps appending context.

python
from agentkavach import AgentKavach, Budget
from agentkavach.exceptions import TokenLimitError

guard = AgentKavach(
    provider="openai",
    api_key="ak_prod_...",
    llm_key="sk-...",
    budget=Budget.daily(50),
    max_tokens_per_run=100_000,     # stop the run at 100k total tokens
)

try:
    response = guard.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Summarize this very long document..."}],
    )
except TokenLimitError as e:
    print(f"Token cap hit: {e.spent}/{e.limit} tokens")

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| max_tokens_per_run | int | No | None | Total input + output tokens allowed in a single run. None disables the cap. |

ℹ️ Per run, not per period

The token cap counts tokens for the life of one AgentKavach instance, not on a daily or monthly clock. It is a guardrail against a single runaway run, complementary to the cost budget rather than a replacement for it.
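A per-run counter of this kind can be pictured as a running sum that is checked before each request. The sketch below is a simplified local model with hypothetical names (RunTokenCounter, TokenCapExceeded); it blocks once the recorded total has reached the cap, and it does not try to estimate the size of the upcoming request, which the SDK may or may not do.

```python
from typing import Optional


class TokenCapExceeded(Exception):
    """Stand-in for the SDK's TokenLimitError (hypothetical)."""


class RunTokenCounter:
    """Sums input and output tokens for the life of one instance."""

    def __init__(self, max_tokens_per_run: Optional[int] = None) -> None:
        self.limit = max_tokens_per_run
        self.used = 0

    def before_call(self) -> None:
        # None disables the cap; otherwise stop once the total reaches it.
        if self.limit is not None and self.used >= self.limit:
            raise TokenCapExceeded(f"{self.used}/{self.limit} tokens")

    def record(self, input_tokens: int, output_tokens: int) -> None:
        # Called after a response arrives, with the usage it reported.
        self.used += input_tokens + output_tokens


counter = RunTokenCounter(max_tokens_per_run=100)
counter.before_call()          # passes: nothing used yet
counter.record(60, 30)         # 90 tokens so far
counter.before_call()          # passes: still under the cap
counter.record(10, 5)          # 105 tokens, cap now crossed
```

Because the counter lives on the instance, creating a new instance starts a new run with a fresh count.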

Duration cap (per run)

A duration cap limits how long a run may keep making calls. Set max_runtime_seconds on the client; AgentKavach measures wall-clock time from the first call, and once the elapsed time crosses the cap it raises RuntimeLimitError instead of starting another call. Use it to stop an agent that is stuck retrying or working far longer than a task should take.

python
from agentkavach import AgentKavach, Budget
from agentkavach.exceptions import RuntimeLimitError

guard = AgentKavach(
    provider="openai",
    api_key="ak_prod_...",
    llm_key="sk-...",
    budget=Budget.daily(50),
    max_runtime_seconds=120,        # stop the run after 2 minutes of calls
)

try:
    while True:
        guard.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": "next step"}],
        )
except RuntimeLimitError as e:
    print(f"Runtime cap hit: {e.elapsed:.1f}s / {e.limit:.0f}s")

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| max_runtime_seconds | float | No | None | Wall-clock seconds, measured from the first call, before the run is stopped. None disables the cap. |

ℹ️ Checked between calls

The duration cap is evaluated before each call, so it stops the next call rather than interrupting one already in flight. A single very long call still completes; the cap applies to the run as a whole.
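The between-calls behavior can be sketched as a timer that starts on the first call and is consulted before each later one. This is an illustrative model with hypothetical names (RunClock, RunTimeExceeded), not the SDK's code; the clock is injectable so the logic can be exercised without waiting.

```python
import time
from typing import Callable, Optional


class RunTimeExceeded(Exception):
    """Stand-in for the SDK's RuntimeLimitError (hypothetical)."""


class RunClock:
    """Tracks elapsed time from the first call and checks it before each call."""

    def __init__(self, max_runtime_seconds: Optional[float] = None,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = max_runtime_seconds
        self._clock = clock
        self._started_at: Optional[float] = None

    def before_call(self) -> None:
        now = self._clock()
        if self._started_at is None:
            self._started_at = now  # the first call starts the timer
            return
        elapsed = now - self._started_at
        # Only the next call is refused; nothing in flight is interrupted.
        if self.limit is not None and elapsed > self.limit:
            raise RunTimeExceeded(f"{elapsed:.1f}s / {self.limit:.0f}s")


# Fake clock readings: first call at t=0, then t=60, then t=121.
ticks = iter([0.0, 60.0, 121.0])
run = RunClock(max_runtime_seconds=120, clock=lambda: next(ticks))
run.before_call()   # starts the timer
run.before_call()   # 60s elapsed, allowed
```

The sketch uses a monotonic clock so that system clock adjustments cannot shorten or extend a run; that choice is an assumption of this example.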

Checking budget state

python
# Current spend in the active period
guard.spent          # e.g. 12.34

# Remaining budget
guard.remaining      # e.g. 37.66

# Utilization as a fraction (0.0 to 1.0)
guard.engine.utilization  # e.g. 0.2468

Read the current budget state at any time through these properties.
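The example values above are consistent with a $50 limit, assuming remaining is the limit minus spend and utilization is spend divided by the limit. The short sketch below reproduces that arithmetic; budget_state is a hypothetical helper, not an SDK function.

```python
from typing import Tuple


def budget_state(limit: float, spent: float) -> Tuple[float, float]:
    """Return (remaining, utilization) under the assumed definitions."""
    remaining = limit - spent
    utilization = spent / limit
    return remaining, utilization


remaining, utilization = budget_state(limit=50.0, spent=12.34)
print(f"Remaining: ${remaining:.2f}")        # Remaining: $37.66
print(f"Utilization: {utilization:.4f}")     # Utilization: 0.2468
```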

What happens when the budget runs out

When the budget is exhausted, the SDK follows a fixed sequence.

  1. The SDK raises BudgetExceededError before the LLM call goes out.
  2. You are never billed for a blocked call.
  3. The on_kill callback fires if you configured one.
  4. Every later guard.create() call raises immediately.
  5. Internal AgentKavach errors never block LLM calls. For the cost budget, only BudgetExceededError propagates to your code.

python
from agentkavach.exceptions import BudgetExceededError

try:
    response = guard.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except BudgetExceededError as e:
    print(f"Budget exhausted: {e}")
    # e.spent, e.limit, e.period available
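
The sequence described in this section behaves like a latch: once tripped, it stays tripped. The sketch below models that with hypothetical names (KillSwitch, BudgetExhausted) and a stand-in callback. It is not the SDK's implementation, and it assumes the callback fires once, just before the first exception is raised.

```python
from typing import Callable, Optional


class BudgetExhausted(Exception):
    """Stand-in for the SDK's BudgetExceededError (hypothetical)."""


class KillSwitch:
    """Blocks calls once spend reaches the limit and stays blocked afterwards."""

    def __init__(self, limit: float,
                 on_kill: Optional[Callable[[float, float], None]] = None) -> None:
        self.limit = limit
        self.spent = 0.0
        self.killed = False
        self._on_kill = on_kill

    def before_call(self) -> None:
        if self.killed:
            # Every later call raises immediately, with no further callback.
            raise BudgetExhausted(f"{self.spent:.2f}/{self.limit:.2f}")
        if self.spent >= self.limit:
            self.killed = True
            if self._on_kill is not None:
                self._on_kill(self.spent, self.limit)
            # Raised before any request is sent, so the blocked call costs nothing.
            raise BudgetExhausted(f"{self.spent:.2f}/{self.limit:.2f}")

    def record(self, cost: float) -> None:
        # Called after a successful response with the cost of that call.
        self.spent += cost


events = []
switch = KillSwitch(limit=1.0, on_kill=lambda spent, limit: events.append((spent, limit)))
switch.before_call()      # allowed: nothing spent yet
switch.record(1.25)       # this call pushed spend past the limit
```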

ℹ️ In-memory checks

Budget checks run in memory before each call, with no network round trip.