Providers
AgentKavach supports OpenAI, Anthropic, Google, and Mistral. The same guard.create() method works for every provider.
OpenAI
Standard call
from agentkavach import AgentKavach, Budget

guard = AgentKavach(
    provider="openai",
    api_key="ak_prod_...",
    llm_key="sk-...",
    budget=Budget.daily(50),
)
response = guard.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
Native namespace
response = guard.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
The native namespace mirrors the OpenAI client. Use it when you want the familiar shape.
Streaming
stream = guard.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True,
)
for chunk in stream:
    # delta.content can be None on some chunks, so fall back to an empty string.
    print(chunk.choices[0].delta.content or "", end="")
Anthropic
Standard call
from agentkavach import AgentKavach, Budget

guard = AgentKavach(
    provider="anthropic",
    api_key="ak_prod_...",
    llm_key="sk-ant-...",
    budget=Budget.daily(50),
)
response = guard.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Summarize this report"}],
    max_tokens=1024,
)
Native namespace
response = guard.messages.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Summarize this report"}],
    max_tokens=1024,
)
⚠️ max_tokens is required
Anthropic requires max_tokens on every call. The provider rejects the request without it.
Streaming
Pass stream=True and iterate the result. Anthropic emits typed events; the SDK reads output tokens from the trailing message_delta so cost stays accurate, and records partial usage if you break out early.
stream = guard.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Explain HMAC."}],
    max_tokens=1024,
    stream=True,
)
for event in stream:
    # Not every typed event carries a text delta, so read it defensively.
    delta = getattr(event, "delta", None)
    text = getattr(delta, "text", None) if delta is not None else None
    if text:
        print(text, end="", flush=True)
Google
Standard call
from agentkavach import AgentKavach, Budget

guard = AgentKavach(
    provider="google",
    api_key="ak_prod_...",
    llm_key="AIza...",
    budget=Budget.monthly(300),
)
response = guard.create(
    model="gemini-2.5-flash",
    contents="Generate a project outline",
)
Native namespace
response = guard.generate_content(
    model="gemini-2.5-flash",
    contents="Generate a project outline",
)
⚠️ contents, not messages
Google uses contents. Passing messages raises an error.
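If existing code already holds chat-style messages, one possible way to reuse them with Google is to flatten them into a single string before the call. The sketch below is only an illustration: `messages_to_contents` is a hypothetical helper, not part of AgentKavach, and prefixing each line with its role is an assumption about how you might want the prompt to read.

```python
def messages_to_contents(messages: list[dict[str, str]]) -> str:
    """Flatten chat-style messages into one plain-text prompt.

    Hypothetical helper, not part of AgentKavach. Each message becomes
    one "role: content" line, so role structure survives only as text.
    """
    return "\n".join(f"{m['role']}: {m['content']}" for m in messages)


messages = [
    {"role": "system", "content": "Be concise."},
    {"role": "user", "content": "Generate a project outline"},
]
contents = messages_to_contents(messages)
print(contents)
```

The resulting string can then be passed as `contents=` in place of `messages=`.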
Streaming
Pass stream=True to get an iterator of chunks, each exposing .text for the incremental output. The final chunk carries the exact token counts the SDK uses for cost.
stream = guard.create(
    model="gemini-2.5-flash",
    contents="Write a haiku about TLS.",
    stream=True,
)
for chunk in stream:
    # Skip chunks that carry no incremental text.
    if chunk.text:
        print(chunk.text, end="", flush=True)
Mistral
Standard call
from agentkavach import AgentKavach, Budget

guard = AgentKavach(
    provider="mistral",
    api_key="ak_prod_...",
    llm_key="your-mistral-api-key",
    budget=Budget.daily(50),
)
response = guard.create(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Hello!"}],
)
Native namespace
response = guard.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Hello!"}],
)
Mistral uses an OpenAI-compatible shape. The native namespace maps to client.chat.complete().
Streaming
stream = guard.create(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True,
)
for chunk in stream:
    # delta.content can be None on some chunks, so fall back to an empty string.
    print(chunk.choices[0].delta.content or "", end="")
Token counting
AgentKavach counts input tokens before each call so it can estimate cost. OpenAI and Mistral count locally with tiktoken, which is effectively free. Anthropic and Google count through a provider API call, which adds a network round trip.
| Provider | Method | Typical latency |
|---|---|---|
| OpenAI | Local (tiktoken) | ~0.1 ms |
| Anthropic | Provider API call | ~150 ms |
| Google | Provider API call | ~150 ms |
| Mistral | Local (tiktoken) | ~0.1 ms |
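To see what these latencies mean in practice, the following sketch multiplies the typical figures from the table by a number of sequential calls. The function name `counting_overhead_seconds` is made up for this illustration, and the numbers are the table's typical values, not measurements.

```python
# Typical pre-call token-counting latency per provider, in milliseconds,
# copied from the table above. These are typical values, not guarantees.
COUNT_LATENCY_MS = {
    "openai": 0.1,
    "anthropic": 150.0,
    "google": 150.0,
    "mistral": 0.1,
}


def counting_overhead_seconds(provider: str, calls: int) -> float:
    """Total time spent counting input tokens across sequential calls."""
    return COUNT_LATENCY_MS[provider] * calls / 1000.0


for provider in COUNT_LATENCY_MS:
    overhead = counting_overhead_seconds(provider, 1000)
    print(f"{provider}: {overhead:.1f} s per 1,000 sequential calls")
```

At these figures, 1,000 sequential calls spend about 150 seconds on token counting with Anthropic or Google, against roughly a tenth of a second with OpenAI or Mistral.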
Cross-provider comparison
| Feature | OpenAI | Anthropic | Google | Mistral |
|---|---|---|---|---|
| Parameter name | messages | messages | contents | messages |
| Native namespace | guard.chat.completions.create() | guard.messages.create() | guard.generate_content() | guard.chat.complete() |
| Unified API | guard.create() | guard.create() | guard.create() | guard.create() |
| Streaming | stream=True | stream=True | stream=True | stream=True |
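Because the unified guard.create() call still takes provider-specific parameters, code that switches providers needs to build its arguments accordingly. The sketch below shows one way to do that, using the parameter names from the table and the max_tokens requirement described for Anthropic. `build_request_kwargs` and its default of 1024 are hypothetical and not part of AgentKavach.

```python
from typing import Any

# Prompt parameter each provider expects, per the comparison table.
PROMPT_PARAM = {
    "openai": "messages",
    "anthropic": "messages",
    "google": "contents",
    "mistral": "messages",
}


def build_request_kwargs(
    provider: str, model: str, prompt: str, max_tokens: int = 1024
) -> dict[str, Any]:
    """Build keyword arguments for guard.create() (hypothetical helper)."""
    if provider not in PROMPT_PARAM:
        raise ValueError(f"unsupported provider: {provider!r}")

    kwargs: dict[str, Any] = {"model": model}
    if PROMPT_PARAM[provider] == "contents":
        # Google takes the prompt directly as contents.
        kwargs["contents"] = prompt
    else:
        # The other providers take a list of chat messages.
        kwargs["messages"] = [{"role": "user", "content": prompt}]
    if provider == "anthropic":
        # Anthropic rejects requests that omit max_tokens.
        kwargs["max_tokens"] = max_tokens
    return kwargs


# Usage, given a configured guard:
#   response = guard.create(**build_request_kwargs("google", "gemini-2.5-flash", "Hello!"))
print(build_request_kwargs("anthropic", "claude-sonnet-4-6", "Hello!"))
```

Keeping the provider differences in one function means the calling code stays the same when the provider changes.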