Features
The runtime is the part
you don't want to build twice.
Three pillars. Each is something demos hand-wave and production can't. We built the production half so your team can ship the agentic experience instead.
Knows your data
Answers from public docs.
Reads tenant data through your APIs, scoped to the customer's org. Never crosses boundaries.
Every call carries a tenant ID. The runtime enforces isolation at the API layer — even if the agent goes off-script.
BYO model keys. Customers can bring their own OpenAI / Anthropic / Bedrock account; we never see the keys.
The agent reads your customer's schema at runtime — column names, types, foreign keys — to write queries that actually work.
Row-level filters applied before the agent sees data. Even an "all rows" query returns the tenant's rows only.
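In miniature, row-level scoping looks like this (names and the in-memory table are hypothetical, not the real runtime): the tenant filter is applied before any agent-supplied predicate, so an "all rows" query can't escape the org.

```python
from dataclasses import dataclass

@dataclass
class Row:
    tenant_id: str
    amount: int

# Hypothetical in-memory table standing in for a customer database.
TABLE = [
    Row("acme", 100),
    Row("acme", 250),
    Row("globex", 999),
]

def scoped_query(tenant_id: str, predicate=lambda r: True):
    """Apply the tenant filter *before* the agent's predicate,
    so even an 'all rows' query returns only the tenant's rows."""
    return [r for r in TABLE if r.tenant_id == tenant_id and predicate(r)]

# An "all rows" request from the agent still comes back scoped.
rows = scoped_query("acme")
assert all(r.tenant_id == "acme" for r in rows)
```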
Acts in your product
Returns a string.
Issues refunds, updates records, opens tickets — through your APIs, with approval flows for high-stakes actions.
The agent wants to issue an $847 refund to customer_4f2a.
reason: order #38291 never shipped
policy: §4.2 (over $500 → human review)
tenant: acme · invoked by: agent.refund-flow
Declare your APIs as tools. Tavora handles the calling loop, retries, and structured-output parsing.
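The calling loop, sketched (decorator, registry, and error class are illustrative, not the real SDK): tools are plain functions, the runtime parses the model's tool-call JSON, dispatches it, and retries transient failures.

```python
import json

# Hypothetical tool registry: declare your APIs as plain functions.
TOOLS = {}

def tool(fn):
    TOOLS[fn.__name__] = fn
    return fn

@tool
def open_ticket(subject: str) -> dict:
    return {"id": "T-1", "subject": subject}

def run_tool_call(call_json: str, max_retries: int = 2) -> dict:
    """Parse a model-emitted tool call and dispatch it, retrying
    transient errors — the runtime's calling loop in miniature."""
    call = json.loads(call_json)
    last_err = None
    for _ in range(max_retries + 1):
        try:
            return TOOLS[call["name"]](**call["arguments"])
        except ConnectionError as e:  # transient; retry
            last_err = e
    raise last_err

result = run_tool_call('{"name": "open_ticket", "arguments": {"subject": "refund"}}')
```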
Any tool can require human approval over a threshold. The agent pauses; an operator confirms; the run continues.
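The approval gate in miniature (threshold and function names hypothetical): under the threshold the call runs; over it, the run pauses until an operator confirms, then continues with the same arguments.

```python
# Hypothetical approval gate: tool calls over a threshold pause for a human.
PENDING = []

def issue_refund(amount: int, approved: bool = False) -> dict:
    if amount > 500 and not approved:
        PENDING.append({"tool": "issue_refund", "amount": amount})
        return {"status": "pending_approval"}
    return {"status": "refunded", "amount": amount}

assert issue_refund(120)["status"] == "refunded"            # under threshold: runs
assert issue_refund(847)["status"] == "pending_approval"    # over: pauses
# An operator confirms; the run continues with the same arguments.
assert issue_refund(847, approved=True)["status"] == "refunded"
```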
Every tool call carries a key. Retried calls don't double-charge, double-refund, or double-create.
When the model writes JavaScript instead of calling tools, it runs in a hardened isolate with no network and a memory cap.
Judgable, not vibes
Looks right.
Shows the JS it ran, the tools it called, the data it touched. Replay any turn. Eval-gated before deploy.
Click a production turn; see the exact plan, tool calls, model outputs, and data the agent saw. Step through it.
Define eval cases in Studio. Run them in CI. Block deploys when accuracy drops below the bar you set.
Sample production traffic, run it through your evals, alert when the live distribution drifts from the eval set.
Every plan, tool call, and data access is logged per-tenant. Searchable. Exportable. SOC 2 ready.
Also in the box
Everything else you'd build
before launch.
The boring infrastructure underneath. None of it is interesting on its own. All of it is required to put an agent in front of customers.
Token-by-token streaming with proper cancellation. Stop a 30-second run mid-flight without leaking tool calls.
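Cancellation in miniature (a plain generator plus an event flag, not the real streaming API): the stream checks the flag between tokens and stops promptly, so nothing downstream fires after Stop.

```python
import threading

def stream_tokens(tokens, cancel: threading.Event):
    """Yield tokens one by one; stop promptly when cancel is set,
    so a mid-flight run halts without further work leaking out."""
    for t in tokens:
        if cancel.is_set():
            return
        yield t

cancel = threading.Event()
out = []
for i, tok in enumerate(stream_tokens(["a", "b", "c", "d"], cancel)):
    out.append(tok)
    if i == 1:
        cancel.set()      # user hits Stop after the second token
assert out == ["a", "b"]
```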
Cap usage per customer org so a single tenant can't run up your bill or starve another.
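The cap in miniature (budget numbers hypothetical): spend is tracked per org, and a request that would cross the cap is rejected before it runs.

```python
# Hypothetical per-tenant token budget: requests that would exceed
# the org's cap are rejected before they run.
BUDGETS = {"acme": 1000}
SPENT: dict = {}

def try_spend(tenant: str, tokens: int) -> bool:
    used = SPENT.get(tenant, 0)
    if used + tokens > BUDGETS.get(tenant, 0):
        return False                 # over cap: reject, don't starve others
    SPENT[tenant] = used + tokens
    return True

assert try_spend("acme", 800)        # within budget
assert not try_spend("acme", 300)    # would exceed the cap: rejected
```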
Route fast paths to small models, hard cases to big ones. Or pin a customer to their preferred provider.
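A routing sketch (model names, the pin table, and the length heuristic are all illustrative): cheap model for easy turns, big model for hard ones, unless the tenant has pinned a provider.

```python
# Hypothetical router: a tenant pin wins; otherwise a simple
# difficulty heuristic picks the model tier.
PINNED = {"globex": "bedrock:claude"}

def route(tenant: str, prompt: str) -> str:
    if tenant in PINNED:
        return PINNED[tenant]            # customer's preferred provider
    return "small-fast-model" if len(prompt) < 200 else "large-model"

assert route("acme", "reset my password") == "small-fast-model"
assert route("globex", "anything at all") == "bedrock:claude"
```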
Schema-validated JSON responses with retries on parse failure. The model gets the diff and tries again.
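The retry loop in miniature (the "schema" here is a single required field; in the sketch, successive attempts stand in for the model retrying after seeing the error):

```python
import json

def parse_with_retries(attempts, max_retries=2):
    """Validate model output against a minimal schema (JSON with a
    'status' field); on failure, collect the error a real runtime
    would feed back to the model before the next attempt."""
    errors = []
    for raw in attempts[: max_retries + 1]:
        try:
            obj = json.loads(raw)
            if "status" not in obj:
                raise ValueError("missing required field: status")
            return obj, errors
        except ValueError as e:          # JSONDecodeError is a ValueError
            errors.append(str(e))        # the diff the model sees on retry
    raise RuntimeError(f"gave up after {len(errors)} attempts")

# First attempt is truncated JSON; the retry succeeds.
obj, errors = parse_with_retries(['{"status": ', '{"status": "ok"}'])
assert obj == {"status": "ok"} and len(errors) == 1
```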
Per-user, per-tenant, per-conversation memory. Pluggable backends (Postgres, Redis, your own).
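The backend contract, sketched (class and method names hypothetical): anything that can get and put values keyed by (tenant, user, conversation) can be dropped in.

```python
# Hypothetical pluggable memory backend: a dict-backed reference
# implementation of the (tenant, user, conversation) keyed store.
class DictMemory:
    def __init__(self):
        self._store = {}

    def put(self, tenant, user, convo, value):
        self._store[(tenant, user, convo)] = value

    def get(self, tenant, user, convo):
        return self._store.get((tenant, user, convo))

mem = DictMemory()
mem.put("acme", "u1", "c9", {"last_order": 38291})
assert mem.get("acme", "u1", "c9")["last_order"] == 38291
assert mem.get("globex", "u1", "c9") is None   # scoped: no cross-tenant reads
```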
First-class clients for both. Same surface area, same primitives, same docs.
Run Tavora in our cloud or yours. Same binary. Same Studio. Bring-your-own-VPC available.
Every plan, tool call, and model invocation emits OTel spans. Drop them into your existing observability stack.
Define agents as code. Version them. Diff them. Roll back. The way you ship the rest of your infra.
Read the docs, or just open Studio.
The fastest way to understand the runtime is to define an agent in Studio and click Run.