CI integration
The tavora CLI has two flavours of CI gate:
| Gate | What it checks | When to run |
|---|---|---|
tavora deploy --dry-run | The local tavora/ folder validates against the platform (schema parse, model + index + secret refs resolve, agent-count tier limits). | Every PR that touches tavora/**. |
tavora rag-eval formats --gate | The RAG pipeline still accepts and indexes every supported document format. | After chunker / extractor / migration changes. |
tavora rag-eval judge --gate | LLM-judges RAG answers against a fixture invoice corpus. | On a schedule + after retrieval-side changes. |
Per-agent eval suites are advisory — they report a pass/fail
score but never block deploy. The Phase 12 promotion gate was
removed in the 2026-05-11 slim-down. Run them as a quality signal
either pre-deploy (tavora deploy --run-evals) or on demand
(tavora evals run <agent>); the gating job is the validate step,
not the eval step.
Validate-and-deploy: GitHub Actions
Section titled “Validate-and-deploy: GitHub Actions”A two-job workflow: validate on every PR, deploy on merge to main.
name: agent-deploy
on: pull_request: paths: - 'tavora/**' push: branches: [main] paths: - 'tavora/**'
jobs: validate: if: github.event_name == 'pull_request' runs-on: ubuntu-latest env: TAVORA_URL: https://api.tavora.ai TAVORA_API_KEY: ${{ secrets.TAVORA_STAGING_API_KEY }} steps: - uses: actions/checkout@v4 - name: Install tavora CLI run: | curl -fsSL https://github.com/tavora-ai/tavora-cli/releases/latest/download/tavora-linux-amd64.tar.gz \ | tar -xz -C /usr/local/bin tavora - name: Validate run: tavora deploy --dry-run
deploy: if: github.event_name == 'push' runs-on: ubuntu-latest env: TAVORA_URL: https://api.tavora.ai TAVORA_API_KEY: ${{ secrets.TAVORA_PRODUCTION_API_KEY }} steps: - uses: actions/checkout@v4 - name: Install tavora CLI run: | curl -fsSL https://github.com/tavora-ai/tavora-cli/releases/latest/download/tavora-linux-amd64.tar.gz \ | tar -xz -C /usr/local/bin tavora - name: Deploy run: tavora deploy --run-evals--dry-run validates locally + against the server, but persists
nothing — perfect for PR feedback. --run-evals on the deploy job
fires the per-agent pinned suite after publishing; eval failures
surface in the CI log but don’t roll back the deploy.
GitLab CI
Section titled “GitLab CI”stages: [validate, deploy]
validate: stage: validate image: golang:1.22 variables: TAVORA_URL: https://api.tavora.ai TAVORA_API_KEY: $TAVORA_STAGING_API_KEY script: - go install github.com/tavora-ai/tavora-cli/cmd/tavora@latest - tavora deploy --dry-run only: refs: [merge_requests] changes: ['tavora/**/*']
deploy: stage: deploy image: golang:1.22 variables: TAVORA_URL: https://api.tavora.ai TAVORA_API_KEY: $TAVORA_PRODUCTION_API_KEY script: - go install github.com/tavora-ai/tavora-cli/cmd/tavora@latest - tavora deploy --run-evals only: refs: [main] changes: ['tavora/**/*']RAG smoke tests
Section titled “RAG smoke tests”The two rag-eval gates ship pre-built and exercise the runtime
end-to-end. Useful on a nightly schedule even when you haven’t
changed anything — they catch silent regressions in the underlying
provider’s behaviour.
on: schedule: - cron: '0 6 * * *'
jobs: rag: runs-on: ubuntu-latest env: TAVORA_URL: https://api.tavora.ai TAVORA_API_KEY: ${{ secrets.TAVORA_API_KEY }} GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }} steps: - run: | curl -fsSL https://github.com/tavora-ai/tavora-cli/releases/latest/download/tavora-linux-amd64.tar.gz \ | tar -xz -C /usr/local/bin tavora - run: tavora rag-eval formats --gate - run: tavora rag-eval judge --gate --pass-threshold 7rag-eval judge provisions a temp store, uploads fixture invoices,
asks structured questions, scores via Gemini, then (by default)
cleans up. --cleanup=false preserves the store for forensic
inspection later.
Exit codes
Section titled “Exit codes”| Code | Meaning |
|---|---|
0 | Validation passed / deploy succeeded / all eval cases passed the threshold. |
1 | Validation failed (schema error, broken ref, fatal --dry-run issue) or at least one eval case failed. |
2 | Run did not complete (timeout, server error, missing env var). |
CI runners treat anything non-zero as failure; the 1 vs 2
distinction lets your post-step hooks differentiate “agent quality
regression” from “infra broke” if you care.
- Use separate projects for staging and production. Mint two API keys, point validate at staging and deploy at production. Eval runs against production will bill against production usage.
- Pin a judge model.
--judge gemini-2.5-flash(or whichever) so the judge’s score-rubric drift is observable, not silent. - Path-filter your jobs. PRs that don’t touch
tavora/**paying a deploy-validate tax is friction without signal. - Capture the run ID on failure.
tavora evals runs get <run-id>prints the per-case judge reasoning. Drop the URL into the failed-CI message so reviewers click straight through.
Next steps
Section titled “Next steps”- CLI reference — the full subcommand catalogue.
- Tutorial: Build the tasklist agent — the end-to-end code-first flow this CI integrates with.