Why Briefcase

AI systems don’t just produce text — they make decisions that trigger actions: routing a support ticket, approving a request, choosing a tool, escalating to a human. When one of those decisions is wrong, “the model did it” is not an answer anyone can act on.

Briefcase is infrastructure for governing those decisions. It wraps the decision points in your application and gives you three things that are difficult to add or reconstruct after the fact:

Controls before action

Evaluate whether an action is allowed before it runs — deny-by-default, composable, and side-effect-free.

Full context, captured

Every decision is recorded with its inputs, outputs, model parameters, evidence, and the data it depended on.

A record you can verify

Replay decisions, reconstruct exactly what was known at the time, and seal it into a tamper-evident bundle.
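To make "tamper-evident" concrete, here is a minimal sketch of one common way to seal a sequence of records: a hash chain, in which each entry's digest also covers the previous entry's digest. It uses only the Python standard library. The names `seal` and `verify` and the record layout are hypothetical illustrations, not Briefcase's bundle format or API.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder digest that anchors the first entry


def _digest(prev_hash: str, record: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so equal records hash equally.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def seal(records: list[dict]) -> list[dict]:
    """Return entries where each hash covers the record and the previous hash."""
    entries, prev = [], GENESIS
    for record in records:
        current = _digest(prev, record)
        entries.append({"record": record, "prev": prev, "hash": current})
        prev = current
    return entries


def verify(entries: list[dict]) -> bool:
    """Recompute the chain; an edited or reordered entry makes this return False."""
    prev = GENESIS
    for entry in entries:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


# Hypothetical records for the two triage decisions.
bundle = seal([
    {"decision": "classify_ticket", "output": "billing"},
    {"decision": "route", "output": "billing-queue"},
])
```

A chain like this detects edits and reordering, but not truncation at the end; catching that would require keeping the final hash somewhere outside the bundle.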

The questions Briefcase lets you answer

When a decision is challenged — by a teammate, an incident review, or a customer — you need to answer, precisely and after the fact:

What did the system decide, and what did it see? The inputs, outputs, and confidence behind the call.
What rule governed it? The exact policy version that was in effect at the decision’s moment — not today’s policy.
Did the controls run first? Proof that a guardrail evaluated the action before anything happened.
What did we know at the time? The evidence and external data as they were then — corrections appended, never overwritten.
Can we reproduce it? A deterministic replay that compares the original output against a fresh run.
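Two of these questions (which policy version was in effect, and what was known at the time) come down to an as-of lookup over append-only history. The sketch below shows the idea in plain Python. `AppendOnlyHistory`, its methods, and the integer timestamps are illustrative assumptions, not Briefcase's storage API, and it tracks only one time axis rather than full bitemporal history.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class AppendOnlyHistory:
    """Values are only ever appended, each with the time it took effect."""
    _times: list[int] = field(default_factory=list)
    _values: list[Any] = field(default_factory=list)

    def append(self, effective_at: int, value: Any) -> None:
        # Refuse out-of-order writes so earlier history is never rewritten.
        if self._times and effective_at <= self._times[-1]:
            raise ValueError("history is append-only; timestamps must increase")
        self._times.append(effective_at)
        self._values.append(value)

    def as_of(self, when: int) -> Optional[Any]:
        """Return the value in effect at `when`, or None if nothing existed yet."""
        index = bisect_right(self._times, when)
        return self._values[index - 1] if index else None


policy = AppendOnlyHistory()
policy.append(100, "routing-policy v1")
policy.append(200, "routing-policy v2")  # a later revision; v1 stays on record

decision_time = 150  # hypothetical timestamp of the challenged decision
in_effect = policy.as_of(decision_time)
```

Here a decision made at time 150 resolves to `routing-policy v1`, even though v2 is the current version.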

How it works: five acts

Briefcase organizes around the lifecycle of a single decision. The rest of these docs follow the same five acts, and a single running example threads through all of them: a support-ticket triage agent. Each ticket it handles produces two decisions you’ll see throughout: it classifies the ticket (the `classify_ticket` call in most examples) and routes it to a queue. Both are decisions Briefcase captures, governs, and can replay.

graph LR
    A["Capture<br/>record inputs, outputs,<br/>context, evidence"] --> B["Control<br/>enforce guardrails &<br/>versioned policy"]
    B --> C["Store & Query<br/>durable, append-only,<br/>queryable trail"]
    C --> D["Replay & Verify<br/>re-run, compare,<br/>detect drift"]
    D --> E["Prove<br/>reconstruct as-of &<br/>seal an audit bundle"]

| Act | What you do | Key building blocks |
| --- | --- | --- |
| Capture | Record every decision with full context | `@capture`, `DecisionSnapshot`, exporters, PII sanitization |
| Control | Enforce controls before the action runs | Guardrails, routing, versioned routing policy, validation |
| Store & Query | Keep a durable, queryable, append-only trail | Storage adapters, bitemporal storage, external data, RAG versioning |
| Replay & Verify | Re-run and check decisions hold up | Deterministic replay, drift detection, audit bundles |
| Prove | Reconstruct and verify after the fact | As-of reconstruction, `ExaminerBundle` |
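As a rough picture of how the first acts fit together for the triage example, the sketch below records a decision, checks a deny-by-default guardrail before routing, and replays the recorded inputs to compare outputs. It is plain Python and does not use the Briefcase SDK. `capture_decision`, `allowed_to_route`, `replay`, and the keyword-based body of `classify_ticket` are hypothetical stand-ins; the real `@capture` decorator and `DecisionSnapshot` may look quite different.

```python
from functools import wraps
from typing import Callable

TRAIL: list[dict] = []  # stand-in for a durable, append-only store


def capture_decision(fn: Callable) -> Callable:
    """Hypothetical stand-in for a capture decorator: log inputs and output."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        output = fn(*args, **kwargs)
        TRAIL.append({"decision": fn.__name__, "args": args,
                      "kwargs": kwargs, "output": output})
        return output
    return wrapper


@capture_decision
def classify_ticket(text: str) -> str:
    # Toy deterministic classifier standing in for a model call.
    return "billing" if "invoice" in text.lower() else "general"


ALLOWED_QUEUES = {"billing", "general"}


def allowed_to_route(queue: str) -> bool:
    # Deny by default: only queues listed explicitly are permitted.
    return queue in ALLOWED_QUEUES


def replay(record: dict, fn: Callable) -> bool:
    """Re-run the recorded inputs and report whether the output still matches."""
    return fn(*record["args"], **record["kwargs"]) == record["output"]


label = classify_ticket("Question about my invoice")
routed = label if allowed_to_route(label) else None
# __wrapped__ (set by functools.wraps) replays without appending a new record.
matches = replay(TRAIL[0], classify_ticket.__wrapped__)
```

The guardrail here is a pure function with no side effects, so it can be evaluated before the routing action without changing any state.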

Who Briefcase is for

I use an AI coding assistant

Let your AI editor add Briefcase for you: point it at the docs or give it the MCP tools. Start with AI-Assisted Setup.

Engineers

Instrument a decision point in minutes, send records anywhere, and replay to catch regressions. Start with the Quickstart.

Platform & governance leads

Define controls that run before actions, route through versioned policies, and prove which rule was in effect. Start with Guardrails.

Reproducibility & audit reviewers

Reconstruct past decisions exactly and verify a sealed, tamper-evident record. Start with Audit Bundles.

Where it runs

Briefcase is an open-source Python SDK (with a Rust core) that wraps the decision points in code you already have. It is independent of model, vendor, and framework: bring your own LLM calls and storage. Install the base package with `pip install briefcase-ai`; optional capabilities are installed as extras.

Next steps

Quickstart

Record, persist, and replay your first decision in about 5 minutes.

Core Concepts

The object model behind every decision: snapshots, inputs, outputs, evidence.

Audit a Decision End-to-End

Follow one decision from capture all the way to a verifiable sealed record.