Observe AI in Production
base, drift, otel, events. For: engineers & operators
Once the triage agent is live, you need to see what it is doing — cheaply and continuously. This guide layers Briefcase’s observability tools onto the same classify_ticket function: emit records, track spend, detect drift, trace, and alert.
    pip install briefcase-ai[drift,otel,events]
Emit every decision in one line
observe() wires up an exporter so captured decisions actually go somewhere. Use "console" in development, a .jsonl path for log shipping, or "memory" in tests.

    import briefcase

    briefcase.observe("decisions.jsonl")  # append-only, thread-safe

    @briefcase.capture(decision_type="ticket-classification")
    def classify_ticket(text: str) -> str:
        # call your model here
        return "billing"
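To make "append-only, thread-safe" concrete, the sketch below shows one way such a JSONL sink can behave, using only the standard library. It is an illustration of the idea, not Briefcase's exporter: the JsonlSink class, the read_records helper, and the record fields (id, output) are all hypothetical.

```python
import json
import tempfile
import threading
from pathlib import Path


class JsonlSink:
    """Hypothetical stand-in for a JSONL exporter; not Briefcase's implementation."""

    def __init__(self, path):
        self._path = Path(path)
        self._lock = threading.Lock()

    def write(self, record: dict) -> None:
        line = json.dumps(record, sort_keys=True)
        # The lock keeps concurrent writers from interleaving partial lines;
        # mode "a" only ever appends, so earlier records are never rewritten.
        with self._lock, self._path.open("a", encoding="utf-8") as fh:
            fh.write(line + "\n")


def read_records(path):
    """Parse one JSON object per non-empty line."""
    with Path(path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Write from several threads, then read everything back.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "decisions.jsonl"
    sink = JsonlSink(path)
    threads = [
        # Record fields here are made up for the example.
        threading.Thread(target=sink.write, args=({"id": f"dec-{i}", "output": "billing"},))
        for i in range(8)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    records = read_records(path)

print(len(records))
```

Because each record is a complete line, a log shipper can tail the file and forward records without parsing partial JSON.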
Watch cost against a budget
CostCalculator estimates per-call cost from token counts and checks spend against a limit. The cost types ship in the base package.

    from briefcase.cost import CostCalculator

    calc = CostCalculator()
    estimate = calc.estimate_cost("gpt-4o-mini", input_tokens=1000, output_tokens=200)
    print(estimate.total_cost, estimate.currency)

    budget = calc.check_budget(current_spend=85.0, budget_limit=100.0)
    print(budget.status, budget.alert_message)  # e.g. "warning", "..."
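The arithmetic behind a token-based estimate and a budget check is simple enough to sketch by hand. The example below is a minimal standard-library illustration, not Briefcase's CostCalculator: the price table, the "example-model" name, the 80% warning threshold, and the status strings are all assumptions chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical per-million-token prices in USD, for illustration only.
# Real prices vary by provider and change over time.
PRICES_PER_MILLION = {"example-model": {"input": 0.50, "output": 1.50}}


@dataclass
class Estimate:
    input_cost: float
    output_cost: float

    @property
    def total_cost(self) -> float:
        return self.input_cost + self.output_cost


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Estimate:
    """Cost = tokens / 1,000,000 * price per million, per direction."""
    price = PRICES_PER_MILLION[model]
    return Estimate(
        input_cost=input_tokens / 1_000_000 * price["input"],
        output_cost=output_tokens / 1_000_000 * price["output"],
    )


def budget_status(current_spend: float, budget_limit: float, warn_at: float = 0.8) -> str:
    """Return "ok", "warning", or "exceeded" from the fraction of budget used.

    The warn_at threshold is an assumption, not a documented Briefcase default.
    """
    if budget_limit <= 0:
        raise ValueError("budget_limit must be positive")
    used = current_spend / budget_limit
    if used >= 1.0:
        return "exceeded"
    if used >= warn_at:
        return "warning"
    return "ok"


estimate = estimate_cost("example-model", input_tokens=1000, output_tokens=200)
status = budget_status(current_spend=85.0, budget_limit=100.0)
print(round(estimate.total_cost, 6), status)
```

With these made-up prices, 1,000 input tokens and 200 output tokens cost a fraction of a cent, and spending 85 of a 100 budget crosses the assumed 80% warning line.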
Measure drift across repeated runs
Sample the same decision over time and ask how consistent it stays. A falling consistency_score is your signal that behavior is shifting.

    from briefcase.drift import DriftCalculator

    calc = DriftCalculator().with_similarity_threshold(0.9)
    metrics = calc.calculate_drift(["billing", "billing", "account", "billing"])
    print(metrics.consistency_score, metrics.agreement_rate)
    print(metrics.consensus_output, metrics.outliers)
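One plausible reading of these metrics, for intuition only: treat the most common output as the consensus, the share of runs matching it as the agreement rate, and everything else as outliers. The sketch below implements that reading with exact string matching. It is an assumption about what the numbers mean, not Briefcase's algorithm, which takes a similarity threshold and may score outputs differently; agreement_summary is a hypothetical helper.

```python
from collections import Counter


def agreement_summary(outputs):
    """Summarize how often repeated runs agree with the most common output."""
    if not outputs:
        raise ValueError("need at least one output")
    counts = Counter(outputs)
    # most_common(1) returns the single most frequent (value, count) pair.
    consensus, consensus_count = counts.most_common(1)[0]
    return {
        "consensus_output": consensus,
        "agreement_rate": consensus_count / len(outputs),
        "outliers": [o for o in outputs if o != consensus],
    }


summary = agreement_summary(["billing", "billing", "account", "billing"])
print(summary)
# {'consensus_output': 'billing', 'agreement_rate': 0.75, 'outliers': ['account']}
```

Tracking a number like this per time window, and alerting when it falls, is the kind of check the drift metrics are meant to support.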
Trace alongside your existing telemetry
get_tracer() returns a standard OpenTelemetry tracer. Spans describe the timeline of work; decision records carry the governance context. The two are complementary, and both flow to your collectors.

    from briefcase.otel import get_tracer

    tracer = get_tracer("briefcase")
    with tracer.start_as_current_span("classify_ticket"):
        classify_ticket("My invoice is wrong")
Fire events when something looks off
Turn signals into action. The emit helpers are coroutines, so await them inside an async context. They are ideal for low-confidence outputs or detected drift.

    import asyncio

    from briefcase.events import emit_low_confidence, emit_drift_detected

    async def main():
        await emit_low_confidence({"id": "dec-1"}, confidence=0.4, threshold=0.7)
        await emit_drift_detected({"id": "dec-1"}, {"drift_score": 0.3})

    asyncio.run(main())
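In practice you usually emit only when a threshold is crossed. The sketch below shows that gating pattern with a stand-in emitter so it runs without Briefcase installed. The emit, check_decision, and run_checks functions and the event shape are hypothetical; swap in the real emit helpers where the stand-in is called.

```python
import asyncio


async def emit(event_type: str, payload: dict, sink: list) -> None:
    """Stand-in emitter (hypothetical, not Briefcase's API): records events in a list."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    sink.append({"type": event_type, **payload})


async def check_decision(decision: dict, threshold: float, sink: list) -> bool:
    """Emit a low-confidence event only when confidence is below the threshold."""
    if decision["confidence"] < threshold:
        payload = {
            "id": decision["id"],
            "confidence": decision["confidence"],
            "threshold": threshold,
        }
        await emit("low_confidence", payload, sink)
        return True
    return False


async def run_checks(decisions, threshold=0.7):
    sink = []
    # gather runs the checks concurrently on a single event loop.
    await asyncio.gather(*(check_decision(d, threshold, sink) for d in decisions))
    return sink


events = asyncio.run(
    run_checks(
        [
            {"id": "dec-1", "confidence": 0.4},
            {"id": "dec-2", "confidence": 0.9},
        ]
    )
)
print(events)
```

Only dec-1 falls below the 0.7 threshold, so only one event is recorded.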