What's New in 3.3.0
briefcase-ai v3.3.0 launches oci-bai, an artifact graph for tracking every model,
fine-tune, dataset, and runtime you push; the recent 3.2.x line made cost estimates match
how you actually buy inference. Both are summarized below, newest first. The decision, replay,
and cost APIs are fully backward compatible — every new parameter is keyword-only, so existing
calls behave identically. See the full Changelog for details.
3.3.0 — oci-bai artifact graph
oci-bai tracks every image you push through an OCI-compatible gateway in a single artifact
graph — with lineage, provenance, deduplication, and search built in. Push with any OCI tool
(docker push, crane); the graph builds itself.
docker tag my-model:latest localhost:8080/my-repo:v1docker push localhost:8080/my-repo:v1
oci-bai --repo my-repo log v1oci-bai --repo my-repo diff base v1 --depth packageoci-bai search "format==safetensors cuda>=12.4"Why it matters: every fine-tune and dataset version is tracked, searchable, and linked to its parent — no manual bookkeeping. Weight-sharing metrics tell you whether a push was a re-tag, a partial fine-tune, or a full retrain.
oci-bai is in private beta — contact support@briefcaseai.org to request access.
The full documentation lives at oci.briefcaseai.io. For an overview of the CLI and capabilities, see Artifact Graph & Evaluate.
3.2.x — Cost & pricing
Price any platform with rate cards
CostCalculator.estimate_cost takes an optional rate_card — a forgiving
platform × tier × modifiers string — so an estimate reflects the platform and
tier you actually run on, not just first-party list price.
from briefcase.cost import CostCalculator
calc = CostCalculator()
# First-party standard pricing (unchanged default)standard = calc.estimate_cost("claude-opus-4-8", 500_000, 50_000)
# Same call, priced for the AWS Bedrock batch tierbatch = calc.estimate_cost("claude-opus-4-8", 500_000, 50_000, rate_card="bedrock:batch")
print(standard.total_cost, batch.total_cost)print(calc.get_available_rate_cards())| Part | Values | Effect |
|---|---|---|
| Platform | first_party · bedrock · vertex · azure | Selects the provider’s price sheet |
| Tier | standard · batch · cached · priority · flex | batch/flex ≈ 0.5×; priority is a premium |
| Modifiers | regional · us · fast | Regional/residency add ~10% |
Why it matters: the same classify_ticket workload costs very differently on
Bedrock batch versus first-party standard. Rate cards let you compare the real
number before you ship a platform or tier change.
Prompt-cache billing
Anthropic prompt caching changes the math: cache reads are billed at a fraction
of the input rate. estimate_cost accepts cache-token counts and exposes a
cache_cost on the estimate.
estimate = calc.estimate_cost( "claude-opus-4-8", input_tokens=0, output_tokens=1_000, cache_read_tokens=100_000, # also: cache_write_5m_tokens, cache_write_1h_tokens)print(estimate.cache_cost, estimate.total_cost)Why it matters: a cache-heavy agent’s bill is dominated by cache reads at 0.1× input — now your estimate reflects that instead of overcounting.
Latest model pricing
The default pricing table covers the current frontier: Anthropic Claude 4.x
(claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5, …), OpenAI GPT-5.x
(gpt-5.5, gpt-5.4-mini, …), and Google Gemini (gemini-3.1-pro,
gemini-2.5-flash, …). Every previously priced model is retained.
Why it matters: you can estimate and compare today’s models without hand-maintaining a price sheet.
Wider Python support
A single stable-ABI wheel per platform installs on Python 3.9–3.13, and
the source distribution bundles its license files. Why it matters: pip install briefcase-ai works across more environments without building from source.