
What's New in 3.3.0

briefcase-ai v3.3.0 launches oci-bai, an artifact graph for tracking every model, fine-tune, dataset, and runtime you push; the recent 3.2.x line made cost estimates match how you actually buy inference. Both are summarized below, newest first. The decision, replay, and cost APIs are fully backward compatible — every new parameter is keyword-only, so existing calls behave identically. See the full Changelog for details.

3.3.0 — oci-bai artifact graph

oci-bai tracks every image you push through an OCI-compatible gateway in a single artifact graph — with lineage, provenance, deduplication, and search built in. Push with any OCI tool (docker push, crane); the graph builds itself.

docker tag my-model:latest localhost:8080/my-repo:v1
docker push localhost:8080/my-repo:v1
oci-bai --repo my-repo log v1
oci-bai --repo my-repo diff base v1 --depth package
oci-bai search "format==safetensors cuda>=12.4"

Why it matters: every fine-tune and dataset version is tracked, searchable, and linked to its parent — no manual bookkeeping. Weight-sharing metrics tell you whether a push was a re-tag, a partial fine-tune, or a full retrain.
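
To make the weight-sharing idea concrete, here is a minimal sketch of how a push could be classified from layer digests. It is not oci-bai's actual metric: the function name `classify_push`, the digest values, and the rule (compare the set of layer digests in the parent and child images) are all assumptions made for illustration.

```python
def classify_push(parent_layers, child_layers):
    """Hypothetical classifier: compare layer digests of a parent and child image.

    Returns (shared_fraction, label), where shared_fraction is the share of
    the child's layers that already exist in the parent.
    """
    parent, child = set(parent_layers), set(child_layers)
    if not child:
        raise ValueError("child image has no layers")

    shared_fraction = len(parent & child) / len(child)

    if child == parent:
        label = "re-tag"             # identical layer sets: nothing new was pushed
    elif not parent & child:
        label = "full retrain"       # no layer is reused from the parent
    else:
        label = "partial fine-tune"  # some layers reused, some replaced or added
    return shared_fraction, label


# Placeholder digests, not real image layers.
base = ["sha256:aaa", "sha256:bbb", "sha256:ccc", "sha256:ddd"]
tuned = ["sha256:aaa", "sha256:bbb", "sha256:ccc", "sha256:eee"]

print(classify_push(base, base))
print(classify_push(base, tuned))
print(classify_push(base, ["sha256:xxx", "sha256:yyy"]))
```

In this sketch, three of the four layers in `tuned` are shared with `base`, so the push counts as a partial fine-tune with a shared fraction of 0.75.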

oci-bai is in private beta — contact support@briefcaseai.org to request access.

The full documentation lives at oci.briefcaseai.io. For an overview of the CLI and capabilities, see Artifact Graph & Evaluate.

3.2.x — Cost & pricing

Price any platform with rate cards

CostCalculator.estimate_cost takes an optional rate_card — a forgiving platform × tier × modifiers string — so an estimate reflects the platform and tier you actually run on, not just first-party list price.

from briefcase.cost import CostCalculator
calc = CostCalculator()
# First-party standard pricing (unchanged default)
standard = calc.estimate_cost("claude-opus-4-8", 500_000, 50_000)
# Same call, priced for the AWS Bedrock batch tier
batch = calc.estimate_cost("claude-opus-4-8", 500_000, 50_000, rate_card="bedrock:batch")
print(standard.total_cost, batch.total_cost)
print(calc.get_available_rate_cards())
| Part | Values | Effect |
| --- | --- | --- |
| Platform | first_party · bedrock · vertex · azure | Selects the provider’s price sheet |
| Tier | standard · batch · cached · priority · flex | batch/flex ≈ 0.5×; priority is a premium |
| Modifiers | regional · us · fast | Regional/residency add ~10% |

Why it matters: the same classify_ticket workload costs very differently on Bedrock batch versus first-party standard. Rate cards let you compare the real number before you ship a platform or tier change.
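
As a rough illustration of how a rate-card string could map to a price multiplier, the sketch below parses a colon-separated string and applies only the effects quantified in the table above. It does not use or reproduce the library's implementation. The helper names `parse_rate_card` and `price_multiplier`, the order-insensitive parsing, and the reading of `us` as the residency modifier are assumptions.

```python
PLATFORMS = {"first_party", "bedrock", "vertex", "azure"}
TIERS = {"standard", "batch", "cached", "priority", "flex"}
MODIFIERS = {"regional", "us", "fast"}

# Only the effects quantified in the table. The priority premium and the
# cached discount are not given as numbers, so this sketch leaves them at 1.0.
TIER_MULTIPLIER = {"batch": 0.5, "flex": 0.5}
# Assumption: "regional" and "us" are the regional/residency modifiers (~10%).
MODIFIER_MULTIPLIER = {"regional": 1.10, "us": 1.10}


def parse_rate_card(rate_card):
    """Hypothetical parser: accept parts in any order and any letter case."""
    platform, tier, modifiers = "first_party", "standard", []
    for part in rate_card.lower().replace(" ", "").split(":"):
        if not part:
            continue  # tolerate stray or doubled colons
        if part in PLATFORMS:
            platform = part
        elif part in TIERS:
            tier = part
        elif part in MODIFIERS:
            modifiers.append(part)
        else:
            raise ValueError(f"unknown rate-card part: {part!r}")
    return platform, tier, modifiers


def price_multiplier(rate_card):
    """Combine the tier and modifier multipliers for a rate-card string."""
    _, tier, modifiers = parse_rate_card(rate_card)
    multiplier = TIER_MULTIPLIER.get(tier, 1.0)
    for modifier in modifiers:
        multiplier *= MODIFIER_MULTIPLIER.get(modifier, 1.0)
    return multiplier


print(parse_rate_card("bedrock:batch"))
print(price_multiplier("bedrock:batch"))
print(price_multiplier("bedrock:batch:regional"))
```

The multiplier here is relative to the selected platform's own price sheet. Platforms can list different base prices, which this sketch does not model.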

Prompt-cache billing

Anthropic prompt caching changes the math: cache reads are billed at a fraction of the input rate. estimate_cost accepts cache-token counts and exposes a cache_cost on the estimate.

estimate = calc.estimate_cost(
    "claude-opus-4-8",
    input_tokens=0,
    output_tokens=1_000,
    cache_read_tokens=100_000,  # also: cache_write_5m_tokens, cache_write_1h_tokens
)
print(estimate.cache_cost, estimate.total_cost)

Why it matters: a cache-heavy agent’s bill is dominated by cache reads at 0.1× input — now your estimate reflects that instead of overcounting.
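
The arithmetic behind that claim can be sketched without the library. The per-million-token rates below are placeholders chosen for easy math, not real prices, and `sketch_estimate` is a hypothetical helper. Only the 0.1× cache-read multiplier comes from the text above.

```python
def sketch_estimate(input_tokens, output_tokens, cache_read_tokens,
                    input_rate, output_rate, cache_read_multiplier=0.1):
    """Hypothetical helper. Rates are USD per million tokens.

    Cache reads are billed at cache_read_multiplier times the input rate.
    Returns (cache_cost, total_cost).
    """
    input_per_token = input_rate / 1_000_000
    output_per_token = output_rate / 1_000_000
    cache_cost = cache_read_tokens * input_per_token * cache_read_multiplier
    total_cost = (input_tokens * input_per_token
                  + output_tokens * output_per_token
                  + cache_cost)
    return cache_cost, total_cost


# Placeholder rates: $10 per million input tokens, $50 per million output tokens.
RATES = dict(input_rate=10.0, output_rate=50.0)

# Cache-aware: 100k tokens read from cache, 1k tokens generated.
cache_cost, cached_total = sketch_estimate(0, 1_000, 100_000, **RATES)

# Naive: the same 100k tokens counted as ordinary input.
_, naive_total = sketch_estimate(100_000, 1_000, 0, **RATES)

print(cache_cost, cached_total, naive_total)
```

With these placeholder rates the cache-aware total is roughly $0.15, against roughly $1.05 when the cached tokens are counted as ordinary input. That gap is the overcounting the estimate now avoids.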

Latest model pricing

The default pricing table covers the current frontier: Anthropic Claude 4.x (claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5, …), OpenAI GPT-5.x (gpt-5.5, gpt-5.4-mini, …), and Google Gemini (gemini-3.1-pro, gemini-2.5-flash, …). Every previously priced model is retained.

Why it matters: you can estimate and compare today’s models without hand-maintaining a price sheet.

Wider Python support

A single stable-ABI wheel per platform installs on Python 3.9–3.13, and the source distribution bundles its license files.

Why it matters: pip install briefcase-ai works across more environments without building from source.
