Headers common to every pathway (v0.4 floor)
Three header families apply uniformly across every pathway below
— not specific to text-tokens, MCP, or latents. They're the
v0.4 capability + enforcement floor, and the wire-compression
negotiation that v0.4.1 normalised across all 6 clients.
- Capability advertisement. Client sends
Codec-Client-Version: 0.4 on every request; the
server uses it to pick a graceful downgrade path. If the
deployment's floor is higher than the client can satisfy, the
server returns HTTP 426 with
Codec-Min-Version + Codec-Required-Features
(comma-separated feature names, e.g.
safety-policy-enforcement, mandatory-classifier).
- Compression negotiation. Client sends
Accept-Encoding: zstd, br, gzip, identity; the server
picks the smallest valid per the spec preference order
zstd > br > gzip > identity. v0.4.1 fixed
brotli's per-chunk-flush bug + landed dict-zstd decode across all
6 clients, so advertising the full menu is correct on every
binding now. When Content-Encoding: zstd, the response
also carries Codec-Zstd-Dict: sha256:<short> so
the client picks the right local dict.
- Safety-policy enforcement. Per-deployment opt-on
axis — see the "Safety policies (v0.4)" pathway below for
the full negotiation. When in force, the response carries
Codec-Safety-Policy-Id +
Codec-Safety-Policy-Hash and a server-side action
surfaces as finish_reason: "policy_violation" inside
the frame.
Full normative table: spec/versions/v0.4.md § Graceful downgrade;
request-vs-response split: Protocol » Request vs response.
Text-tokens (v0.2)
The original wire shape. Engines emit CodecFrames of
uint32 token IDs; the gateway forwards them; the client detokenizes
locally. The Codec-Tokenizer-Map: <id> sha256:<short>
response header identifies + hash-pins the active vocab so the
client can verify before decoding (mismatch is fail-fast).
The client may also resolve the map automatically from
(origin, <id>) via
/.well-known/codec/maps/ — see
discovery.
See codec-sglang,
codec-vllm,
codec-llamacpp.
MCP tool-calls (leaf-mode bypass)
The fast path on a Codec-aware gateway: the tool itself tokenizes its
result once, against a pinned tokenizer map, and attaches the IDs
via a per-block _meta['ai.codec/leaf-tokenization']
payload (older sibling-block form still accepted by the reader for
back-compat). The gateway recognizes the pre-tokenized output and
bypasses its back-compat shim — the [Codec][leaf]
log fires; [Codec][shim] warnings disappear.
Non-Codec-aware tools in the same namespace still work via the
shim path. Trade is a fixed ~210-byte _meta envelope
per text block (text crossover at ~300+ chars) for a 12.4×
consumer-side BPE-tokenize speedup — large MCP results win
on both axes.
What's MCP-specific in the v0.4 / v0.4.1 wire:
the gateway's Codec-Tokenizer-Map response header is
the trust anchor the receiver compares the leaf payload's
map_id against (mismatch → reject the result,
KV-cache stays clean). codec-metamcp ships a pre-trained
16 KB MCP-shaped zstd dict by default and
emits Codec-Zstd-Dict on every response so a
Codec-aware client can decompress tools/list +
tool-result bytes against it (~4.7× over JSON+gzip on real
MCP traffic). v0.4.1 hardens the leaf reader with hash-validation
before fetch (refuses a malformed mapHash instead of
surfacing it later as "fetch failed") and lands dict-zstd interop
across all 6 clients — the gateway can now safely rely on the
wire-compression headers being honored uniformly. The generic
v0.4 header floor above (capability / 426, compression negotiation,
safety-policy enforcement) applies to MCP traffic the same way it
applies to text-tokens and latents — not repeated here.
Two SDKs, two scopes for the tool-author side:
codec-leaf wraps existing
MCP servers (adds the _meta['ai.codec/leaf-tokenization']
payload on results); codec-tool-kit
is the SDK for authoring net-new Codec-native tools
as independently-hosted bolt-ons (manifest + build-time
precache + runtime hashtable-lookup, gateway stays a pure
token router). Both are valid; complementary in real deployments.
See codec-metamcp,
codec-leaf,
codec-tool-kit,
@codecai/mcp-leaf,
@codecai/tool-kit,
codec-time-leaf reference image.
Latents (v0.3)
Image and video diffusion models stream VAE latents instead of decoded
pixels. Wire shape: one LatentStreamHeader followed by
one or N LatentFrames per the negotiated pipeline (one
of seven: raw, int8, int4,
int8-adaptive, int4-adaptive,
delta+int8, delta+int4). The client runs
vae_decode locally; pixels never touch the wire.
See codec-comfyui,
codec-diffusers,
PIPELINES.md.
Safety policies (v0.4)
A sixth HELLO/READY capability axis. Clients announce
accept_safety_policies on HELLO; servers respond on READY
with safety_policy_id + safety_policy_hash.
The hash anchors to a sanitized descriptor served at
/.well-known/codec/policies/<id>.json (and the
content-addressed /sha256/<hex>.json sibling), which
publishes the shape of enforcement (categories, action types,
classifier family) but never operator-internal banned-token lists,
regex patterns, or classifier thresholds. finish_reason:
"policy_violation" surfaces when a server-side action fires.
Non-breaking against v0.3 — older clients ignore the new optional
fields and continue working.
See spec/versions/v0.4.md,
@codecai/web-safety (browser-side prefilter + classifier registry),
and codec-supervisor (server-side enforcement + admin app).