Protocol map

Codec runs on one client/gateway/engine triangle. The wire frame, the per-modality map, and the response headers shift per pathway — the triangle does not. The diagram below is the at-a-glance view; the normative spec is in PROTOCOL.md and pipeline math is in PIPELINES.md.

Codec protocol map showing three pathways: text-tokens (v0.2), MCP tool-calls with leaf-mode bypass, and v0.3 latents. Each pathway shows the client → gateway → engine flow with the wire-level headers and frame types annotated.
Source diagram lives in PROTOCOL.md as Mermaid blocks; the polished SVG above is a manual mirror that gets updated alongside the spec.

Headers common to every pathway (v0.4 floor)

Three header families apply uniformly across every pathway below — not specific to text-tokens, MCP, or latents. They're the v0.4 capability + enforcement floor, and the wire-compression negotiation that v0.4.1 normalised across all 6 clients.

  • Capability advertisement. Client sends Codec-Client-Version: 0.4 on every request; the server uses it to pick a graceful downgrade path. If the deployment's floor is higher than the client can satisfy, the server returns HTTP 426 with Codec-Min-Version + Codec-Required-Features (comma-separated feature names, e.g. safety-policy-enforcement, mandatory-classifier).
  • Compression negotiation. Client sends Accept-Encoding: zstd, br, gzip, identity; the server picks the smallest valid per the spec preference order zstd > br > gzip > identity. v0.4.1 fixed brotli's per-chunk-flush bug + landed dict-zstd decode across all 6 clients, so advertising the full menu is correct on every binding now. When Content-Encoding: zstd, the response also carries Codec-Zstd-Dict: sha256:<short> so the client picks the right local dict.
  • Safety-policy enforcement. Per-deployment opt-on axis — see the "Safety policies (v0.4)" pathway below for the full negotiation. When in force, the response carries Codec-Safety-Policy-Id + Codec-Safety-Policy-Hash and a server-side action surfaces as finish_reason: "policy_violation" inside the frame.

Full normative table: spec/versions/v0.4.md § Graceful downgrade; request-vs-response split: Protocol » Request vs response.

Text-tokens (v0.2)

The original wire shape. Engines emit CodecFrames of uint32 token IDs; the gateway forwards them; the client detokenizes locally. The Codec-Tokenizer-Map: <id> sha256:<short> response header identifies + hash-pins the active vocab so the client can verify before decoding (mismatch is fail-fast). The client may also resolve the map automatically from (origin, <id>) via /.well-known/codec/maps/ — see discovery.

See codec-sglang, codec-vllm, codec-llamacpp.

MCP tool-calls (leaf-mode bypass)

The fast path on a Codec-aware gateway: the tool itself tokenizes its result once, against a pinned tokenizer map, and attaches the IDs via a per-block _meta['ai.codec/leaf-tokenization'] payload (older sibling-block form still accepted by the reader for back-compat). The gateway recognizes the pre-tokenized output and bypasses its back-compat shim — the [Codec][leaf] log fires; [Codec][shim] warnings disappear. Non-Codec-aware tools in the same namespace still work via the shim path. Trade is a fixed ~210-byte _meta envelope per text block (text crossover at ~300+ chars) for a 12.4× consumer-side BPE-tokenize speedup — large MCP results win on both axes.

What's MCP-specific in the v0.4 / v0.4.1 wire: the gateway's Codec-Tokenizer-Map response header is the trust anchor the receiver compares the leaf payload's map_id against (mismatch → reject the result, KV-cache stays clean). codec-metamcp ships a pre-trained 16 KB MCP-shaped zstd dict by default and emits Codec-Zstd-Dict on every response so a Codec-aware client can decompress tools/list + tool-result bytes against it (~4.7× over JSON+gzip on real MCP traffic). v0.4.1 hardens the leaf reader with hash-validation before fetch (refuses a malformed mapHash instead of surfacing it later as "fetch failed") and lands dict-zstd interop across all 6 clients — the gateway can now safely rely on the wire-compression headers being honored uniformly. The generic v0.4 header floor above (capability / 426, compression negotiation, safety-policy enforcement) applies to MCP traffic the same way it applies to text-tokens and latents — not repeated here.

Two SDKs, two scopes for the tool-author side: codec-leaf wraps existing MCP servers (adds the _meta['ai.codec/leaf-tokenization'] payload on results); codec-tool-kit is the SDK for authoring net-new Codec-native tools as independently-hosted bolt-ons (manifest + build-time precache + runtime hashtable-lookup, gateway stays a pure token router). Both are valid; complementary in real deployments.

See codec-metamcp, codec-leaf, codec-tool-kit, @codecai/mcp-leaf, @codecai/tool-kit, codec-time-leaf reference image.

Latents (v0.3)

Image and video diffusion models stream VAE latents instead of decoded pixels. Wire shape: one LatentStreamHeader followed by one or N LatentFrames per the negotiated pipeline (one of seven: raw, int8, int4, int8-adaptive, int4-adaptive, delta+int8, delta+int4). The client runs vae_decode locally; pixels never touch the wire.

See codec-comfyui, codec-diffusers, PIPELINES.md.

Safety policies (v0.4)

A sixth HELLO/READY capability axis. Clients announce accept_safety_policies on HELLO; servers respond on READY with safety_policy_id + safety_policy_hash. The hash anchors to a sanitized descriptor served at /.well-known/codec/policies/<id>.json (and the content-addressed /sha256/<hex>.json sibling), which publishes the shape of enforcement (categories, action types, classifier family) but never operator-internal banned-token lists, regex patterns, or classifier thresholds. finish_reason: "policy_violation" surfaces when a server-side action fires. Non-breaking against v0.3 — older clients ignore the new optional fields and continue working.

See spec/versions/v0.4.md, @codecai/web-safety (browser-side prefilter + classifier registry), and codec-supervisor (server-side enforcement + admin app).