| Internet-Draft | AFiR SPICE Profile | June 2026 |
| Rotzin | Expires 14 December 2026 | [Page] |
This document defines AFiR (Attested Fragmented Inference Routing) as a production profile of the IETF SPICE Inference Chain specification [I-D.draft-mw-spice-inference-chain].¶
The SPICE Inference Chain defines computational provenance via two mechanisms: Zero-Knowledge Machine Learning (ZKML) proofs and Trusted Execution Environment (TEE) attestation quotes. ZKML incurs significant proof generation latency, and TEE attestation requires specialized hardware. Neither is deployable today in commodity serverless inference environments without infrastructure changes.¶
AFiR defines a third proof type -- post-quantum digital signature attestation using ML-DSA-65 (NIST FIPS 204) -- that is deployable on any inference platform, requires no specialized hardware, adds 0.785ms of overhead per fragment, and produces a 384-byte receipt anchored on a public blockchain. AFiR receipts are structurally compatible with the SPICE Inference Chain Merkle tree and can coexist with ZKML and TEE entries in the same session chain.¶
AFiR extends the SPICE inference chain with five concrete production primitives: Signed Tool Calls (P1), Cross-Agent Receipt Trees (P2), KV Cache Signing (P3), Model Manifest attestation (P4), and a Crypto-Agile Signature Layer (P5). All five are deployed and serving production traffic as of June 2026, making AFiR the first production implementation of the SPICE inference_root claim for multi-agent pipelines.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 3 December 2026.¶
Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
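The SPICE Inference Chain [I-D.draft-mw-spice-inference-chain] defines two proof types for computational provenance: Zero-Knowledge Machine Learning (ZKML) proofs and Trusted Execution Environment (TEE) attestation quotes.¶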
The practical effect is that the SPICE Inference Chain, as currently defined, cannot be adopted in commodity cloud environments (serverless functions, container-based inference runtimes, shared GPU pools) without either accepting ZKML latency incompatible with real-time serving, or deploying specialized hardware unavailable in most production inference clouds. This leaves the majority of production AI inference volume outside the scope of any SPICE-conformant inference attestation.¶
AFiR addresses this gap by defining a third proof type: post-quantum digital signature attestation using ML-DSA-65 (NIST FIPS 204 [FIPS204]).¶
A post-quantum signature attestation makes the following proof statement:¶
"Agent A, at timestamp T, signed a commitment over (input_hash, output_hash, model_id, tool_name, session_id) using ML-DSA-65 with key K. Key K is registered and publicly verifiable. The signature is unforgeable under standard lattice hardness assumptions (Module Learning With Errors, MLWE). A cryptographic receipt anchored on Base Mainnet via USDC provides a tamper-evident timestamp independent of any single party's infrastructure."¶
This proof type does not require:¶
AFiR is in production as of June 2026, operating on serverless infrastructure. All five primitives defined in this document are deployed, smoke-tested, and serving live traffic.¶
This document is a companion to, not a replacement of:¶
AFiR receipt entries are structurally compatible with the SPICE inference chain Merkle tree and MAY coexist with ZKML and TEE entries in the same session's inference chain.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
AFiR uses ML-DSA-65 as its primary signature algorithm. ML-DSA-65 is a parameter set of ML-DSA, the lattice-based post-quantum digital signature algorithm standardized by NIST (FIPS 204, August 2024), providing:¶
The signed message for each AFiR receipt is the SHA-256 hash of the canonical JSON serialization [RFC8785] of the receipt payload fields: input_hash, output_hash, model_id, model_fingerprint, tool_name (if applicable), session_id, iat, nullifier.¶
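The digest computation described above can be sketched as follows. This is a minimal illustration, not the AFiR implementation: `canonicalize` and `signing_digest` are hypothetical names, and the serializer only approximates RFC 8785 for payloads whose keys are ASCII and whose values are strings and integers. The field values are placeholders in the style of the example entry in this document.

```python
import hashlib
import json

# Payload fields named in the text; tool_name is present only when applicable.
PAYLOAD_FIELDS = (
    "input_hash", "output_hash", "model_id", "model_fingerprint",
    "tool_name", "session_id", "iat", "nullifier",
)


def canonicalize(payload: dict) -> bytes:
    """Approximate RFC 8785 (JCS): sorted keys, no whitespace, UTF-8 output.

    Assumption: values are strings or integers only. A full JCS implementation
    also fixes number formatting and string escaping rules.
    """
    return json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def signing_digest(entry: dict) -> bytes:
    """SHA-256 over the canonical form of the payload fields present in entry."""
    payload = {k: entry[k] for k in PAYLOAD_FIELDS if k in entry}
    return hashlib.sha256(canonicalize(payload)).digest()


entry = {
    "type": "afir_pq_signature",       # not part of the signed payload
    "input_hash": "sha256:b7c2...",    # placeholder values
    "output_hash": "sha256:d4e1...",
    "model_id": "claude-opus-4-20260401",
    "model_fingerprint": "sha256:a3f9...",
    "session_id": "sess-uuid-12345",
    "iat": 1749780000,
    "nullifier": "8a3f2c91b0e74d56a1f3c8b2e9d07f4a...",
}
digest = signing_digest(entry)  # 32-byte message handed to the signer
```

Because keys are sorted before hashing, two parties that hold the same field values derive the same digest regardless of the order in which the fields were assembled.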
AFiR measured performance on commodity serverless infrastructure (2026):¶
These measurements are from production traffic and represent the overhead of the complete AFiR signing pipeline including on-chain anchoring.¶
AFiR anchors the Merkle root of each session's inference chain on Base Mainnet via a USDC transfer carrying the root hash as calldata. This provides:¶
The on-chain anchor does not contain individual receipt payloads. Per-entry proof retrieval uses the inference registry URI, following the same architecture as defined in [I-D.draft-mw-spice-inference-chain] Section 5.¶
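One way to picture a transfer that carries the root hash as calldata is sketched below. The byte layout is an assumption made for illustration: the standard ERC-20 `transfer(address,uint256)` call data followed by the 32-byte Merkle root appended after the encoded arguments. This document does not specify the layout, and `anchor_calldata` and `extract_root` are hypothetical helper names.

```python
# Function selector of the ERC-20 transfer(address,uint256) method.
TRANSFER_SELECTOR = bytes.fromhex("a9059cbb")


def anchor_calldata(recipient: str, amount: int, merkle_root: bytes) -> bytes:
    """Build transfer calldata with the session root appended (assumed layout)."""
    if len(merkle_root) != 32:
        raise ValueError("merkle_root must be a 32-byte SHA-256 digest")
    address = bytes.fromhex(recipient.removeprefix("0x"))
    if len(address) != 20:
        raise ValueError("recipient must be a 20-byte address")
    return (
        TRANSFER_SELECTOR
        + address.rjust(32, b"\x00")      # address left-padded to 32 bytes
        + amount.to_bytes(32, "big")      # uint256 amount in token base units
        + merkle_root                     # appended anchor payload
    )


def extract_root(calldata: bytes) -> bytes:
    """Recover the appended root from calldata built by anchor_calldata."""
    if len(calldata) != 4 + 32 + 32 + 32 or calldata[:4] != TRANSFER_SELECTOR:
        raise ValueError("calldata does not match the assumed anchor layout")
    return calldata[-32:]


example_root = bytes(range(32))
example_calldata = anchor_calldata("0x" + "11" * 20, 1, example_root)
```

A verifier that fetches the transaction can compare the extracted root with the inference_root it computed from the receipts retrieved through the registry URI.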
AFiR entries include all REQUIRED common fields from [I-D.draft-mw-spice-inference-chain] Section 4.1. The entry type value is afir_pq_signature.¶
The following is an example AFiR inference chain entry for a signed tool call (P1, before phase):¶
{
"type": "afir_pq_signature",
"sub": "spiffe://thehiveryiq.com/agent/orchestrator",
"model_fingerprint": "sha256:a3f9...",
"model_id": "claude-opus-4-20260401",
"input_hash": "sha256:b7c2...",
"output_hash": "sha256:d4e1...",
"intent_entry_ref": 2,
"iat": 1749780000,
"nullifier": "8a3f2c91b0e74d56a1f3c8b2e9d07f4a...",
"algorithm": "ML-DSA-65",
"public_key_hint": "79c1383bb1ba226d",
"phase": "before",
"receipt_chain":
"https://api.thehiveryiq.com/afir/receipts/sess-uuid-12345",
"on_chain_anchor": null,
"inference_digest": "sha256:f8a3...",
"inference_sig": "eyJhbGciOiJNTC1EU0EtNjUi..."
}
¶
AFiR ships five production primitives, each corresponding to a distinct layer of the AI inference stack.¶
Endpoints: POST /v1/afir/tool/sign and POST /v1/afir/tool/verify¶
P1 produces a before-and-after receipt for every MCP or Agent-to-Agent (A2A) tool invocation. The "before" receipt is produced before the tool executes, binding: tool_name, tool_version, input_hash, model_id, session_id, parent_receipt_nullifier, iat. The "after" receipt is produced after the tool returns, binding: output_hash, tool_exit_status, latency_ms, parent_receipt_nullifier (the nullifier of the "before" receipt), iat.¶
The nullifier chain from before to after ensures that a tool call receipt cannot be detached from its corresponding response receipt, and that replay of a valid before-receipt against a different tool response is detectable.¶
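The before/after linkage can be sketched as below. Signing is omitted to keep the focus on the nullifier chain. The nullifier derivation (a random 256-bit value) and the helper names are assumptions, since this document does not define how nullifiers are generated.

```python
import secrets
import time
from typing import Optional


def new_nullifier() -> str:
    # Assumption: a random 256-bit hex string; the actual derivation is unspecified.
    return secrets.token_hex(32)


def before_receipt(tool_name: str, tool_version: str, input_hash: str,
                   model_id: str, session_id: str,
                   parent_receipt_nullifier: Optional[str] = None) -> dict:
    """Receipt produced before the tool executes."""
    return {
        "phase": "before",
        "tool_name": tool_name,
        "tool_version": tool_version,
        "input_hash": input_hash,
        "model_id": model_id,
        "session_id": session_id,
        "parent_receipt_nullifier": parent_receipt_nullifier,
        "iat": int(time.time()),
        "nullifier": new_nullifier(),
    }


def after_receipt(before: dict, output_hash: str,
                  tool_exit_status: int, latency_ms: float) -> dict:
    """Receipt produced after the tool returns, chained to its before receipt."""
    return {
        "phase": "after",
        "output_hash": output_hash,
        "tool_exit_status": tool_exit_status,
        "latency_ms": latency_ms,
        "parent_receipt_nullifier": before["nullifier"],
        "iat": int(time.time()),
        "nullifier": new_nullifier(),
    }


def is_linked(before: dict, after: dict) -> bool:
    """True when the after receipt points at exactly this before receipt."""
    return (
        before.get("phase") == "before"
        and after.get("phase") == "after"
        and after.get("parent_receipt_nullifier") == before.get("nullifier")
    )
```

A verifier that sees an after receipt paired with a before receipt whose nullifier does not match can reject the pair, which is how a detached or replayed before receipt becomes detectable.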
P1 directly addresses the unsigned tool invocation vulnerability class present in MCP deployments. The AFiR signing sidecar intercepts the call before the MCP transport layer, requiring no changes to MCP server implementations.¶
Endpoint: POST /v1/afir/tree/build¶
P2 implements the inference chain Merkle tree architecture defined in [I-D.draft-mw-spice-inference-chain] using AFiR receipt entries as leaf nodes. When Agent A calls Agent B which calls Agent C, P2 builds a Merkle tree across all receipts produced in the session. The root hash is the inference_root included in the OAuth token.¶
P2 is the AFiR reference implementation of the inference_root claim defined in [I-D.draft-mw-spice-inference-chain] Section 5.3. It is deployed and serving production traffic as of June 2026.¶
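A session root over receipts from several agents might be computed as in the sketch below. The tree construction itself is defined in the referenced SPICE drafts and is not reproduced here, so the pairing rule used (SHA-256 over the concatenated children, with the last node duplicated on odd levels) is an assumption, as are the function names and sample receipts.

```python
import hashlib
import json


def leaf_hash(receipt: dict) -> bytes:
    # Simplified canonical form (sorted keys, no whitespace) standing in for RFC 8785.
    data = json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold leaf hashes pairwise until a single root remains."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last node
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Receipts produced when Agent A calls Agent B, which calls Agent C.
session_receipts = [
    {"sub": "agent-a", "nullifier": "n1", "iat": 1749780000},
    {"sub": "agent-b", "nullifier": "n2", "iat": 1749780001},
    {"sub": "agent-c", "nullifier": "n3", "iat": 1749780002},
]
inference_root = "sha256:" + merkle_root(
    [leaf_hash(r) for r in session_receipts]
).hex()
```

Changing any single receipt changes the root, so one token claim covers every agent hop in the session.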
Endpoint: POST /v1/afir/cache/sign¶
P3 addresses a provenance gap not covered by the intent chain or the existing inference chain draft: the attestation of cached token prefixes served from distributed KV stores. In production agentic deployments using disaggregated prefill architectures, KV cache hit rates exceeding 90% have been measured. This means the majority of tokens served to the model in high-cache-hit deployments have no provenance attestation.¶
P3 signs each KV cache entry at write time and validates the signature at read time before cached tokens are injected into the model's context. If a cached prefix does not match its receipt on retrieval, the request MUST fail before the prefix is injected into the model's context.¶
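The sign-on-write, verify-on-read flow can be sketched as follows. HMAC-SHA256 from the Python standard library is used here purely as a stand-in for ML-DSA-65 so that the example is self-contained; the class and exception names are hypothetical.

```python
import hashlib
import hmac


class CacheIntegrityError(Exception):
    """Raised when a cached prefix does not match its receipt."""


class SignedKVCache:
    def __init__(self, signing_key: bytes):
        self._key = signing_key
        self._store: dict[str, tuple[bytes, bytes]] = {}

    def _tag(self, cache_key: str, prefix: bytes) -> bytes:
        # Stand-in for an ML-DSA-65 signature over the entry digest.
        digest = hashlib.sha256(cache_key.encode("utf-8") + b"\x00" + prefix).digest()
        return hmac.new(self._key, digest, hashlib.sha256).digest()

    def write(self, cache_key: str, prefix: bytes) -> None:
        """Sign the entry at write time and store the receipt beside it."""
        self._store[cache_key] = (prefix, self._tag(cache_key, prefix))

    def read(self, cache_key: str) -> bytes:
        """Validate the receipt before the prefix can reach the model context."""
        prefix, tag = self._store[cache_key]
        if not hmac.compare_digest(tag, self._tag(cache_key, prefix)):
            raise CacheIntegrityError(f"receipt mismatch for {cache_key!r}")
        return prefix


cache = SignedKVCache(signing_key=b"\x01" * 32)
cache.write("prefix-001", b"cached prompt tokens")
cached_prefix = cache.read("prefix-001")
```

The important property is that `read` raises before returning anything, so a caller cannot inject a mismatched prefix by ignoring a return code.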
Endpoints: POST /v1/afir/manifest/publish and GET /v1/afir/manifest/{nullifier}¶
P4 provides TEE-free attestation of which model, which weights, and which quantization configuration served a given request. A Model Manifest is a signed document binding: model_id, model_fingerprint (SHA-256 of model weights plus architecture), quantization, serving_runtime, infrastructure, iat, and nullifier.¶
The Model Manifest nullifier is included in all subsequent AFiR receipt entries produced during a session, creating a binding between every inference receipt and the specific model configuration that produced it.¶
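A manifest and its binding into later receipts might look like the sketch below. How weights and architecture are combined into the fingerprint is not specified in this document, so the length-prefixed concatenation is an assumption, as are the `manifest_nullifier` field name, the random nullifier, and the helper names. Signing is omitted.

```python
import hashlib
import secrets
import time


def model_fingerprint(weights: bytes, architecture: bytes) -> str:
    """SHA-256 over weights plus architecture (assumed length-prefixed encoding)."""
    h = hashlib.sha256()
    for part in (weights, architecture):
        h.update(len(part).to_bytes(8, "big"))  # prefix keeps part boundaries unambiguous
        h.update(part)
    return "sha256:" + h.hexdigest()


def build_manifest(model_id: str, weights: bytes, architecture: bytes,
                   quantization: str, serving_runtime: str,
                   infrastructure: str) -> dict:
    """Assemble the manifest fields listed in the text."""
    return {
        "model_id": model_id,
        "model_fingerprint": model_fingerprint(weights, architecture),
        "quantization": quantization,
        "serving_runtime": serving_runtime,
        "infrastructure": infrastructure,
        "iat": int(time.time()),
        "nullifier": secrets.token_hex(32),  # assumption: random nullifier
    }


def bind_receipt(receipt: dict, manifest: dict) -> dict:
    """Return a copy of the receipt that references the manifest nullifier."""
    return {**receipt, "manifest_nullifier": manifest["nullifier"]}


manifest = build_manifest("example-model", b"weights-bytes", b'{"layers": 2}',
                          "int8", "example-runtime", "serverless")
bound = bind_receipt({"input_hash": "sha256:b7c2..."}, manifest)
```

A verifier that holds the manifest can then check that every receipt in a session names the same model configuration.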
P4 addresses the Model Masquerading attack class identified in [I-D.draft-mw-spice-inference-chain] Section 1.1 without requiring TEE hardware. The trust basis is the operator's key management rather than hardware isolation. P4 is therefore appropriate for environments where TEE is unavailable, provided relying parties understand that a Model Manifest attests to the operator's signed claim and not to hardware-enforced isolation.¶
Endpoints: POST /v1/afir/sign and GET /v1/afir/algorithms¶
P5 implements a crypto-agile signing endpoint supporting multiple post-quantum and classical signature algorithms under a single API surface. The algorithm is specified per-request and recorded in the receipt entry, making receipts from different algorithm generations cross-verifiable via the Merkle structure.¶
| Algorithm | Status | Standard | Notes |
|---|---|---|---|
| ML-DSA-65 | Active | NIST FIPS 204 | Primary, post-quantum |
| ML-DSA-44 | Active | NIST FIPS 204 | Compact, post-quantum |
| Ed25519 | Active | RFC 8032 | Classical, transition support |
| SLH-DSA | Reserved | NIST FIPS 205 | Planned |
| FN-DSA | Reserved | NIST FIPS 206 | Planned |
Algorithm negotiation follows the same model as TLS cipher suite negotiation. When a customer needs to upgrade from ML-DSA-65 to a future algorithm, they change a single configuration field. Prior receipts remain verifiable under their original algorithm.¶
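The per-request algorithm selection can be sketched with a small registry mirroring the table above. The function names and the rule that Reserved algorithms are rejected for new signatures are assumptions made for illustration.

```python
from typing import Optional

# Mirrors the algorithm table: name -> (status, standard).
ALGORITHMS = {
    "ML-DSA-65": ("Active", "NIST FIPS 204"),
    "ML-DSA-44": ("Active", "NIST FIPS 204"),
    "Ed25519": ("Active", "RFC 8032"),
    "SLH-DSA": ("Reserved", "NIST FIPS 205"),
    "FN-DSA": ("Reserved", "NIST FIPS 206"),
}

DEFAULT_ALGORITHM = "ML-DSA-65"  # the single configuration field an operator changes


def select_algorithm(requested: Optional[str] = None) -> str:
    """Pick the signing algorithm for a new receipt."""
    name = requested or DEFAULT_ALGORITHM
    if name not in ALGORITHMS:
        raise ValueError(f"unknown algorithm: {name!r}")
    status, _standard = ALGORITHMS[name]
    if status != "Active":
        raise ValueError(f"algorithm {name!r} is {status}, not available for signing")
    return name


def algorithm_of(receipt: dict) -> str:
    """Prior receipts are verified under the algorithm recorded in the entry."""
    name = receipt["algorithm"]
    if name not in ALGORITHMS:
        raise ValueError(f"unknown algorithm: {name!r}")
    return name
```

Because each receipt records its own algorithm, changing the default affects only new signatures; older entries keep verifying under the algorithm they name.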
AFiR receipt entries are structurally compatible with the SPICE inference chain Merkle tree defined in [I-D.draft-mw-spice-inference-chain] Section 5.2. Leaf nodes are SHA-256 hashes of canonically serialized AFiR receipt entries (JSON Canonicalization Scheme [RFC8785]). The Merkle tree construction algorithm is identical to that defined in [I-D.draft-mw-spice-intent-chain] Section 5.3.¶
The resulting inference_root is included in the OAuth token using the claim structure defined in [I-D.draft-mw-spice-inference-chain] Section 5.3, with inference_proof_type set to afir_ml_dsa_65 (see Section 11).¶
A token carrying an AFiR inference chain follows the full Truth Stack structure defined in [I-D.draft-mw-spice-inference-chain] Section 6, with inference_proof_type set to an AFiR algorithm identifier:¶
{
"iss": "https://auth.example.com",
"sub": "user-alice",
"aud": "https://api.example.com",
"jti": "tok-afir-12345",
"sid": "sess-uuid-12345",
"iat": 1749780000,
"exp": 1749783600,
"actor_chain": [ "..." ],
"intent_root": "sha256:abc123...",
"intent_registry": "https://intent-log.example.com/...",
"inference_root": "sha256:xyz789...",
"inference_proof_type": "afir_ml_dsa_65",
"inference_registry":
"https://api.thehiveryiq.com/afir/receipts/sess-uuid-12345"
}
¶
AFiR extends the tiered verification strategy from [I-D.draft-mw-spice-inference-chain] Section 7.4:¶
| Risk Level | Actor Chain | Intent Chain | Inference Chain |
|---|---|---|---|
| Low | Sync | Skip | Skip |
| Medium | Sync | Cached proof | AFiR signature check (<1ms) |
| High | Sync | Full | AFiR + on-chain anchor (~7ms) |
| Critical | Sync | Full | AFiR + on-chain + ZKML/TEE |
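The table above can be expressed as a lookup that a resource server consults before deciding how much verification to run. The check identifiers and the `policy_for` helper are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerificationPolicy:
    actor_chain: str
    intent_chain: str
    inference_checks: tuple[str, ...]


# One row per risk level, mirroring the tiered verification table.
POLICIES = {
    "low": VerificationPolicy("sync", "skip", ()),
    "medium": VerificationPolicy("sync", "cached_proof", ("afir_signature",)),
    "high": VerificationPolicy("sync", "full", ("afir_signature", "on_chain_anchor")),
    "critical": VerificationPolicy(
        "sync", "full", ("afir_signature", "on_chain_anchor", "zkml_or_tee")
    ),
}


def policy_for(risk_level: str) -> VerificationPolicy:
    """Look up the verification steps required for a risk level."""
    try:
        return POLICIES[risk_level.lower()]
    except KeyError:
        raise ValueError(f"unknown risk level: {risk_level!r}") from None
```

Each tier is a superset of the inference checks of the tier below it, so a verifier can run checks in order and stop at the depth its risk level requires.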
AFiR entries and ZKML/TEE entries MAY coexist in the same inference chain. The SPICE Inference Chain Merkle tree is agnostic to the proof type of individual entries; the root hash covers all entries regardless of type. Verifiers MUST check the "type" field of each entry and apply the verification procedure appropriate to that type.¶
This is useful for deployments that use AFiR for real-time signing during inference and generate ZKML proofs asynchronously for high-value operations, or that run some agents on TEE-equipped hardware and others on commodity infrastructure.¶
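Per-entry dispatch on the "type" field in a mixed chain can be sketched as follows. Only the afir_pq_signature type value comes from this document; the `tee_quote` type name and both stub verifiers are hypothetical placeholders for real verification procedures.

```python
from typing import Callable, Mapping, Sequence

Verifier = Callable[[dict], bool]


def verify_chain(entries: Sequence[dict], verifiers: Mapping[str, Verifier]) -> bool:
    """Apply the verifier matching each entry's type; reject unknown types."""
    for index, entry in enumerate(entries):
        entry_type = entry.get("type")
        verifier = verifiers.get(entry_type)
        if verifier is None:
            raise ValueError(f"entry {index}: no verifier for type {entry_type!r}")
        if not verifier(entry):
            return False
    return True


# Stub verifiers for illustration only; real ones would check signatures or quotes.
stub_verifiers = {
    "afir_pq_signature": lambda e: e.get("algorithm") == "ML-DSA-65",
    "tee_quote": lambda e: "quote" in e,  # hypothetical type name
}

mixed_chain = [
    {"type": "afir_pq_signature", "algorithm": "ML-DSA-65"},
    {"type": "tee_quote", "quote": "placeholder"},
]
chain_ok = verify_chain(mixed_chain, stub_verifiers)
```

Raising on an unknown type, instead of skipping the entry, keeps an attacker from hiding an unverifiable entry inside an otherwise valid chain.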
ML-DSA-65 bases its security on the hardness of the Module Learning With Errors (MLWE) problem together with a variant of the Module Short Integer Solution (MSIS) problem, both of which are believed to be hard for classical and quantum computers. NIST standardized ML-DSA-65 in FIPS 204 [FIPS204] (August 2024) following an eight-year public evaluation process. The security basis of AFiR signatures is mathematical (lattice hardness), whereas TEE attestation is hardware-rooted. Both trust bases are valid; they are appropriate for different deployment contexts and threat models.¶
The Base Mainnet on-chain anchor provides tamper evidence independent of AFiR operator infrastructure. An adversary wishing to forge an AFiR receipt for a past session must either forge an ML-DSA-65 signature (computationally infeasible under lattice hardness assumptions) or rewrite Base Mainnet history (infeasible without subverting the chain's proof-of-stake consensus). Neither is feasible under standard assumptions.¶
| Threat | ZKML | TEE | AFiR |
|---|---|---|---|
| Model substitution | Yes | Yes | P4 |
| Weight tampering | Yes | Yes | P4 |
| Environment spoofing | No | Yes | No* |
| Replay of stale proofs | Yes | Yes | Yes |
| Tool call repudiation | No | No | P1 |
| Cache poisoning | No | No | P3 |
| Cross-agent chain break | No | No | P2 |
| Output repudiation | Yes | Yes | Yes |
* AFiR does not provide hardware-rooted proof that inference ran inside an isolated enclave. For deployments requiring environment isolation proof, TEE entries SHOULD be used for the relevant chain segments, potentially coexisting with AFiR entries as described in Section 9.¶
AFiR signing keys MUST be generated as ML-DSA-65 key pairs per FIPS 204, stored in a key management system with access logging, rotated on a configurable schedule (90 days RECOMMENDED), and bound to a single operator identity per key pair. Public keys SHOULD be published in a discoverable registry to allow verifiers to retrieve the full public key given the public_key_hint in an AFiR receipt entry.¶
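The rotation schedule and hint-based key lookup can be sketched as below. This document does not define how public_key_hint is derived, so the rule used here (the first 16 hex characters of the SHA-256 of the public key, matching the length of the hint in the example entry) is an assumption, as are the class and function names. The key bytes are a placeholder of the ML-DSA-65 public key length, not a real key.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from typing import Optional

ROTATION_PERIOD = timedelta(days=90)  # RECOMMENDED schedule from the text


def public_key_hint(public_key: bytes) -> str:
    # Assumption: truncated SHA-256 of the public key, 16 hex characters.
    return hashlib.sha256(public_key).hexdigest()[:16]


def needs_rotation(created_at: datetime, now: Optional[datetime] = None) -> bool:
    """True once a key has reached the configured rotation age."""
    now = now or datetime.now(timezone.utc)
    return now - created_at >= ROTATION_PERIOD


class KeyRegistry:
    """In-memory stand-in for a discoverable public key registry."""

    def __init__(self) -> None:
        self._keys: dict[str, bytes] = {}

    def register(self, public_key: bytes) -> str:
        hint = public_key_hint(public_key)
        self._keys[hint] = public_key
        return hint

    def resolve(self, hint: str) -> bytes:
        """Return the full public key for the hint carried in a receipt entry."""
        try:
            return self._keys[hint]
        except KeyError:
            raise LookupError(f"no public key registered for hint {hint!r}") from None


registry = KeyRegistry()
placeholder_key = bytes(1952)  # ML-DSA-65 public keys are 1952 bytes long
hint = registry.register(placeholder_key)
```

A hint this short is a lookup aid, not a commitment; the verifier still checks the signature against the full key returned by the registry.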
This document requests registration of the following inference_proof_type values for use with the inference_root claim defined in [I-D.draft-mw-spice-inference-chain]:¶
No new JWT claims are defined by this document. The existing inference_root, inference_proof_type, and inference_registry claims defined in [I-D.draft-mw-spice-inference-chain] are used without modification.¶