Internet-Draft AFiR SPICE Profile June 2026
Rotzin Expires 14 December 2026
Workgroup: SPICE
Internet-Draft: draft-rotzin-spice-afir-profile-00
Published: June 2026
Intended Status: Informational
Expires: 14 December 2026
Author: S. Rotzin, Hive / AFiR

AFiR: Post-Quantum Signed Inference Receipts as a TEE-Free Profile for IETF SPICE Inference Chain

Abstract

This document defines AFiR (Attested Fragmented Inference Routing) as a production profile of the IETF SPICE Inference Chain specification [I-D.draft-mw-spice-inference-chain].

The SPICE Inference Chain defines computational provenance via two mechanisms: Zero-Knowledge Machine Learning (ZKML) proofs and Trusted Execution Environment (TEE) attestation quotes. ZKML incurs significant proof generation latency, and TEE attestation requires specialized hardware. Neither is deployable today in commodity serverless inference environments without infrastructure changes.

AFiR defines a third proof type -- post-quantum digital signature attestation using ML-DSA-65 (NIST FIPS 204) -- that is deployable on any inference platform, requires no specialized hardware, adds 0.785ms of overhead per fragment, and produces a 384-byte receipt anchored on a public blockchain. AFiR receipts are structurally compatible with the SPICE Inference Chain Merkle tree and can coexist with ZKML and TEE entries in the same session chain.

AFiR extends the SPICE inference chain with five concrete production primitives: Signed Tool Calls (P1), Cross-Agent Receipt Trees (P2), KV Cache Signing (P3), Model Manifest attestation (P4), and a Crypto-Agile Signature Layer (P5). All five are deployed and serving production traffic as of June 2026, making AFiR the first production implementation of the SPICE inference_root claim for multi-agent pipelines.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 14 December 2026.

1. Introduction

1.1. The Deployment Gap in the SPICE Inference Chain

The SPICE Inference Chain [I-D.draft-mw-spice-inference-chain] defines two proof types for computational provenance:

  • ZKML proofs: mathematically certain, but proof generation takes minutes to hours per inference and is currently limited to models of approximately 100 million parameters or fewer.
  • TEE attestation: production-scale and real-time, but requires specific hardware (Intel TDX, AMD SEV-SNP, NVIDIA H100 Confidential Computing) and manufacturer PKI dependencies. Most serverless inference environments do not expose TEE primitives to the application layer.

The practical effect is that the SPICE Inference Chain, as currently defined, cannot be adopted in commodity cloud environments (serverless functions, container-based inference runtimes, shared GPU pools) without either accepting ZKML latency incompatible with real-time serving, or deploying specialized hardware unavailable in most production inference clouds. This leaves the majority of production AI inference volume outside the scope of any SPICE-conformant inference attestation.

1.2. AFiR Approach

AFiR addresses this gap by defining a third proof type: post-quantum digital signature attestation using ML-DSA-65 (NIST FIPS 204 [FIPS204]).

A post-quantum signature attestation makes the following proof statement:

"Agent A, at timestamp T, signed a commitment over (input_hash, output_hash, model_id, tool_name, session_id) using ML-DSA-65 with key K. Key K is registered and publicly verifiable. The signature is unforgeable under standard lattice hardness assumptions (Module Learning With Errors, MLWE). A cryptographic receipt anchored on Base Mainnet via USDC provides a tamper-evident timestamp independent of any single party's infrastructure."

This proof type does not require:

  • Specialized hardware (no TEE, no GPU confidential compute)
  • Proof generation delay (signing is 0.785ms per fragment)
  • Trust in a hardware manufacturer's PKI
  • Any changes to the inference runtime or model serving stack

AFiR is in production as of June 2026, operating on serverless infrastructure. All five primitives defined in this document are deployed, smoke-tested, and serving live traffic.

1.3. Relationship to Existing SPICE Drafts

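This document is a companion to, not a replacement of, the SPICE drafts listed in Section 12.1: [I-D.draft-mw-spice-inference-chain], [I-D.draft-mw-spice-actor-chain], and [I-D.draft-mw-spice-intent-chain].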

AFiR receipt entries are structurally compatible with the SPICE inference chain Merkle tree and MAY coexist with ZKML and TEE entries in the same session's inference chain.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

AFiR Receipt:
A signed record produced by the AFiR signing layer before an inference output propagates to the next stage. Contains input commitment, output commitment, model identity, timestamp, nullifier, and a post-quantum digital signature.
Nullifier:
A unique, non-reusable identifier bound to each AFiR receipt, preventing replay of a valid receipt against a different output.
On-Chain Anchor:
A transaction on Base Mainnet containing the Merkle root of a session's inference chain, providing a tamper-evident timestamp independent of any single operator's infrastructure.
ML-DSA-65:
Module Lattice-based Digital Signature Algorithm, security parameter set 65, as defined in NIST FIPS 204 [FIPS204]. Post-quantum secure under MLWE hardness assumptions.
Fragment:
The smallest unit of inference output for which an AFiR receipt is produced. In streaming inference, a fragment is a single generation step. In non-streaming inference, a fragment is the complete response.
KV Cache Prefix:
The cached key-value state from prior turns in a multi-turn conversation or agentic session, reused by the inference engine to avoid recomputing attention over prior tokens.

3. The AFiR Proof Type

3.1. Algorithm: ML-DSA-65 (NIST FIPS 204)

AFiR uses ML-DSA-65 as its primary signature algorithm. ML-DSA-65 is the NIST-standardized post-quantum digital signature algorithm (FIPS 204, August 2024), providing:

  • Security level: NIST Level 3 (approximately 192-bit classical security, quantum-secure under MLWE)
  • Signature size: 3309 bytes
  • Public key size: 1952 bytes
  • Signing time: under 1ms on commodity hardware
  • Verification time: under 1ms on commodity hardware

The signed message for each AFiR receipt is the SHA-256 hash of the canonical JSON serialization [RFC8785] of the receipt payload fields: input_hash, output_hash, model_id, model_fingerprint, tool_name (if applicable), session_id, iat, nullifier.
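
The digest described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the AFiR implementation: the function name receipt_digest and the sample values are hypothetical, and json.dumps with sorted keys and compact separators only approximates RFC 8785 for payloads restricted to ASCII strings and integers. A full JCS implementation would be needed for floats or non-ASCII keys.

```python
import hashlib
import json

# Payload fields listed in this section; tool_name appears only when applicable.
SIGNED_FIELDS = (
    "input_hash", "output_hash", "model_id", "model_fingerprint",
    "tool_name", "session_id", "iat", "nullifier",
)


def receipt_digest(payload: dict) -> bytes:
    """Return SHA-256 over a canonical JSON form of the signed fields."""
    subset = {name: payload[name] for name in SIGNED_FIELDS if name in payload}
    # Approximates RFC 8785 for ASCII strings and integers only:
    # keys sorted, no insignificant whitespace, UTF-8 output.
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).digest()


# Hypothetical sample values, not taken from a real receipt.
sample = {
    "input_hash": "sha256:" + "11" * 32,
    "output_hash": "sha256:" + "22" * 32,
    "model_id": "example-model",
    "model_fingerprint": "sha256:" + "33" * 32,
    "session_id": "sess-example",
    "iat": 1749780000,
    "nullifier": "ab" * 32,
}
digest = receipt_digest(sample)  # 32-byte message handed to the signer
```

Because keys are sorted before hashing, the digest does not depend on the order in which a producer happens to emit the fields, which is what lets an independent verifier recompute it.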

3.2. Performance Characteristics

AFiR measured performance on commodity serverless infrastructure (2026):

  • Signing overhead per fragment: 0.785ms
  • End-to-end median wall latency: 223ms
  • On-chain receipt anchoring: approximately 7ms (Base Mainnet via USDC)
  • Throughput cost vs. baseline: 98.5% cheaper (tiered routing)
  • Speed vs. prior signing approach: 6.1x faster (223ms vs 1,369ms P50 wall-clock)

These measurements are from production traffic and represent the overhead of the complete AFiR signing pipeline including on-chain anchoring.

3.3. On-Chain Anchoring

AFiR anchors the Merkle root of each session's inference chain on Base Mainnet via a USDC transfer carrying the root hash as calldata. This provides:

  • Tamper-evident timestamp from a public, decentralized ledger
  • Independence from any single operator's infrastructure
  • Permanent, publicly auditable record of the session root
  • Approximately 7ms latency from signing to on-chain confirmation

The on-chain anchor does not contain individual receipt payloads. Per-entry proof retrieval uses the inference registry URI, following the same architecture as defined in [I-D.draft-mw-spice-inference-chain] Section 5.

4. AFiR Entry Structure

4.1. Common Fields (SPICE-Compatible)

AFiR entries include all REQUIRED common fields from [I-D.draft-mw-spice-inference-chain] Section 4.1. The entry type value is afir_pq_signature.

4.2. AFiR-Specific Fields

input_hash:
SHA-256 hash of the inference input (prompt or tool call parameters).
nullifier:
Unique non-reusable identifier for this receipt. Format: hex string, 32 bytes.
algorithm:
Signature algorithm used. One of: "ML-DSA-65" (primary, post-quantum), "ML-DSA-44" (compact, post-quantum), "Ed25519" (classical, transition support), "SLH-DSA" (reserved, FIPS 205), "FN-DSA" (reserved, FIPS 206).
public_key_hint:
First 8 bytes (16 hex characters) of the signing public key, for key disambiguation without transmitting the full key inline.
receipt_chain:
URI of the AFiR inference registry partition for this session.
on_chain_anchor:
Base Mainnet transaction hash containing the session Merkle root. OPTIONAL at entry level; REQUIRED in the token's inference_registry response for completed sessions.
phase:
For P1 (Signed Tool Calls): "before" or "after", indicating whether the receipt was produced before or after tool execution.

4.3. Full Entry Example

The following is an example AFiR inference chain entry for a signed tool call (P1, before phase):

{
  "type": "afir_pq_signature",
  "sub": "spiffe://thehiveryiq.com/agent/orchestrator",
  "model_fingerprint": "sha256:a3f9...",
  "model_id": "claude-opus-4-20260401",
  "input_hash": "sha256:b7c2...",
  "output_hash": "sha256:d4e1...",
  "intent_entry_ref": 2,
  "iat": 1749780000,
  "nullifier": "8a3f2c91b0e74d56a1f3c8b2e9d07f4a...",
  "algorithm": "ML-DSA-65",
  "public_key_hint": "79c1383bb1ba226d",
  "phase": "before",
  "receipt_chain":
    "https://api.thehiveryiq.com/afir/receipts/sess-uuid-12345",
  "on_chain_anchor": null,
  "inference_digest": "sha256:f8a3...",
  "inference_sig": "eyJhbGciOiJNTC1EU0EtNjUi..."
}

5. Five Signing Primitives

AFiR ships five production primitives, each corresponding to a distinct layer of the AI inference stack.

5.1. P1 -- Signed Tool Calls

Endpoints: POST /v1/afir/tool/sign and POST /v1/afir/tool/verify

P1 produces a before-and-after receipt for every MCP or Agent-to-Agent (A2A) tool invocation. The "before" receipt is produced before the tool executes, binding: tool_name, tool_version, input_hash, model_id, session_id, parent_receipt_nullifier, iat. The "after" receipt is produced after the tool returns, binding: output_hash, tool_exit_status, latency_ms, parent_receipt_nullifier (the nullifier of the "before" receipt), iat.

The nullifier chain from before to after ensures that a tool call receipt cannot be detached from its corresponding response receipt, and that replay of a valid before-receipt against a different tool response is detectable.
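
The before/after linkage can be illustrated with a short Python sketch. It is an assumption-laden simplification: the helper names (before_receipt, after_receipt, accept_pair) are hypothetical, only a subset of the bound fields is modelled, and signature verification is omitted so that the nullifier logic stays visible.

```python
import secrets


def new_nullifier() -> str:
    """32 random bytes rendered as a hex string."""
    return secrets.token_hex(32)


def before_receipt(tool_name: str, input_hash: str) -> dict:
    return {"phase": "before", "tool_name": tool_name,
            "input_hash": input_hash, "nullifier": new_nullifier()}


def after_receipt(before: dict, output_hash: str) -> dict:
    # The "after" receipt points back at the "before" receipt's nullifier.
    return {"phase": "after", "output_hash": output_hash,
            "parent_receipt_nullifier": before["nullifier"],
            "nullifier": new_nullifier()}


def accept_pair(before: dict, after: dict, consumed: set) -> bool:
    """Accept a before/after pair once; reject detached or replayed receipts."""
    if before["phase"] != "before" or after["phase"] != "after":
        return False
    if after["parent_receipt_nullifier"] != before["nullifier"]:
        return False  # response receipt is detached from this call
    if before["nullifier"] in consumed:
        return False  # before-receipt replayed against another response
    consumed.add(before["nullifier"])
    return True
```

In this sketch the verifier keeps a set of consumed nullifiers, so a second "after" receipt that names the same "before" receipt is rejected rather than silently accepted.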

P1 directly addresses the unsigned tool invocation vulnerability class present in MCP deployments. The AFiR signing sidecar intercepts the call before the MCP transport layer, requiring no changes to MCP server implementations.

5.2. P2 -- Cross-Agent Receipt Trees

Endpoint: POST /v1/afir/tree/build

P2 implements the inference chain Merkle tree architecture defined in [I-D.draft-mw-spice-inference-chain] using AFiR receipt entries as leaf nodes. When Agent A calls Agent B which calls Agent C, P2 builds a Merkle tree across all receipts produced in the session. The root hash is the inference_root included in the OAuth token.

P2 is the AFiR reference implementation of the inference_root claim defined in [I-D.draft-mw-spice-inference-chain] Section 5.3. It is deployed and serving production traffic as of June 2026.

5.3. P3 -- KV Cache Signing

Endpoint: POST /v1/afir/cache/sign

P3 addresses a provenance gap not covered by the intent chain or the existing inference chain draft: the attestation of cached token prefixes served from distributed KV stores. In production agentic deployments using disaggregated prefill architectures, KV cache hit rates exceeding 90% have been measured. This means the majority of tokens served to the model in high-cache-hit deployments have no provenance attestation.

P3 signs each KV cache entry at write time and validates the signature at read time before cached tokens are injected into the model's context. If a cached prefix does not match its receipt on retrieval, the request MUST fail before the prefix is injected into the model's context.
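
The write-time signing and read-time check can be sketched as follows. This is illustrative only: the class SignedKVCache and the exception CacheIntegrityError are hypothetical names, and HMAC-SHA256 from the Python standard library stands in for the ML-DSA signature so that the example runs without a post-quantum library. The control flow, failing before the prefix is returned for injection, is the point of the sketch.

```python
import hashlib
import hmac


class CacheIntegrityError(Exception):
    """Raised when a cached prefix does not match its stored tag."""


class SignedKVCache:
    def __init__(self, key: bytes):
        self._key = key      # stand-in secret; AFiR would use a signing key
        self._store = {}     # cache_key -> (prefix bytes, tag)

    def _tag(self, cache_key: str, prefix: bytes) -> bytes:
        # Bind the tag to both the cache key and the prefix contents.
        message = cache_key.encode("utf-8") + b"\x00" + hashlib.sha256(prefix).digest()
        return hmac.new(self._key, message, hashlib.sha256).digest()

    def write(self, cache_key: str, prefix: bytes) -> None:
        self._store[cache_key] = (prefix, self._tag(cache_key, prefix))

    def read(self, cache_key: str) -> bytes:
        prefix, tag = self._store[cache_key]
        if not hmac.compare_digest(tag, self._tag(cache_key, prefix)):
            # Fail before the prefix can reach the model's context.
            raise CacheIntegrityError(cache_key)
        return prefix
```

A caller would treat CacheIntegrityError as a hard failure of the request rather than falling back to the unverified prefix.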

5.4. P4 -- Model Manifest

Endpoints: POST /v1/afir/manifest/publish and GET /v1/afir/manifest/{nullifier}

P4 provides TEE-free attestation of which model, which weights, and which quantization configuration served a given request. A Model Manifest is a signed document binding: model_id, model_fingerprint (SHA-256 of model weights plus architecture), quantization, serving_runtime, infrastructure, iat, and nullifier.

The Model Manifest nullifier is included in all subsequent AFiR receipt entries produced during a session, creating a binding between every inference receipt and the specific model configuration that produced it.

P4 addresses the Model Masquerading attack class identified in [I-D.draft-mw-spice-inference-chain] Section 1.1 without requiring TEE hardware. The trust basis is the operator's key management rather than hardware isolation. P4 is therefore appropriate for environments where TEE is unavailable, provided this difference in trust basis is explicitly understood by relying parties.

5.5. P5 -- Crypto-Agile Signature Layer

Endpoints: POST /v1/afir/sign and GET /v1/afir/algorithms

P5 implements a crypto-agile signing endpoint supporting multiple post-quantum and classical signature algorithms under a single API surface. The algorithm is specified per-request and recorded in the receipt entry, making receipts from different algorithm generations cross-verifiable via the Merkle structure.

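Table 1: P5 Supported Algorithms

+-----------+----------+---------------+-------------------------------+
| Algorithm | Status   | Standard      | Notes                         |
+-----------+----------+---------------+-------------------------------+
| ML-DSA-65 | Active   | NIST FIPS 204 | Primary, post-quantum         |
| ML-DSA-44 | Active   | NIST FIPS 204 | Compact, post-quantum         |
| Ed25519   | Active   | RFC 8032      | Classical, transition support |
| SLH-DSA   | Reserved | NIST FIPS 205 | Planned                       |
| FN-DSA    | Reserved | NIST FIPS 206 | Planned                       |
+-----------+----------+---------------+-------------------------------+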

Algorithm negotiation follows the same model as TLS cipher suite negotiation. When a customer needs to upgrade from ML-DSA-65 to a future algorithm, they change a single configuration field. Prior receipts remain verifiable under their original algorithm.
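
A per-request algorithm selection of this kind might look like the sketch below. The registry mirrors the algorithm names and statuses in Table 1; the function name resolve_algorithm, the DEFAULT_ALGORITHM constant, and the choice to reject reserved algorithms with ValueError are assumptions made for illustration, not part of the AFiR API.

```python
# Algorithm names and statuses as listed in Table 1.
ALGORITHMS = {
    "ML-DSA-65": {"status": "active", "standard": "NIST FIPS 204"},
    "ML-DSA-44": {"status": "active", "standard": "NIST FIPS 204"},
    "Ed25519": {"status": "active", "standard": "RFC 8032"},
    "SLH-DSA": {"status": "reserved", "standard": "NIST FIPS 205"},
    "FN-DSA": {"status": "reserved", "standard": "NIST FIPS 206"},
}

# Hypothetical single configuration field an operator would change to upgrade.
DEFAULT_ALGORITHM = "ML-DSA-65"


def resolve_algorithm(requested=None) -> str:
    """Return the algorithm to record in the receipt, or raise ValueError."""
    name = requested or DEFAULT_ALGORITHM
    entry = ALGORITHMS.get(name)
    if entry is None:
        raise ValueError(f"unknown algorithm: {name}")
    if entry["status"] != "active":
        raise ValueError(f"algorithm is reserved, not yet active: {name}")
    return name
```

Because the resolved name is written into each receipt entry, a verifier can select the matching verification routine per receipt even after the default has changed.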

6. Merkle Tree Compatibility

AFiR receipt entries are structurally compatible with the SPICE inference chain Merkle tree defined in [I-D.draft-mw-spice-inference-chain] Section 5.2. Leaf nodes are SHA-256 hashes of canonically serialized AFiR receipt entries (JSON Canonicalization Scheme [RFC8785]). The Merkle tree construction algorithm is identical to that defined in [I-D.draft-mw-spice-intent-chain] Section 5.3.

The resulting inference_root is included in the OAuth token using the claim structure defined in [I-D.draft-mw-spice-inference-chain] Section 5.3, with inference_proof_type set to afir_ml_dsa_65 (see Section 11).
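
For orientation, a generic binary Merkle root over SHA-256 leaf hashes can be sketched as below. The pairing rule and the handling of an odd number of nodes (duplicating the last node) are assumptions chosen for brevity; the normative construction is the one defined in the referenced SPICE drafts and may differ. The function names are hypothetical.

```python
import hashlib


def leaf_hash(canonical_entry: bytes) -> bytes:
    """SHA-256 of a canonically serialized receipt entry."""
    return hashlib.sha256(canonical_entry).digest()


def merkle_root(leaves: list) -> bytes:
    """Fold leaf hashes pairwise into a single root hash."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumption: duplicate the odd node
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


def format_root(root: bytes) -> str:
    """Render the root in the "sha256:<hex>" form used in the token example."""
    return "sha256:" + root.hex()
```

Changing any single receipt entry changes its leaf hash and therefore the root, which is what allows the token to commit to the whole session with one value.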

7. Token Structure

A token carrying an AFiR inference chain follows the full Truth Stack structure defined in [I-D.draft-mw-spice-inference-chain] Section 6, with inference_proof_type set to an AFiR algorithm identifier:

{
  "iss": "https://auth.example.com",
  "sub": "user-alice",
  "aud": "https://api.example.com",
  "jti": "tok-afir-12345",
  "sid": "sess-uuid-12345",
  "iat": 1749780000,
  "exp": 1749783600,

  "actor_chain": [ "..." ],

  "intent_root": "sha256:abc123...",
  "intent_registry": "https://intent-log.example.com/...",

  "inference_root": "sha256:xyz789...",
  "inference_proof_type": "afir_ml_dsa_65",
  "inference_registry":
    "https://api.thehiveryiq.com/afir/receipts/sess-uuid-12345"
}

8. Tiered Verification with AFiR

AFiR extends the tiered verification strategy from [I-D.draft-mw-spice-inference-chain] Section 7.4:

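Table 2: AFiR Tiered Verification

+------------+-------------+--------------+-------------------------------+
| Risk Level | Actor Chain | Intent Chain | Inference Chain               |
+------------+-------------+--------------+-------------------------------+
| Low        | Sync        | Skip         | Skip                          |
| Medium     | Sync        | Cached proof | AFiR signature check (<1ms)   |
| High       | Sync        | Full         | AFiR + on-chain anchor (~7ms) |
| Critical   | Sync        | Full         | AFiR + on-chain + ZKML/TEE    |
+------------+-------------+--------------+-------------------------------+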
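
The inference chain column of Table 2 can be expressed as a small policy lookup. This is a sketch: the check identifiers ("afir_signature", "on_chain_anchor", "zkml_or_tee") and the function name inference_checks are hypothetical labels for the steps named in the table.

```python
# Inference-chain checks per risk level, following Table 2.
INFERENCE_TIERS = {
    "low": (),
    "medium": ("afir_signature",),
    "high": ("afir_signature", "on_chain_anchor"),
    "critical": ("afir_signature", "on_chain_anchor", "zkml_or_tee"),
}


def inference_checks(risk_level: str) -> tuple:
    """Return the ordered inference-chain checks for a risk level."""
    try:
        return INFERENCE_TIERS[risk_level.lower()]
    except KeyError:
        raise ValueError(f"unknown risk level: {risk_level}") from None
```

Each tier extends the one below it, so a verifier can run the checks in order and stop at the first failure.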

9. Coexistence with ZKML and TEE Entries

AFiR entries and ZKML/TEE entries MAY coexist in the same inference chain. The SPICE Inference Chain Merkle tree is agnostic to the proof type of individual entries; the root hash covers all entries regardless of type. Verifiers MUST check the "type" field of each entry and apply the verification procedure appropriate to that type.

This is useful for deployments that use AFiR for real-time signing during inference and generate ZKML proofs asynchronously for high-value operations, or that run some agents on TEE-equipped hardware and others on commodity infrastructure.
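
Per-entry dispatch on the "type" field might be sketched as follows. Only afir_pq_signature is a type value defined in this document; the "tee_quote" type name, the "quote" field, and the placeholder verifier bodies are hypothetical and stand in for real signature, proof, or quote verification.

```python
from typing import Callable, Dict, Iterable

Verifier = Callable[[dict], bool]


def verify_chain(entries: Iterable[dict], verifiers: Dict[str, Verifier]) -> bool:
    """Apply the verifier registered for each entry's "type"; fail closed."""
    for entry in entries:
        verifier = verifiers.get(entry.get("type"))
        if verifier is None or not verifier(entry):
            return False  # unknown type or failed check rejects the chain
    return True


# Placeholder verifiers: real ones would check the post-quantum signature
# or the TEE attestation quote respectively.
def verify_afir(entry: dict) -> bool:
    return "inference_sig" in entry


def verify_tee(entry: dict) -> bool:
    return "quote" in entry  # hypothetical field name


VERIFIERS = {
    "afir_pq_signature": verify_afir,
    "tee_quote": verify_tee,  # hypothetical type name
}
```

Failing closed on an unrecognized type keeps a verifier from treating an entry it cannot check as implicitly valid.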

10. Security Considerations

10.1. Post-Quantum Security Basis

ML-DSA-65 is secure under the hardness of the Module Learning With Errors (MLWE) problem, which is believed to be hard for both classical and quantum computers. NIST standardized ML-DSA-65 in FIPS 204 [FIPS204] (August 2024) following an eight-year public evaluation process. The security basis of AFiR signatures is mathematical (lattice hardness), not hardware-rooted. Mathematical and hardware-rooted trust bases are both valid; they are appropriate for different deployment contexts and threat models.

10.2. On-Chain Anchoring and Tamper Evidence

The Base Mainnet on-chain anchor provides tamper evidence independent of AFiR operator infrastructure. An adversary wishing to forge an AFiR receipt for a past session must either forge an ML-DSA-65 signature (computationally infeasible under MLWE hardness) or rewrite Base Mainnet history (computationally infeasible under proof-of-stake consensus). Neither is feasible under standard assumptions.

10.3. Threat Coverage Compared to ZKML and TEE

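Table 3: Threat Coverage by Proof Type

+-------------------------+------+-----+------+
| Threat                  | ZKML | TEE | AFiR |
+-------------------------+------+-----+------+
| Model substitution      | Yes  | Yes | P4   |
| Weight tampering        | Yes  | Yes | P4   |
| Environment spoofing    | No   | Yes | No*  |
| Replay of stale proofs  | Yes  | Yes | Yes  |
| Tool call repudiation   | No   | No  | P1   |
| Cache poisoning         | No   | No  | P3   |
| Cross-agent chain break | No   | No  | P2   |
| Output repudiation      | Yes  | Yes | Yes  |
+-------------------------+------+-----+------+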

* AFiR does not provide hardware-rooted proof that inference ran inside an isolated enclave. For deployments requiring environment isolation proof, TEE entries SHOULD be used for the relevant chain segments, potentially coexisting with AFiR entries as described in Section 9.

10.4. Key Management

AFiR signing keys MUST be generated as ML-DSA-65 key pairs per FIPS 204, stored in a key management system with access logging, rotated on a configurable schedule (90 days RECOMMENDED), and bound to a single operator identity per key pair. Public keys SHOULD be published in a discoverable registry to allow verifiers to retrieve the full public key given the public_key_hint in an AFiR receipt entry.
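
Resolving a full public key from a public_key_hint could work along the lines of the sketch below. The registry is modelled as a plain in-memory collection of key bytes, and the function names are hypothetical; a deployed registry would be a network service. The hint length is derived from the hint string itself, and the default of 8 bytes follows the 16-character hint shown in the entry example in Section 4.3.

```python
from typing import Iterable


def public_key_hint(public_key: bytes, hint_bytes: int = 8) -> str:
    """Hex of the leading bytes of a public key."""
    return public_key[:hint_bytes].hex()


def resolve_public_key(hint: str, registry: Iterable[bytes]) -> bytes:
    """Return the unique registered key whose leading bytes match the hint."""
    hint = hint.lower()
    hint_bytes = len(hint) // 2
    matches = [key for key in registry
               if public_key_hint(key, hint_bytes) == hint]
    if len(matches) != 1:
        # Zero matches: unknown key. Several matches: the hint is ambiguous
        # and the verifier needs another way to identify the key.
        raise LookupError(f"{len(matches)} keys match hint {hint}")
    return matches[0]
```

The hint only narrows the search; a verifier still has to check the signature against the full key that the registry returns.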

11. IANA Considerations

This document requests registration of inference_proof_type values, including afir_ml_dsa_65 as used in Sections 6 and 7, for use with the inference_root claim defined in [I-D.draft-mw-spice-inference-chain].

No new JWT claims are defined by this document. The existing inference_root, inference_proof_type, and inference_registry claims defined in [I-D.draft-mw-spice-inference-chain] are used without modification.

12. References

12.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8785]
Rundgren, A., Jordan, B., and S. Erdtman, "JSON Canonicalization Scheme (JCS)", RFC 8785, <https://www.rfc-editor.org/info/rfc8785>.
[FIPS204]
National Institute of Standards and Technology, "Module-Lattice-Based Digital Signature Standard", NIST FIPS 204, <https://csrc.nist.gov/pubs/fips/204/final>.
[I-D.draft-mw-spice-inference-chain]
Krishnan, R., Prasad, A., Lopez, D., and S. Addepalli, "Cryptographically Verifiable Inference Chain for AI Agent Computational Provenance", Work in Progress, Internet-Draft, draft-mw-spice-inference-chain-00, <https://datatracker.ietf.org/doc/html/draft-mw-spice-inference-chain-00>.
[I-D.draft-mw-spice-actor-chain]
Prasad, A., Krishnan, R., Lopez, D., and S. Addepalli, "Cryptographically Verifiable Actor Chains for OAuth 2.0 Token Exchange", Work in Progress, Internet-Draft, draft-mw-spice-actor-chain-05, <https://datatracker.ietf.org/doc/html/draft-mw-spice-actor-chain-05>.
[I-D.draft-mw-spice-intent-chain]
Krishnan, R., Prasad, A., Lopez, D., and S. Addepalli, "Cryptographically Verifiable Intent Chain for AI Agent Content Provenance", Work in Progress, Internet-Draft, draft-mw-spice-intent-chain-00, <https://datatracker.ietf.org/doc/html/draft-mw-spice-intent-chain-00>.

12.2. Informative References

[RFC9334]
Birkholz, H., "Remote ATtestation procedureS (RATS) Architecture", RFC 9334, <https://www.rfc-editor.org/info/rfc9334>.

Author's Address

Steve Rotzin
Hive / AFiR