| Internet-Draft | SPICE-INFERENCE-CHAIN | March 2026 |
| Krishnan, et al. | Expires 19 September 2026 | [Page] |
This document defines the inference_root claim as a companion to the actor_chain claim ({{!I-D.draft-mw-spice-actor-chain}}) and the intent_root claim ({{!I-D.draft-mw-spice-intent-chain}}). While the actor chain addresses delegation provenance (WHO) and the intent chain addresses content provenance (WHAT), the inference chain addresses computational provenance (HOW) — providing cryptographic proof that a claimed AI model actually performed the inference that produced a given output.¶
The inference chain leverages two complementary mechanisms: Zero-Knowledge Machine Learning (ZKML) proofs for mathematical certainty, and Trusted Execution Environment (TEE) attestation quotes for production-scale AI workloads. The full inference chain is stored as ordered logs, with only the Merkle root included in the OAuth token for efficiency.¶
Together, the three chains — actor, intent, and inference — form a complete "Truth Stack" for autonomous AI agent governance.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 19 September 2026.¶
Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The Actor Chain ({{!I-D.draft-mw-spice-actor-chain}}) proves WHO delegated to whom. The Intent Chain ({{!I-D.draft-mw-spice-intent-chain}}) proves WHAT was produced and how it was transformed. Neither addresses a critical remaining gap: was the claimed computational process actually used to produce the output?¶
In AI agent workflows, an agent may claim to use a high-capability, safety-aligned model (e.g., a frontier model with RLHF alignment) while actually running a cheaper, unaligned model. This "Model Masquerading" attack is undetectable by content hashes alone — the output hash in the intent chain proves the output exists, but not that the claimed model produced it.¶
Concrete threats:¶
This specification completes the three-axis "Truth Stack":¶
| Specification | Axis | Question Answered | STRIDE Coverage |
|---|---|---|---|
| Actor Chain ({{!I-D.draft-mw-spice-actor-chain}}) | Identity | WHO delegated to whom? | Spoofing, Repudiation, Elevation of Privilege |
| Intent Chain ({{!I-D.draft-mw-spice-intent-chain}}) | Content | WHAT was produced and transformed? | Repudiation, Tampering |
| Inference Chain (this document) | Computation | HOW was the output computed? | Spoofing (computational), Tampering (model) |
| Chain | Plane | Token Content | Full Chain | Primary Consumer |
|---|---|---|---|---|
| Actor | Data Plane | Full chain inline | In token | Every Relying Party (real-time authorization) |
| Intent | Audit Plane | Merkle root only | External registry | Audit systems, forensic investigators |
| Inference | Audit Plane | Merkle root only | External registry | Auditors, compliance systems |
The three chains are independent and composable:¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 {{!RFC2119}} {{!RFC8174}} when, and only when, they appear in all capitals, as shown here.¶
This section specifies the structure of the inference chain and the inference_root claim.¶
The inference chain supports two primary proof types, each with different trust assumptions and performance characteristics:¶
| Property | ZKML Proof | TEE Quote |
|---|---|---|
| Trust Basis | Mathematics (cryptographic hardness) | Hardware manufacturer root of trust |
| Verification | Deterministic, no external dependency | Requires manufacturer PKI (Intel/AMD/NVIDIA) |
| Model Scale | Currently limited (~100M parameters) | Production-scale LLMs (100B+ parameters) |
| Proof Generation | Minutes to hours per inference | Real-time (millisecond overhead) |
| Proof Size | ~200 bytes (Groth16), ~50KB (STARKs) | ~2KB (Intel TDX), ~4KB (NVIDIA) |
| Privacy | Zero-knowledge: hides model weights | Enclave-bound: hardware isolation |
Deployments SHOULD select the proof type based on their performance requirements and trust model.¶
ZKML proofs provide mathematical certainty that a specific model produced a specific output. They are constructed by encoding the neural network's forward pass as an arithmetic circuit and generating a zero-knowledge proof of correct execution.¶
Proof Statement: "There exist model weights W such that hash(W) = model_fingerprint AND forward_pass(W, input) = output."¶
Properties:¶
Applicable Proof Systems:¶
| System | Proof Size | Verification Time | Trusted Setup |
|---|---|---|---|
| Groth16 | ~200 bytes | 5ms | Required (per-circuit) |
| PLONK | ~400 bytes | 10ms | Universal (one-time) |
| STARKs | 50KB | 50ms | None (transparent) |
| Halo2 | 5KB | 15ms | None (recursive) |
Entry Structure:¶
{
"type": "zkml_proof",
"sub": "spiffe://example.com/agent/analyst",
"proof_system": "groth16",
"model_fingerprint": "sha256:weights_hash...",
"model_id": "analyst-model-v3.2",
"input_hash": "sha256:input_data...",
"output_hash": "sha256:output_data...",
"intent_entry_ref": 3,
"proof": "base64:proof_bytes...",
"verification_key_hash": "sha256:vk_hash...",
"verification_key_registry":
"https://vk-registry.example.com/keys/analyst-v3.2",
"iat": 1700000030,
"inference_digest": "sha256:...",
"inference_sig": "eyJhbGci..."
}
¶
TEE attestation quotes provide hardware-rooted proof that inference was performed within a verified enclave. The hardware generates a signed "quote" containing measurements of the code, data, and configuration running inside the enclave.¶
Proof Statement: "Hardware H with platform certificate C attests that enclave E with measurements M produced output O from input I."¶
Properties:¶
Supported Platforms:¶
| Platform | TEE Technology | GPU Support | Quote Format |
|---|---|---|---|
| Intel | TDX (Trust Domain Extensions) | Via NVIDIA H100 CC | DCAP Quote v4 |
| AMD | SEV-SNP | Via NVIDIA H100 CC | Attestation Report |
| NVIDIA | H100 Confidential Computing | Native | GPU Attestation |
| ARM | CCA (Confidential Compute Architecture) | Planned | CCA Token |
Entry Structure:¶
{
"type": "tee_attestation",
"sub": "spiffe://example.com/agent/analyst",
"platform": "nvidia_h100_cc",
"model_fingerprint": "sha256:weights_hash...",
"model_id": "analyst-model-v3.2",
"input_hash": "sha256:input_data...",
"output_hash": "sha256:output_data...",
"intent_entry_ref": 3,
"quote": {
"format": "nvidia_gpu_attestation_v1",
"enclave_measurement": "sha384:enclave_mr...",
"firmware_version": "H100.96.00.5E.00.01",
"platform_cert_chain": [
"base64:nvidia_device_cert...",
"base64:nvidia_intermediate_cert..."
],
"report_data": "sha256:binding_hash...",
"signature": "base64:hw_signature..."
},
"por_ref": "spiffe://example.com/wia/gpu-node-7",
"iat": 1700000030,
"inference_digest": "sha256:...",
"inference_sig": "eyJhbGci..."
}
¶
For high-assurance deployments, both proof types MAY be combined for a single inference operation:¶
The inference chain entry for a hybrid proof references both:¶
{
"type": "hybrid_proof",
"sub": "spiffe://example.com/agent/analyst",
"tee_entry_ref": 4,
"zkml_entry_ref": 5,
"model_fingerprint": "sha256:weights_hash...",
"output_hash": "sha256:output_data...",
"intent_entry_ref": 3,
"iat": 1700000035,
"inference_digest": "sha256:...",
"inference_sig": "eyJhbGci..."
}
¶
All inference chain entries share common fields:¶
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | REQUIRED | Entry type: zkml_proof, tee_attestation, or hybrid_proof |
| sub | string | REQUIRED | SPIFFE ID of the agent that performed inference |
| model_fingerprint | string | REQUIRED | SHA-256 hash of model weights + architecture |
| model_id | string | REQUIRED | Human-readable model identifier and version |
| output_hash | string | REQUIRED | SHA-256 hash of inference output (MUST match intent chain entry) |
| intent_entry_ref | number | REQUIRED | Index of corresponding entry in intent chain |
| iat | number | REQUIRED | Timestamp when proof was generated |
| inference_digest | string | REQUIRED | Cumulative hash for Merkle tree |
| inference_sig | string | REQUIRED | Signature over inference_digest |
The sub field identifies the agent that performed inference. It corresponds to the sub field of the matching actor chain entry ({{!I-D.draft-mw-spice-actor-chain}}).¶
The intent_entry_ref field binds each inference proof to a specific intent chain entry. The output_hash in the inference entry MUST match the output_hash (for agent outputs) in the referenced intent chain entry. This binding ensures that the computational proof corresponds to the actual content recorded in the intent chain.¶
Intent Chain:              Inference Chain:
┌──────────────────┐       ┌──────────────────┐
│ Entry 0          │       │ Entry 0          │
│ type: non_det    │◄──────│ type: tee_quote  │
│ output: sha:abc  │       │ output: sha:abc  │
│                  │       │ intent_ref: 0    │
└──────────────────┘       └──────────────────┘
┌──────────────────┐
│ Entry 1          │       (no proof needed
│ type: determ.    │        for deterministic
│                  │        filters)
└──────────────────┘
┌──────────────────┐       ┌──────────────────┐
│ Entry 3          │       │ Entry 1          │
│ type: non_det    │◄──────│ type: zkml       │
│ output: sha:def  │       │ output: sha:def  │
│                  │       │ intent_ref: 3    │
└──────────────────┘       └──────────────────┘¶
The inference registry stores immutable inference chain entries as ordered logs, following the same architecture as the intent registry.¶
Properties:¶
Inference registry entries MUST NOT contain OAuth tokens, bearer credentials, or signing keys. Entries contain only inference proofs, content hashes, metadata, and agent identities. The token references the registry via the inference_registry claim; the registry MUST NOT store or reference the token itself.¶
Implementations SHOULD use an append-only log with tamper-evident guarantees and partitioned, ordered retrieval by session ID.¶
Log Structure:¶
{
"session_id": "sess-uuid-12345",
"offset": 0,
"entry": {
"type": "tee_attestation",
"sub": "spiffe://example.com/agent/A",
"model_fingerprint": "sha256:weights...",
"model_id": "agent-A-model-v2",
"output_hash": "sha256:abc...",
"intent_entry_ref": 0,
"quote": { "..." : "..." },
"iat": 1700000010,
"inference_digest": "sha256:...",
"inference_sig": "eyJ..."
}
}
¶
The inference chain Merkle tree follows the same construction algorithm as the intent chain (defined in {{!I-D.draft-mw-spice-intent-chain}} Section 5.3). Leaf nodes are the SHA-256 hashes of canonically serialized inference chain entries. The resulting root hash is included in the OAuth token as the inference_root claim.¶
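The construction above can be sketched in Python. The exact canonicalization and odd-level handling are governed by the intent chain draft; the choices below (sorted-key compact JSON, duplicating the last node at odd levels) are illustrative assumptions only.

```python
import hashlib
import json

def leaf_hash(entry: dict) -> bytes:
    # Leaves are SHA-256 hashes of canonically serialized entries.
    # Sorted keys + compact separators is one canonicalization choice;
    # the normative rule is defined in the intent chain draft.
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    # Pairwise-hash each level; duplicate the last node when a level
    # has odd length (a common convention, assumed here).
    if not leaves:
        return hashlib.sha256(b"").digest()
    level = leaves
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]
```

The resulting 32-byte root, hex-encoded, is what a token would carry as the inference_root claim.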
Only the Merkle root is included in the OAuth token:¶
{
"inference_root": "sha256:xyz789...",
"inference_proof_type": "groth16",
"inference_registry":
"https://proof-log.example.com"
}
¶
| Field | Type | Required | Description |
|---|---|---|---|
| inference_root | string | REQUIRED | Merkle root hash of inference chain |
| inference_proof_type | string | OPTIONAL | Primary proof algorithm used (e.g., groth16, tee_tdx, hybrid) |
| inference_registry | string | REQUIRED | URI of inference registry for proof retrieval |
The complete token combines session, actor chain, intent chain, and inference chain:¶
{
"iss": "https://auth.example.com",
"sub": "user-alice",
"aud": "https://api.example.com",
"jti": "tok-eee-12345",
"sid": "sess-uuid-12345",
"iat": 1700000000,
"exp": 1700003600,
"session": {
"session_id": "sess-uuid-12345",
"type": "human_initiated",
"initiator": "user-alice",
"approval_ref": "approval-uuid-789",
"max_chain_depth": 5
},
"actor_chain": [
{
"sub":
"spiffe://example.com/agent/orchestrator",
"iss": "https://auth.example.com",
"iat": 1700000010
},
{
"sub":
"spiffe://example.com/agent/analyst",
"iss": "https://auth.example.com",
"iat": 1700000030
}
],
"intent_root": "sha256:abc123...",
"intent_registry":
"https://intent-log.example.com/sessions/sess-uuid-12345",
"inference_root": "sha256:xyz789...",
"inference_proof_type": "tee_h100",
"inference_registry":
"https://proof-log.example.com/sessions/sess-uuid-12345"
}
¶
| Claim | Type | Description |
|---|---|---|
| inference_root | string | Merkle root hash of inference chain |
| inference_proof_type | string | Primary proof algorithm (e.g., groth16, tee_tdx, tee_h100, hybrid) |
| inference_registry | string | URI for retrieving full inference chain or proofs (REQUIRED) |
A Relying Party receiving a token with all three chains performs a "Three-Point Check":¶
To verify a ZKML inference entry:¶
1. Retrieve the verification key from the verification_key_registry.¶
2. Check that hash(verification_key) == verification_key_hash.¶
3. Run verify(proof, verification_key, public_inputs) where public_inputs = (model_fingerprint, input_hash, output_hash).¶
4. Confirm that the output_hash matches the corresponding intent chain entry.¶
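The steps above can be sketched as a verifier routine. Here fetch_key and zk_verify are hypothetical stand-ins for a deployment's registry client and proof-system verifier (e.g., a Groth16 library); only the key-hash check and the intent-chain binding check are concrete in this sketch.

```python
import hashlib

def verify_zkml_entry(entry, intent_entry, fetch_key, zk_verify):
    # Step 1: retrieve the verification key from the registry.
    vk = fetch_key(entry["verification_key_registry"])
    # Step 2: the key must hash to the committed verification_key_hash.
    if "sha256:" + hashlib.sha256(vk).hexdigest() != entry["verification_key_hash"]:
        return False
    # Step 3: verify the proof against the public inputs.
    public_inputs = (entry["model_fingerprint"],
                     entry["input_hash"],
                     entry["output_hash"])
    if not zk_verify(entry["proof"], vk, public_inputs):
        return False
    # Step 4: the proved output must match the intent chain entry.
    return entry["output_hash"] == intent_entry["output_hash"]
```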
To verify a TEE attestation entry:¶
1. Verify that the enclave_measurement matches the expected measurement for the claimed model.¶
2. Verify that the report_data binds the quote to the specific input/output hashes.¶
3. Verify that the output_hash matches the corresponding intent chain entry.¶
Inference verification extends the tiered strategy from {{!I-D.draft-mw-spice-intent-chain}}:¶
| Risk Level | Actor Chain | Intent Chain | Inference Chain | Use Case |
|---|---|---|---|---|
| Low | Sync | Skip | Skip | Read operations |
| Medium | Sync | Cached proof | Skip | Create/update |
| High | Sync | Full | TEE quote check | Financial decisions |
| Critical | Sync | Full | Full ZKML + TEE | Regulatory, high-stakes |
require_inference_proofs {
    intent_chain := get_intent_chain(input.intent_root)
    inference_chain := get_inference_chain(input.inference_root)

    # Every non-deterministic (agent-generated) intent entry must be
    # covered by at least one inference proof that references it.
    agent_outputs := [i | intent_chain[i].type == "non_deterministic"]
    every i in agent_outputs {
        some j
        inference_chain[j].intent_entry_ref == i
    }
}
¶
require_tee_for_finance {
    # Financial actions require hardware-attested inference throughout.
    input.action in ["transfer", "approve", "underwrite"]
    inference_chain := get_inference_chain(input.inference_root)
    every entry in inference_chain {
        entry.type in ["tee_attestation", "hybrid_proof"]
    }
}
¶
block_deprecated_models {
    # Holds only when no proof references a model on the deprecation list.
    inference_chain := get_inference_chain(input.inference_root)
    every entry in inference_chain {
        not entry.model_id in data.deprecated_models
    }
}
¶
The inference chain claims are designed for consumption by policy engines such as Open Policy Agent (OPA). A policy engine SHOULD:¶
Verify that the inference_root and inference_registry claims are present and non-empty.¶
| Threat | Attack | Mitigation |
|---|---|---|
| Spoofing (Computational) | Model substitution | model_fingerprint binds proof to specific weights |
| Tampering (Model) | Weight modification | Hash of weights in proof statement |
| Repudiation | "I didn't run that model" | Signed inference entries with agent SPIFFE ID |
| Replay | Reuse old proof for new output | input_hash + output_hash binding + iat timestamp |
| Environment Spoofing | Claim TEE without using one | TEE quote with hardware-signed measurements |
For ZKML proofs, the verification key uniquely identifies the circuit (and therefore the model architecture). Verification keys:¶
Inference proofs MUST include an iat timestamp. Verifiers SHOULD reject proofs older than a configurable maximum age. For TEE quotes, the platform firmware version SHOULD be checked against the manufacturer's latest TCB baseline to ensure the hardware is patched.¶
Inference registry entries MUST NOT contain OAuth tokens, bearer credentials, or signing keys. The relationship between tokens and registry entries is one-directional: the token references the registry via the inference_registry URI claim, but the registry MUST NOT store or reference the token. This separation ensures that compromise of the inference registry does not expose bearer credentials.¶
STARK-based proofs can be tens of kilobytes. The Merkle tree architecture ensures this does not affect token size (only the root hash is in the JWT). However, deployments using STARKs SHOULD consider:¶
The inference chain's TEE attestation entries are complementary to, but distinct from, the Proof of Residency (PoR) in actor chain entries ({{!I-D.draft-mw-spice-transitive-attestation}}):¶
Both MAY reference the same underlying hardware, but serve different verification purposes. The por_ref field in TEE attestation entries allows cross-referencing.¶
ZKML proofs provide zero-knowledge privacy for model weights — the verifier learns that the output was produced by a model with a specific fingerprint, but learns nothing about the actual weights. This is valuable for proprietary models where the weights are confidential.¶
TEE attestation provides enclave isolation — the model weights are protected from the host OS, but the enclave measurement reveals information about the model binary.¶
Both proof types include input_hash rather than the raw input, providing input privacy by default. For applications requiring stronger guarantees, ZKML proofs can be constructed to hide the input entirely (proving only that "some valid input" produced the output).¶
Inference chain entries MAY use Selective Disclosure (SD-JWT) {{!I-D.ietf-oauth-selective-disclosure-jwt}} to hide sensitive fields such as model_id or platform from certain verifiers while maintaining proof integrity.¶
As of this writing, ZKML is practical for models up to approximately 100 million parameters. Active research is scaling this boundary through optimized circuit designs and recursive proof composition.¶
Deployments SHOULD plan for ZKML scalability improvements and MAY use TEE attestation as a near-term bridge.¶
For TEE-based inference proofs:¶
The report_data field MUST contain sha256(input_hash || output_hash) to bind the proof to specific content.¶
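The report_data binding can be checked as follows; this sketch assumes "||" means byte concatenation of the two prefixed hash strings, which deployments would need to pin down precisely.

```python
import hashlib

def expected_report_data(input_hash: str, output_hash: str) -> str:
    # report_data = sha256(input_hash || output_hash), reading '||' as
    # concatenation of the encoded hash strings (an assumption here).
    digest = hashlib.sha256((input_hash + output_hash).encode()).hexdigest()
    return "sha256:" + digest

def report_data_matches(quote: dict, input_hash: str, output_hash: str) -> bool:
    # The quote's report_data must equal the recomputed binding.
    return quote["report_data"] == expected_report_data(input_hash, output_hash)
```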
In deployments involving multiple Authorization Servers, the inference registry is shared across all participating ASes under the same session partition. Each AS appends inference chain entries under the sid partition established at session initiation. The inference_root in each successive token is recomputed over all proof entries accumulated so far — it therefore differs at each hop as the chain grows. This is expected behavior: a relying party receiving a later token will see a larger inference_root than one receiving an earlier token in the same session. ASes do not need to share keys or coordinate directly — the session partition and append-only log semantics provide the necessary consistency, following the same pattern as the Actor Chain Registry ({{!I-D.draft-mw-spice-actor-chain}}) and Intent Registry ({{!I-D.draft-mw-spice-intent-chain}}).¶
Inference registry unavailability does not affect data-plane operation — the token's AS-signed inference_root is sufficient for request-time policy decisions. Per-entry proof verification is deferred to the audit plane.¶
However, inference proofs are uniquely valuable because they cannot be regenerated after the fact — the exact model state, input, and execution environment may no longer be available. Deployments SHOULD:¶
A federated IAM/IdM platform (e.g., Keycloak, Microsoft Entra, Okta, PingFederate) MAY host the inference registry alongside the Actor Chain Registry and Intent Registry under the same session partition — see {{!I-D.draft-mw-spice-actor-chain}} Section "Registry Hosting" for detailed requirements.¶
Inference proof payloads are significantly larger than actor chain entries (~0.5-1KB) or intent chain entries (~0.5-1KB):¶
| Proof Type | Typical Size | Storage Implication |
|---|---|---|
| TEE attestation quote | 2-4KB | Compatible with most IAM data stores |
| Groth16 ZK proof | ~200 bytes | Compatible with most IAM data stores |
| STARK proof | 50KB | May exceed IAM data store blob limits |
Most IAM/IdM platforms are not designed to store large binary blobs. Deployments that include STARK proofs SHOULD either:¶
Deployments SHOULD select proof types based on their latency requirements:¶
The inference chain follows the same Merkle root architecture as the intent chain (see {{!I-D.draft-mw-spice-intent-chain}} Design Rationale for the detailed comparison). The same trade-offs apply, with additional motivation: inference proofs are large (STARK proofs ~50KB, TEE quotes ~2-4KB), making inline embedding in tokens impractical. The Merkle root enables selective verification of individual proofs using O(log n) sibling hashes, which is critical for the tiered verification strategy.¶
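Selective verification with O(log n) sibling hashes can be sketched as an inclusion-proof check; the sibling-side convention ('L'/'R') is illustrative and must match the registry's actual tree construction.

```python
import hashlib

def verify_inclusion(leaf: bytes, siblings: list[tuple[bytes, str]],
                     root: bytes) -> bool:
    # Recompute the root from a leaf and its O(log n) sibling hashes.
    # Each sibling carries its side: 'L' if it is the left child at
    # that level, 'R' if it is the right child.
    node = leaf
    for sib, side in siblings:
        if side == "L":
            node = hashlib.sha256(sib + node).digest()
        else:
            node = hashlib.sha256(node + sib).digest()
    return node == root
```

An auditor verifying one STARK proof thus fetches only that entry plus its sibling path, never the full registry log.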
This document requests registration of the following claims in the "JSON Web Token Claims" registry established by {{!RFC7519}}:¶
Claim Name: inference_root¶
Claim Description: Merkle root hash of the inference chain.¶
Change Controller: IETF¶
Specification Document(s): [this document]¶
Claim Name: inference_proof_type¶
Claim Description: Primary proof algorithm used in the inference chain.¶
Change Controller: IETF¶
Specification Document(s): [this document]¶
Claim Name: inference_registry¶
Claim Description: URI of the inference registry for proof retrieval.¶
Change Controller: IETF¶
Specification Document(s): [this document]¶
This document requests registration of the following claims in the "CBOR Web Token (CWT) Claims" registry established by {{!RFC8392}}:¶
Claim Name: inference_root¶
Claim Description: Merkle root hash of the inference chain.¶
CBOR Key: TBD¶
Claim Type: tstr¶
Change Controller: IETF¶
Specification Document(s): [this document]¶
Claim Name: inference_registry¶
Claim Description: URI of the inference registry for proof retrieval.¶
CBOR Key: TBD (e.g., 61)¶
Claim Type: tstr¶
Change Controller: IETF¶
Specification Document(s): [this document]¶
Claim Name: inference_proof_type¶
Claim Description: Primary proof algorithm used in the inference chain.¶
CBOR Key: TBD (e.g., 62)¶
Claim Type: tstr¶
Change Controller: IETF¶
Specification Document(s): [this document]¶
{
"iss": "https://auth.example.com",
"sub": "user-alice",
"aud": "https://api.example.com",
"jti": "tok-fff-67890",
"iat": 1700000000,
"exp": 1700003600,
"session": {
"session_id": "sess-uuid-12345",
"type": "human_initiated",
"initiator": "user-alice",
"approval_ref": "approval-uuid-789",
"max_chain_depth": 5
},
"actor_chain": [
{
"sub":
"spiffe://example.com/agent/orchestrator",
"iss": "https://auth.example.com",
"iat": 1700000010,
"scope": "ticket:*",
"chain_digest": "sha256:aaa...",
"chain_sig": "eyJhbGci..."
},
{
"sub":
"spiffe://example.com/agent/analyst",
"iss": "https://auth.example.com",
"iat": 1700000030,
"scope": "analysis:run",
"chain_digest": "sha256:bbb...",
"chain_sig": "eyJhbGci..."
}
],
"intent_root": "sha256:abc123...",
"intent_registry":
"https://intent-log.example.com/sessions/sess-uuid-12345",
"inference_root": "sha256:xyz789...",
"inference_proof_type": "tee_h100",
"inference_registry":
"https://proof-log.example.com/sessions/sess-uuid-12345"
}
¶