<?xml version="1.0" encoding="utf-8"?>
<!-- name="GENERATOR" content="github.com/mmarkdown/mmark Mmark Markdown Processor - mmark.miek.nl" -->
<rfc version="3" ipr="trust200902" docName="draft-xkumakichi-xaip-receipts-03" submissionType="independent" category="info" xml:lang="en" xmlns:xi="http://www.w3.org/2001/XInclude" indexInclude="true">

<front>
<title abbrev="XAIP Receipts">Signed Execution Receipts for AI Agent Tool Calls (XAIP Receipts)</title><seriesInfo value="draft-xkumakichi-xaip-receipts-03" stream="independent" status="informational" name="Internet-Draft"></seriesInfo>
<author fullname="xkumakichi"><organization></organization><address><postal><street></street>
</postal><email>kuma.github@gmail.com</email>
</address></author><date year="2026" month="July" day="2"></date>
<area>Applications and Real-Time</area>
<workgroup>Independent Submission</workgroup>
<keyword>AI agents</keyword>
<keyword>tool calls</keyword>
<keyword>signed receipts</keyword>
<keyword>co-signed receipts</keyword>
<keyword>trust</keyword>
<keyword>DID</keyword>

<abstract>
<t>This document defines a wire format for signed execution receipts produced by AI agents when they invoke tools, services, or other agents. A receipt records the minimum facts needed to make a trust decision about a future call: who acted, who delegated, what tool was used, whether the call succeeded, how long it took, and how the call's inputs and outputs are identified (without disclosing their contents).</t>
<t>A distinguishing property of the format is optional caller co-signature over the same canonical per-call record. When both signatures validate, the receipt cryptographically binds the identified Caller and Agent to the same canonical payload. This does not establish that either party independently observed every field, that the recorded execution was correct, or that the parties did not collude. A receipt may carry the Agent signature alone, and consumers may distinguish the two cases according to deployment policy.</t>
<t>The format is intentionally tool-system-agnostic. The same receipt structure can be emitted by MCP (Model Context Protocol) servers, LangChain.js callback handlers, OpenAI tool-calling loops, HTTP clients, or proprietary agent runtimes. Receipts use Ed25519 signatures over a JCS-canonicalized payload, and identities are W3C Decentralized Identifiers (DIDs).</t>
<t>This revision introduces an explicit wire-format version (formatVersion), pins the hash preimage profile, and ships executable conformance test vectors.</t>
<t>Scoring policy, aggregation architecture, and reactive behavior in response to receipts are explicitly out of scope and left to deployments.</t>
</abstract>

</front>

<middle>

<section anchor="introduction"><name>Introduction</name>

<section anchor="motivation"><name>Motivation</name>
<t>AI agents increasingly act on behalf of users: they pick tools, call APIs, delegate to other agents, and -- in some deployments -- participate in transaction workflows. Each of those actions is preceded by an implicit trust decision: which tool should I use, and is it likely to do what I expect?</t>
<t>Today, that decision is mostly answered by upstream proxies -- whether the tool's name appears in a model's training data, whether a registry surfaces it, whether a platform recommends it. None of these proxies records what the tool actually did in real calls. There is no widely deployed, interoperable record format that an agent (or an agent-payment protocol, or an audit system) can use to look back and answer &quot;what happened the last N times this tool was called?&quot;</t>
<t>This document defines such a format. It is intentionally narrow: it covers the wire format for one receipt. How receipts are stored, aggregated, queried, scored, or reacted to is a deployment-policy concern and is out of scope.</t>
</section>

<section anchor="design-principles"><name>Design Principles</name>

<ul spacing="compact">
<li>Wire format only. Scoring models, aggregation topologies, and decision logic are deployment choices, not protocol requirements.</li>
<li>Tool-system-agnostic. The same receipt can be produced by MCP, LangChain, OpenAI tool calling, plain HTTP, or proprietary runtimes.</li>
<li>Content-minimizing. Receipts identify inputs and outputs by hash rather than embedding their contents. This reduces direct disclosure but does not make low-entropy or guessable values private; see the Security Considerations.</li>
<li>Independently verifiable. Anyone holding the receipt and the public keys can verify the signatures without consulting any registry or trusted third party.</li>
<li>Mechanically checkable. Every processing rule in this document is exercised by the executable test vectors described in the appendices; a conforming implementation can be validated byte-for-byte without interpreting prose.</li>
<li>Co-signature supported. The Caller can sign the same canonical payload as the Executor (Agent). A receipt verifies as co-signed only when valid signatures for both the identified Caller and Agent are present. The degree to which those signatures represent operationally independent assertions depends on the control separation described in the Security Considerations.</li>
</ul>
</section>

<section anchor="out-of-scope"><name>Out of Scope</name>
<t>This document does NOT define:</t>

<ul spacing="compact">
<li>A scoring model. Trust scores derived from receipts are deployment policy.</li>
<li>An aggregation architecture. Receipts can be stored locally, federated, anchored, or relayed in any pattern.</li>
<li>A query API. Consumers may serve receipts and/or derived data over any protocol they choose.</li>
<li>Identity priors. If a deployment chooses to weight different DID methods differently, that is deployment policy.</li>
<li>A specific transport. Receipts may be exchanged over HTTP, MCP, message queues, or any other carrier.</li>
</ul>
</section>

<section anchor="conventions-and-definitions"><name>Conventions and Definitions</name>
<t>The key words &quot;MUST&quot;, &quot;MUST NOT&quot;, &quot;REQUIRED&quot;, &quot;SHALL&quot;, &quot;SHALL NOT&quot;, &quot;SHOULD&quot;, &quot;SHOULD NOT&quot;, &quot;RECOMMENDED&quot;, &quot;NOT RECOMMENDED&quot;, &quot;MAY&quot;, and &quot;OPTIONAL&quot; in this document are to be interpreted as described in BCP 14 <xref target="RFC2119"></xref> <xref target="RFC8174"></xref> when, and only when, they appear in all capitals, as shown here.</t>

<section anchor="terminology"><name>Terminology</name>

<dl spacing="compact">
<dt>Agent:</dt>
<dd>An automated system, typically an AI agent, that invokes tools, services, or other agents on a principal's behalf.</dd>
<dt>Caller:</dt>
<dd>The party that delegated the tool call to the Agent. Often (but not always) the same legal entity as the Agent's principal.</dd>
<dt>Tool:</dt>
<dd>A named operation invoked by the Agent. The tool implementation may be local code, an MCP server, an HTTP API, a sub-agent, or any callable target.</dd>
<dt>Receipt:</dt>
<dd>A signed record of a single Tool execution attempt.</dd>
<dt>Executor signature:</dt>
<dd>The signature produced by the Agent that ran the tool.</dd>
<dt>Caller signature:</dt>
<dd>The signature produced by the Caller over the same canonicalized payload as the Executor signature.</dd>
<dt>DID:</dt>
<dd>Decentralized Identifier, as defined in the W3C DID Core specification <xref target="DID-CORE"></xref>.</dd>
<dt>Legacy receipt:</dt>
<dd>A receipt produced before this revision, carrying no <tt>formatVersion</tt> member (see <xref target="format-versioning"></xref>).</dd>
</dl>
</section>
</section>
</section>

<section anchor="receipt-structure"><name>Receipt Structure</name>
<t>A receipt is a JSON object with the following fields:</t>
<table>
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Description</th>
</tr>
</thead>

<tbody>
<tr>
<td><tt>formatVersion</tt></td>
<td>string</td>
<td>yes (this revision)</td>
<td>MUST be <tt>&quot;1&quot;</tt> for receipts conforming to this revision. Part of the canonical payload (<xref target="canonical-payload"></xref>). Receipts without this member are legacy receipts; see <xref target="format-versioning"></xref>.</td>
</tr>

<tr>
<td><tt>agentDid</tt></td>
<td>string (DID)</td>
<td>yes</td>
<td>The Agent that executed the tool.</td>
</tr>

<tr>
<td><tt>callerDid</tt></td>
<td>string (DID)</td>
<td>yes</td>
<td>The Caller that delegated the tool call. MUST be present on <tt>formatVersion</tt> <tt>&quot;1&quot;</tt> receipts; it MAY equal <tt>agentDid</tt> when there is no delegation.</td>
</tr>

<tr>
<td><tt>toolName</tt></td>
<td>string</td>
<td>yes</td>
<td>A stable identifier for the tool. Format is opaque to this spec.</td>
</tr>

<tr>
<td><tt>taskHash</tt></td>
<td>string (64-char hex, lowercase)</td>
<td>yes</td>
<td>A SHA-256 hash of the task input, computed per the preimage profile (<xref target="preimage-profile"></xref>).</td>
</tr>

<tr>
<td><tt>resultHash</tt></td>
<td>string (64-char hex, lowercase)</td>
<td>yes</td>
<td>A SHA-256 hash of the task output, computed per the preimage profile (<xref target="preimage-profile"></xref>). When <tt>success</tt> is <tt>false</tt> and no output is committed to, the value MUST be the empty-input sentinel (<xref target="preimage-profile"></xref>); when a canonical failure description is committed to instead, the preimage profile applies to it.</td>
</tr>

<tr>
<td><tt>success</tt></td>
<td>boolean</td>
<td>yes</td>
<td><tt>true</tt> if the tool call satisfied the agent's success criterion, <tt>false</tt> otherwise.</td>
</tr>

<tr>
<td><tt>latencyMs</tt></td>
<td>integer</td>
<td>yes</td>
<td>Wall-clock time from invocation to completion, in milliseconds. MUST be an integer in the range [0, 2^53 - 1] (the I-JSON exactly-representable range, which JCS serializes deterministically).</td>
</tr>

<tr>
<td><tt>failureType</tt></td>
<td>string</td>
<td>yes</td>
<td>One of the values defined in <xref target="failure-type-classification"></xref> when <tt>success</tt> is <tt>false</tt>. When <tt>success</tt> is <tt>true</tt>, the value MUST be the empty string <tt>&quot;&quot;</tt>. The member is always present on the wire and in the canonical payload as a (possibly empty) string; <tt>null</tt> is not used and the member MUST NOT be omitted.</td>
</tr>

<tr>
<td><tt>timestamp</tt></td>
<td>string (RFC 3339)</td>
<td>yes</td>
<td>UTC timestamp of completion. <tt>formatVersion</tt> <tt>&quot;1&quot;</tt> producers SHOULD render it with exactly three fractional-second digits and the <tt>Z</tt> designator (e.g. <tt>2026-07-02T01:23:45.678Z</tt>); a single rendering keeps signature-level deduplication (<xref target="replay"></xref>) byte-stable. A future revision may make one rendering mandatory.</td>
</tr>

<tr>
<td><tt>signature</tt></td>
<td>string (hex)</td>
<td>yes</td>
<td>Ed25519 signature by the Agent over the canonical payload (<xref target="canonical-payload-and-signing"></xref>), encoded as exactly 128 lowercase hexadecimal characters (a 64-byte signature).</td>
</tr>

<tr>
<td><tt>callerSignature</tt></td>
<td>string (hex)</td>
<td>recommended</td>
<td>Ed25519 signature by the Caller over the same canonical payload, same encoding as <tt>signature</tt>. See <xref target="signingdelegate-pattern"></xref>.</td>
</tr>

<tr>
<td><tt>toolMetadata</tt></td>
<td>object</td>
<td>optional</td>
<td>Tool-class or capability hints. Format is deployment-defined; see <xref target="tool-metadata"></xref>. Never part of the canonical payload.</td>
</tr>
</tbody>
</table><t>A receipt SHOULD NOT carry top-level members other than those defined above. Consumers MUST treat any unknown top-level member as unauthenticated data: unknown members are not part of the canonical payload and are not covered by any signature. Producers MUST NOT rely on unknown members being preserved by intermediaries.</t>
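<t>As a non-normative illustration of the <tt>timestamp</tt> rendering recommended in the table above, the following Python sketch formats a completion time with exactly three fractional-second digits and the <tt>Z</tt> designator. The function name <tt>render_timestamp</tt> is hypothetical, and the sketch assumes the caller passes a timezone-aware <tt>datetime</tt>.</t>

<sourcecode type="python"><![CDATA[
```python
from datetime import datetime, timezone


def render_timestamp(moment: datetime) -> str:
    """Render an aware datetime as UTC with millisecond precision.

    Sub-millisecond digits are truncated, not rounded. A naive
    datetime would be interpreted as local time by astimezone().
    """
    utc = moment.astimezone(timezone.utc)
    text = utc.isoformat(timespec="milliseconds")
    # isoformat() renders UTC as "+00:00"; the profile prefers "Z".
    return text.replace("+00:00", "Z")
```
]]></sourcecode>
<t>Truncation is a choice made for this sketch only; this document does not say whether producers should truncate or round.</t>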

<section anchor="example"><name>Example</name>
<t>The following receipt is complete and verifiable: both signatures validate against the test keys published with the conformance vectors (<xref target="test-vectors"></xref>), and every hash is reproducible from the preimages given there. The task input is the JSON object <tt>{&quot;text&quot;: &quot;hello&quot;, &quot;target&quot;: &quot;ja&quot;}</tt> and the output is a five-character Japanese text string whose exact value is given in the <tt>unicode_string_raw_utf8</tt> conformance vector.</t>

<sourcecode type="json"><![CDATA[{
  "formatVersion": "1",
  "agentDid": "did:web:translator.example",
  "callerDid": "did:web:orchestrator.example",
  "toolName": "translate",
  "taskHash":
"a1f15dbb98240bfcd2ae4e21497f0fc011e99397929d2836bff327ff09254103",
  "resultHash":
"125aeadf27b0459b8760c13a3d80912dfa8a81a68261906f60d87f4a0268646c",
  "success": true,
  "latencyMs": 142,
  "failureType": "",
  "timestamp": "2026-07-02T01:23:45.678Z",
  "signature": "1fbf11f917c6404db0476f4542f981b0bd00f3391ffa2a
859f1f700b520228ac13f6d1b8a09394012290830c908d7b7906c4e3498d9dc4
1f2badb8bc7de97905",
  "callerSignature": "5dadd87738832b65557072e15877ace0abe74f9c21
ea7c32cc64b838e4379f6e15f974ff05aa65f4adcc320c01a4b0d4dc72bc154f
cc12af7b696b485a51c40c"
}
]]></sourcecode>
<t>(The two signature values are single hex strings of 128 characters each; they are wrapped here for line-length reasons only, and the line breaks are not part of the values.)</t>
</section>
</section>

<section anchor="canonical-payload-and-signing"><name>Canonical Payload and Signing</name>

<section anchor="canonical-payload"><name>Canonical Payload</name>
<t>The signed payload is the JSON object containing the following fields, in this order after lexicographic sorting per <xref target="RFC8785"></xref>:</t>

<artwork><![CDATA[agentDid, callerDid, failureType, formatVersion, latencyMs,
resultHash, success, taskHash, timestamp, toolName
]]></artwork>
<t>For receipts conforming to this revision, <tt>formatVersion</tt> MUST be present with the value <tt>&quot;1&quot;</tt> and is part of the signed payload. For legacy receipts (<xref target="format-versioning"></xref>) the payload contains the nine remaining fields only.</t>
<t>The <tt>signature</tt>, <tt>callerSignature</tt>, and <tt>toolMetadata</tt> fields are excluded from the canonical payload. Implementations producing receipts MUST canonicalize using JCS as defined in <xref target="RFC8785"></xref>.</t>
<t>Verifiers MUST recompute the canonical payload from the receipt's field values exactly as received. In particular, verifiers MUST NOT apply Unicode normalization (NFC, NFD, NFKC, or NFKD) or any other transformation to field values before canonicalization; a storage or transport layer that normalizes strings will render legitimately signed receipts unverifiable. Producers SHOULD emit NFC-normalized strings. Compatibility normalization (NFKC/NFKD) MUST NOT be applied at any layer: it can fold visually or semantically distinct identifiers (for example, two different <tt>toolName</tt> values) into one.</t>
<t>The <tt>taskHash</tt> and <tt>resultHash</tt> fields MUST contain a SHA-256 digest as defined in <xref target="RFC6234"></xref>, encoded as exactly 64 lowercase hexadecimal characters. Because a receipt carries no hash-algorithm identifier, verifiers MUST interpret both fields as SHA-256 in this version. Supporting additional hash algorithms would require a future revision of the <tt>formatVersion</tt> mechanism (<xref target="format-versioning"></xref>).</t>
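<t>As a non-normative illustration, the payload construction described in this section can be sketched in Python. The sketch assumes that <tt>json.dumps</tt> with sorted keys and compact separators produces the same bytes as JCS for this particular payload, which holds only because every member is a string, a boolean, or an integer and all member names are ASCII; it is not a general JCS implementation. The names <tt>PAYLOAD_FIELDS</tt> and <tt>canonical_payload</tt> are hypothetical.</t>

<sourcecode type="python"><![CDATA[
```python
import json

# Members of the signed payload for formatVersion "1".
PAYLOAD_FIELDS = (
    "agentDid", "callerDid", "failureType", "formatVersion",
    "latencyMs", "resultHash", "success", "taskHash",
    "timestamp", "toolName",
)


def canonical_payload(receipt: dict) -> bytes:
    """Return the UTF-8 bytes that both signatures cover.

    Assumption: json.dumps with sorted keys and compact separators
    equals the RFC 8785 output here because every payload value is
    a string, a boolean, or an integer. latencyMs must already be
    an int (a float such as 142.0 would serialize differently).
    """
    # signature, callerSignature, toolMetadata and unknown members
    # are excluded simply by not being listed in PAYLOAD_FIELDS.
    payload = {name: receipt[name] for name in PAYLOAD_FIELDS}
    text = json.dumps(payload, sort_keys=True,
                      separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")
```
]]></sourcecode>
<t>String values are copied through unchanged, in line with the rule that verifiers apply no Unicode normalization before canonicalization. A production implementation would more safely rely on a dedicated JCS library.</t>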
</section>

<section anchor="signing-algorithm"><name>Signing Algorithm</name>
<t>Signatures are computed using Ed25519, as defined in <xref target="RFC8032"></xref>. The signature input is the UTF-8 encoding of the canonical JSON string produced in the previous subsection.</t>
<t>The <tt>signature</tt> field is the Executor's Ed25519 signature, encoded as a lowercase hexadecimal string of exactly 128 characters. The <tt>callerSignature</tt> field, when present, is the Caller's Ed25519 signature over the same canonical input, in the same encoding.</t>
</section>

<section anchor="verification"><name>Verification</name>
<t>A verifier MUST:</t>

<ol spacing="compact">
<li>Recompute the canonical payload from the receipt's fields per <xref target="canonical-payload"></xref>.</li>
<li>Resolve <tt>agentDid</tt> to its current public key per <xref target="DID-CORE"></xref>.</li>
<li>Verify <tt>signature</tt> against the canonical payload using the Agent's public key.</li>
<li>If <tt>callerSignature</tt> is present, resolve <tt>callerDid</tt> similarly and verify <tt>callerSignature</tt> against the same canonical payload.</li>
<li>Reject the receipt if any signature verification fails.</li>
</ol>
<t>Verification of <tt>formatVersion</tt> <tt>&quot;1&quot;</tt> receipts is additionally fail-closed with respect to the wire format: a verifier MUST reject a <tt>formatVersion</tt> <tt>&quot;1&quot;</tt> receipt whose <tt>taskHash</tt> or <tt>resultHash</tt> is not exactly 64 lowercase hexadecimal characters, or whose <tt>failureType</tt> is inconsistent with <tt>success</tt> (<xref target="failure-type-classification"></xref>) -- regardless of whether its signatures are valid. A cryptographically valid signature over a malformed record does not make the record conformant. Verifiers SHOULD likewise validate the remaining MUST-level constraints of <xref target="receipt-structure"></xref> through <xref target="failure-type-classification"></xref> and reject violations.</t>
<t>A verifier MAY additionally validate that <tt>timestamp</tt> is within a deployment-defined freshness window.</t>
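<t>The fail-closed wire-format checks above are independent of signature verification and can be sketched without any cryptographic library. The following non-normative Python sketch uses the hypothetical helper name <tt>wire_format_errors</tt>; it covers the hash encoding, the <tt>success</tt>/<tt>failureType</tt> consistency rule, and the signature encoding, and does not attempt the remaining constraints.</t>

<sourcecode type="python"><![CDATA[
```python
import re

HEX64 = re.compile(r"[0-9a-f]{64}")
HEX128 = re.compile(r"[0-9a-f]{128}")


def wire_format_errors(receipt: dict) -> list:
    """List wire-format violations of a formatVersion "1" receipt.

    An empty list means the checks below passed; signatures still
    have to be verified separately.
    """
    errors = []
    if receipt.get("formatVersion") != "1":
        errors.append("formatVersion is not \"1\"")
    for name in ("taskHash", "resultHash"):
        value = receipt.get(name)
        if not (isinstance(value, str) and HEX64.fullmatch(value)):
            errors.append(name + " is not 64 lowercase hex chars")
    success = receipt.get("success")
    failure_type = receipt.get("failureType")
    if not isinstance(success, bool):
        errors.append("success is not a boolean")
    elif not isinstance(failure_type, str):
        errors.append("failureType is not a string")
    elif success != (failure_type == ""):
        # success requires "", failure requires a non-empty value
        errors.append("failureType is inconsistent with success")
    # One of the remaining constraints verifiers SHOULD validate.
    signature = receipt.get("signature")
    if not (isinstance(signature, str)
            and HEX128.fullmatch(signature)):
        errors.append("signature is not 128 lowercase hex chars")
    return errors
```
]]></sourcecode>
<t>A verifier built this way would reject the receipt whenever the returned list is non-empty, regardless of whether the signatures validate.</t>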
</section>

<section anchor="format-versioning"><name>Format Versioning and Legacy Receipts</name>
<t>The <tt>formatVersion</tt> member identifies which revision's processing rules apply to a receipt. The only value defined by this document is <tt>&quot;1&quot;</tt>, denoting: full 64-character hashes, the preimage profile of <xref target="preimage-profile"></xref>, <tt>toolMetadata</tt> excluded from the signed payload, <tt>failureType</tt> always present as a string, and <tt>callerDid</tt> always present.</t>
<t>A receipt without a <tt>formatVersion</tt> member is a legacy receipt, produced under revisions <tt>-00</tt> through <tt>-02</tt> of this document. Verifiers MAY accept legacy receipts by reconstructing the nine-field canonical payload (<xref target="canonical-payload"></xref>) and applying lenient handling; deployments SHOULD record which regime a stored receipt was accepted under. Two known properties of legacy receipts deserve care:</t>

<ul spacing="compact">
<li>Some legacy producers truncated <tt>taskHash</tt>/<tt>resultHash</tt> to 16 hexadecimal characters. A 64-bit digest is collision-findable with roughly 2^32 work; producers MUST NOT emit truncated hashes, and consumers SHOULD NOT rely on them as commitments.</li>
<li>Some legacy producers incorrectly included <tt>toolMetadata</tt> in the signed payload, contrary to the canonical-payload rule of every revision that defined it. Such receipts fail verification under this revision's payload rule.</li>
</ul>
<t>Because <tt>formatVersion</tt> is part of the signed payload, a receipt's version claim cannot be altered without invalidating its signatures. Verifiers encountering an unknown <tt>formatVersion</tt> value MUST reject the receipt.</t>
</section>

<section anchor="preimage-profile"><name>Hash Preimage Profile</name>
<t><tt>formatVersion</tt> <tt>&quot;1&quot;</tt> producers MUST compute <tt>taskHash</tt> and <tt>resultHash</tt> over the following preimages. This profile makes hashes of semantically identical values byte-identical across independent producers, which is what enables cross-deployment comparison and later disclosure verification.</t>

<dl spacing="compact">
<dt>Text (a string value):</dt>
<dd>The raw UTF-8 content bytes of the string itself -- not its JSON-string serialization. <tt>&quot;hello&quot;</tt> hashes the 5 bytes <tt>hello</tt> (digest <tt>2cf24dba...</tt>); a format that hashed the 7-byte JSON form (with quotation marks) would produce <tt>5aa762ae...</tt> instead.</dd>
<dt>Raw binary content:</dt>
<dd>Its raw bytes, directly. (Together with the text rule: scalar content hashes its content bytes.)</dd>
<dt>Structured JSON value (object, array, number, boolean):</dt>
<dd>The UTF-8 bytes of its JCS <xref target="RFC8785"></xref> canonical form. Member order therefore cannot change the hash.</dd>
<dt>Absent (no input; no output, e.g. on failure) or JSON <tt>null</tt>:</dt>
<dd>The empty byte string. SHA-256 of the empty string is the empty-input sentinel:
<tt>e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855</tt>.</dd>
</dl>
<t>Two consequences are worth stating explicitly. First, hashing a structured value requires the same type-level discipline as the canonical payload itself: values must be JSON-representable and serialized by JCS, not by an implementation's default serializer. Second, the sentinel makes &quot;no output&quot; a first-class, verifiable state -- a failed call with no output has a well-defined <tt>resultHash</tt> rather than an empty or producer-invented value.</t>
<t>Semantic equivalence across different profiles (for example, comparing a hash produced under this profile with one produced by a format that hashes JSON-string serializations) remains out of scope; the <tt>formatVersion</tt> mechanism exists so that such profile changes are explicit rather than silent.</t>
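<t>The four preimage cases of this section can be sketched as a single Python function. This is a non-normative illustration: the name <tt>preimage_hash</tt> is hypothetical, and the structured-value branch only approximates JCS with <tt>json.dumps</tt>, which is assumed to match for values with ASCII member names and no non-integer numbers.</t>

<sourcecode type="python"><![CDATA[
```python
import hashlib
import json


def preimage_hash(value=None) -> str:
    """Hash a task input or output under the "1" preimage profile."""
    if value is None:
        # Absent or JSON null: the empty byte string.
        data = b""
    elif isinstance(value, str):
        # Text: raw UTF-8 content bytes, without JSON quotes.
        data = value.encode("utf-8")
    elif isinstance(value, (bytes, bytearray)):
        # Raw binary content: the bytes themselves.
        data = bytes(value)
    else:
        # Structured value. Assumption: this json.dumps call stands
        # in for a real JCS serializer and is not exact for floats
        # or for member names outside ASCII.
        text = json.dumps(value, sort_keys=True,
                          separators=(",", ":"), ensure_ascii=False)
        data = text.encode("utf-8")
    return hashlib.sha256(data).hexdigest()
```
]]></sourcecode>
<t>With this sketch, <tt>preimage_hash(&quot;hello&quot;)</tt> yields the digest beginning <tt>2cf24dba</tt>, <tt>preimage_hash(None)</tt> yields the empty-input sentinel, and two objects that differ only in member order yield the same digest.</t>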
</section>
</section>

<section anchor="signingdelegate-pattern"><name>SigningDelegate Pattern (Caller Co-signature)</name>
<t>To produce a co-signed receipt, a Caller MUST NOT transmit private key material to the Executor. Instead, the Caller exposes a SigningDelegate interface:</t>

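<sourcecode type="typescript"><![CDATA[type DIDString = string  // assumed alias: a DID string
type HexString = string  // assumed alias: lowercase hex

interface SigningDelegate {
  did: DIDString  // the Caller's DID, i.e. callerDid
  // Signs the canonical payload string; resolves to the signature.
  sign(payload: string): Promise<HexString>
}
]]></sourcecode>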
<t>The Executor sends the canonical payload string to the Caller's <tt>sign</tt> method and receives the signature. The private key never leaves the Caller's process boundary.</t>
<t>When the Caller and Executor are not co-located, the transport carrying canonical payloads to the Caller MUST use TLS or an equivalent confidentiality and integrity layer.</t>
<t>A Caller MAY decline to sign -- for example, if the Caller does not consent to the receipt's contents. In that case the Executor publishes the receipt with only its own <tt>signature</tt> and no <tt>callerSignature</tt>. Such receipts remain syntactically valid; consumers may weight them differently as a matter of deployment policy.</t>
<t>Before signing, a conforming Caller SHOULD at minimum parse the canonical payload and confirm that <tt>callerDid</tt> identifies itself and that the payload corresponds to a delegation it actually issued (for example, by matching <tt>toolName</tt> and recomputing <tt>taskHash</tt> from the input it delegated). A Caller that signs whatever it is handed adds attribution but no verification; <xref target="co-signature-trust-boundary"></xref> discusses the resulting trust boundary.</t>
<t>By producing <tt>callerSignature</tt>, a conforming Caller attests that it accepts the canonical record as representing the identified Caller-to-Agent delegation. This attestation does not, by itself, imply that the Caller independently recomputed <tt>taskHash</tt>, observed the result committed to by <tt>resultHash</tt>, or validated <tt>success</tt>, <tt>latencyMs</tt>, or <tt>failureType</tt>, unless a deployment defines and enforces those checks. The Security Considerations discuss the resulting trust boundary.</t>
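<t>The minimum pre-signing checks recommended above can be sketched as follows. This non-normative Python sketch stops before the signing step itself. The function name <tt>caller_accepts</tt> and the <tt>issued</tt> record (a mapping from <tt>toolName</tt> to the set of <tt>taskHash</tt> values the Caller computed for its own delegations) are assumptions introduced for illustration.</t>

<sourcecode type="python"><![CDATA[
```python
import json


def caller_accepts(payload_text: str, own_did: str,
                   issued: dict) -> bool:
    """Decide whether a Caller should sign a canonical payload.

    issued is a hypothetical local record: toolName -> set of
    taskHash values for delegations this Caller actually made.
    """
    try:
        payload = json.loads(payload_text)
    except ValueError:
        return False
    if not isinstance(payload, dict):
        return False
    if payload.get("formatVersion") != "1":
        return False
    # The payload must name this Caller as the delegating party.
    if payload.get("callerDid") != own_did:
        return False
    tool = payload.get("toolName")
    task = payload.get("taskHash")
    if not (isinstance(tool, str) and isinstance(task, str)):
        return False
    # The payload must correspond to a delegation actually issued.
    return task in issued.get(tool, set())
```
]]></sourcecode>
<t>A Caller that returns false here would decline to sign, leaving the receipt with the Agent signature alone. Passing these checks says nothing about <tt>resultHash</tt>, <tt>success</tt>, or <tt>latencyMs</tt>, which the sketch does not examine.</t>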
</section>

<section anchor="failure-type-classification"><name>Failure Type Classification</name>
<t>When <tt>success</tt> is <tt>false</tt>, <tt>failureType</tt> MUST be either one of the following registry values or a deployment-defined extension value:</t>
<table>
<thead>
<tr>
<th>Value</th>
<th>Condition</th>
</tr>
</thead>

<tbody>
<tr>
<td><tt>timeout</tt></td>
<td>The call exceeded a deployment-defined latency bound (default RECOMMENDED: 30000 ms), or the underlying error was timeout-shaped.</td>
</tr>

<tr>
<td><tt>validation</tt></td>
<td>The call failed due to input or output validation (schema, parse, type mismatch).</td>
</tr>

<tr>
<td><tt>error</tt></td>
<td>All other failures. This is the registry catch-all.</td>
</tr>
</tbody>
</table><t>Receiving implementations MUST treat unknown <tt>failureType</tt> values as <tt>error</tt> for the purposes of any deployment-policy decision they make. Deployments defining extension values should note that extensions carry meaning only within that deployment.</t>
<t>When <tt>success</tt> is <tt>true</tt>, <tt>failureType</tt> MUST be the empty string <tt>&quot;&quot;</tt>. This is a deliberate choice over a null value: it keeps the canonical payload's value type stable (always string) so that JCS canonicalization (<xref target="canonical-payload"></xref>) produces a predictable byte sequence regardless of success state. A verifier that substitutes a null value for an empty <tt>failureType</tt> will compute a different canonical payload and will fail to verify legitimate receipts. A <tt>formatVersion</tt> <tt>&quot;1&quot;</tt> receipt with <tt>success</tt> <tt>true</tt> and a non-empty <tt>failureType</tt>, or <tt>success</tt> <tt>false</tt> and an empty <tt>failureType</tt>, is malformed and MUST be rejected (<xref target="verification"></xref>).</t>
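<t>The receiving-side rule for unknown values can be sketched in a few lines of Python. This is a non-normative illustration: the name <tt>policy_failure_type</tt> and the <tt>extensions</tt> parameter (the extension values a given deployment recognizes) are hypothetical, and the sketch assumes the receipt has already passed the consistency check described above.</t>

<sourcecode type="python"><![CDATA[
```python
REGISTRY_VALUES = frozenset({"timeout", "validation", "error"})


def policy_failure_type(receipt: dict,
                        extensions: frozenset = frozenset()) -> str:
    """Map failureType to the value used for policy decisions.

    Returns "" for successful receipts. Values that are neither in
    the registry nor in the deployment's own extension set are
    treated as "error". The receipt itself is never modified, so
    its signatures remain verifiable.
    """
    if receipt["success"]:
        return ""
    value = receipt["failureType"]
    if value in REGISTRY_VALUES or value in extensions:
        return value
    return "error"
```
]]></sourcecode>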
</section>

<section anchor="tool-metadata"><name>Tool Metadata (Optional)</name>
<t>A receipt MAY carry a <tt>toolMetadata</tt> object describing class or capability hints about the tool. This document does not standardize the schema of <tt>toolMetadata</tt>. A deployment may use it to convey:</t>

<ul spacing="compact">
<li>A tool class (e.g., advisory, data-retrieval, mutation, settlement).</li>
<li>A settlement layer identifier when the tool executes on-chain transactions.</li>
<li>A verifiability hint indicating whether the tool's outcome is externally anchored.</li>
</ul>
<t><tt>toolMetadata</tt> is NOT part of the canonical payload and is NOT signed. Consumers that wish to trust <tt>toolMetadata</tt> MUST validate it through out-of-band means (e.g., the tool's published manifest, signed separately).</t>
<t>A future revision of this document, or a companion document, MAY standardize a portion of the <tt>toolMetadata</tt> schema if interoperability needs emerge.</t>
</section>

<section anchor="identity-did-requirements"><name>Identity (DID) Requirements</name>
<t>Both <tt>agentDid</tt> and <tt>callerDid</tt> MUST be syntactically valid DIDs per <xref target="DID-CORE"></xref>. This document does not constrain the DID method. Common choices in production include <tt>did:key</tt>, <tt>did:web</tt>, and ledger-anchored methods such as <tt>did:xrpl</tt> or <tt>did:ethr</tt>.</t>
<t>A deployment MAY apply policy based on DID method -- for example, treating ledger-anchored identities differently from cryptographic-only identities. Such policy is out of scope for this document; the wire format treats all DID methods uniformly.</t>
</section>

<section anchor="security-considerations"><name>Security Considerations</name>

<section anchor="privacy"><name>Privacy</name>
<t>Receipts identify inputs and outputs by hash. Implementations MUST NOT include raw inputs, outputs, prompts, user data, secrets, or PII in any signed field. <tt>toolMetadata</tt>, while not part of the canonical payload, also SHOULD NOT contain such data.</t>
<t>Hash construction matters: a deployment that hashes uncanonicalized inputs produces different digests for semantically identical values and may leak information through correlation of serializer-specific output. The preimage profile (<xref target="preimage-profile"></xref>) removes serializer-level divergence; note that hashing does not make low-entropy or guessable values private, since a verifier holding a candidate value can always test it against the hash.</t>
</section>

<section anchor="replay"><name>Replay</name>
<t>A signed receipt is replayable by anyone who possesses it. Receivers SHOULD enforce a freshness window on <tt>timestamp</tt> and SHOULD reject duplicate receipts identified by <tt>(signature)</tt>. Note that signature-level deduplication identifies distinct records, not distinct executions: two executions whose canonical payload field values all coincide (same identities and tool, same hashes, same millisecond timestamp, same latency) produce the same canonical payload and -- Ed25519 being deterministic -- the same signature, and are therefore indistinguishable at this layer; see <xref target="known-limitations"></xref>. A deployment that needs cross-receipt deduplication MAY additionally store and dedupe by <tt>(agentDid, taskHash, timestamp)</tt>.</t>
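<t>A receiver-side freshness window combined with signature-level deduplication might look like the following non-normative Python sketch. The class name <tt>ReplayGuard</tt> and the ten-minute default window are assumptions; the sketch also assumes that <tt>timestamp</tt> carries the <tt>Z</tt> designator and that signatures have already been verified.</t>

<sourcecode type="python"><![CDATA[
```python
from datetime import datetime, timedelta, timezone


class ReplayGuard:
    """In-memory freshness window plus signature deduplication."""

    def __init__(self, max_age: timedelta = timedelta(minutes=10)):
        # Hypothetical default; the window is deployment policy.
        self.max_age = max_age
        self.seen = set()

    def accept(self, receipt: dict, now: datetime) -> bool:
        # fromisoformat() accepts "Z" only in newer Python versions,
        # so the designator is rewritten as an explicit offset.
        stamp = datetime.fromisoformat(
            receipt["timestamp"].replace("Z", "+00:00"))
        if abs(now - stamp) > self.max_age:
            return False  # outside the freshness window
        if receipt["signature"] in self.seen:
            return False  # same record presented again
        self.seen.add(receipt["signature"])
        return True
```
]]></sourcecode>
<t>The set of seen signatures grows without bound in this sketch; a deployment would presumably expire entries once they fall outside the freshness window. The <tt>now</tt> argument has to be timezone-aware, for example <tt>datetime.now(timezone.utc)</tt>.</t>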
</section>

<section anchor="caller-side-forgery"><name>Caller-Side Forgery</name>
<t>A receipt with only <tt>signature</tt> (Executor) and no <tt>callerSignature</tt> represents the Executor's claim alone. A malicious Executor could fabricate such receipts. Co-signature by the Caller makes such fabrication detectable rather than impossible: a Caller observing a forged receipt about its own delegations would notice the absence of its <tt>callerSignature</tt> and could repudiate the receipt.</t>
<t>When <tt>callerSignature</tt> is missing, a deployment SHOULD weight the receipt accordingly. The exact weighting is policy, but treating co-signed and non-co-signed receipts identically is a security mistake.</t>
</section>

<section anchor="single-observer-dominance"><name>Single-Observer Dominance</name>
<t>If a deployment derives reputation or trust signals from receipts and a single Caller produces most of the receipts about a given tool, that Caller's environment-specific bugs, biases, or hostile behavior propagate directly into the derived signal. This is a deployment-policy concern, not a wire-format concern. Deployments SHOULD record the set of distinct <tt>callerDid</tt> values contributing to any derived statistic so that consumers can reason about observer diversity.</t>
</section>

<section anchor="key-compromise"><name>Key Compromise</name>
<t>A compromised Agent or Caller key allows arbitrary receipt forgery for the lifetime of that key. Deployments using DID methods that support key rotation SHOULD rotate keys routinely. Verifiers MUST resolve DIDs to the current key set at verification time, not at receipt emission time.</t>
</section>

<section anchor="timestamp-trust"><name>Timestamp Trust</name>
<t><tt>timestamp</tt> is asserted by the Executor and is not independently anchored by this format. A deployment that requires verifiable time SHOULD pair receipts with an external time-anchoring mechanism (<xref target="RFC3161"></xref>, blockchain inclusion, etc.).</t>
</section>

<section anchor="co-signature-trust-boundary"><name>Co-signature Trust Boundary</name>
<t>A caller co-signature adds a second attributable assertion over the same canonical record. It represents evidence from an operationally independent source only to the extent that the Caller's signing authority is operationally separated from the Executor's authority. If the Caller and Executor share a process, operator, or key-management boundary, two signatures do not establish two independent observers.</t>
<t>Co-signature does not prevent collusion. A Caller and Executor acting together can produce a co-signed receipt that describes an execution inaccurately or describes an execution that did not occur. Co-signature binds both identified parties to the same canonical record; it does not establish that the record is true.</t>
</section>
</section>

<section anchor="known-limitations"><name>Known Limitations</name>
<t>The following are known limitations of this format. They are recorded so that implementers and reviewers can account for them; a future revision or companion document may address them.</t>

<ul spacing="compact">
<li>Key rotation and historical verification. A receipt carries no verification-method identifier. When a DID document exposes multiple applicable verification methods, a verifier may need to try multiple candidate keys, and the receipt provides no stable reference to the key used at issuance. If that key is later removed from the DID document, a legitimately issued receipt can become unverifiable. A deployment that needs long-lived verifiability must retain historical key material or pin keys out of band. Carrying an explicit key identifier in the receipt is a candidate for a future revision.</li>
<li>Tool identity. Only the opaque <tt>toolName</tt> is part of the signed payload; <tt>toolMetadata</tt> is not signed. Two implementations, providers, or versions that share a <tt>toolName</tt> are not distinguishable from the receipt alone. Binding a stable tool identifier, a provider or endpoint identity, or a version or manifest digest into the signed payload is a candidate for a future revision.</li>
<li>Receipt uniqueness. The format carries no nonce or per-receipt identifier. Two executions whose field values coincide exactly are represented by byte-identical receipts (<xref target="replay"></xref>) and cannot be distinguished or counted separately. Deployments for which double-counting or under-counting such coincidences matters must add an out-of-band identifier; a nonce field is a candidate for a future revision.</li>
<li>Cross-profile hash comparison. <xref target="preimage-profile"></xref> pins one preimage profile for <tt>formatVersion</tt> <tt>&quot;1&quot;</tt>, which makes hashes comparable across producers within this format. Comparing hashes against records produced under a different profile (including legacy truncated hashes, and formats that hash JSON-string serializations) remains undefined and requires out-of-band agreement.</li>
</ul>
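<t>As an informal illustration of the receipt-uniqueness limitation, the following sketch shows one way a deployment might attach an out-of-band identifier so that byte-identical receipts can still be counted separately. The envelope shape, the <tt>executionId</tt> name, and both function names are hypothetical and are not part of this format. Because the identifier sits outside the receipt, neither signature covers it.</t>

<sourcecode type="python"><![CDATA[
```python
import hashlib
import uuid
from collections import Counter
from typing import Iterable, Optional


def wrap_receipt(receipt_bytes: bytes, execution_id: Optional[str] = None) -> dict:
    """Pair a serialized receipt with a hypothetical out-of-band identifier.

    The identifier is NOT part of the signed payload; it is deployment-local.
    """
    return {
        "executionId": execution_id or str(uuid.uuid4()),
        "receipt": receipt_bytes,
    }


def count_executions(envelopes: Iterable[dict]) -> Counter:
    """Count executions per distinct receipt, keyed by SHA-256 of its bytes.

    An envelope whose executionId was already seen is treated as a redelivery
    and skipped; byte-identical receipts with different ids count separately.
    """
    seen_ids = set()
    counts = Counter()
    for envelope in envelopes:
        if envelope["executionId"] in seen_ids:
            continue
        seen_ids.add(envelope["executionId"])
        digest = hashlib.sha256(envelope["receipt"]).hexdigest()
        counts[digest] += 1
    return counts
```
]]></sourcecode>
<t>In this sketch, two envelopes that carry the same receipt bytes under different identifiers count as two executions, while a repeated envelope counts once. Without the identifier, the receipt alone does not support that distinction.</t>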
</section>

<section anchor="iana-considerations"><name>IANA Considerations</name>
<t>This document has no IANA actions in its current form. A future revision may register a media type (e.g., <tt>application/xaip-receipt+json</tt>) and a <tt>failureType</tt> registry.</t>
</section>

</middle>

<back>
<references><name>References</name>
<references><name>Normative References</name>
<reference anchor="DID-CORE" target="https://www.w3.org/TR/did-core/">
  <front>
    <title>Decentralized Identifiers (DIDs) v1.0</title>
    <author fullname="Manu Sporny" initials="M." surname="Sporny">
      <organization>W3C</organization>
    </author>
    <date year="2022" month="July"></date>
  </front>
  <seriesInfo name="W3C" value="Recommendation"></seriesInfo>
</reference>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6234.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8032.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8785.xml"/>
</references>
<references><name>Informative References</name>
<xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3161.xml"/>
<reference anchor="XAIP-CASE-2026-05" target="https://github.com/xkumakichi/xaip-protocol/blob/main/docs/case-study/single-caller-dominance.md">
  <front>
    <title>XAIP Single-Caller Dominance Case Study</title>
    <author>
      <organization>xkumakichi</organization>
    </author>
    <date year="2026" month="May"></date>
  </front>
</reference>
<reference anchor="XAIP-IMPL" target="https://github.com/xkumakichi/xaip-protocol">
  <front>
    <title>XAIP Protocol Reference Implementation</title>
    <author>
      <organization>xkumakichi</organization>
    </author>
    <date year="2026"></date>
  </front>
</reference>
<reference anchor="XAIP-VECTORS" target="https://github.com/xkumakichi/xaip-protocol/tree/main/docs/spec/test-vectors">
  <front>
    <title>XAIP Receipts Conformance Test Vectors (formatVersion 1)</title>
    <author>
      <organization>xkumakichi</organization>
    </author>
    <date year="2026" month="July"></date>
  </front>
</reference>
</references>
</references>

<section anchor="relationship-to-the-xaip-reference-implementation"><name>Relationship to the XAIP Reference Implementation</name>
<t>The XAIP reference implementation <xref target="XAIP-IMPL"></xref> wraps this wire format with an aggregator, a Bayesian trust score, a class-aware risk-flag evaluator, and a decision engine that ranks candidate tools. None of those components are required to produce or consume receipts conformant to this document. A consumer that only wants to verify and store receipts does not need to import any of them.</t>
<t>A consumer that wants a turnkey aggregator and scoring layer may use the reference implementation. A consumer that disagrees with any of those design choices is free to substitute its own implementation while remaining interoperable at the receipt-format layer.</t>
<t>As of this revision, the reference implementation (SDK middleware, four client emitters, and the aggregation service) produces <tt>formatVersion</tt> <tt>&quot;1&quot;</tt> receipts and validates them fail-closed, and its behavior is locked to the conformance vectors of <xref target="test-vectors"></xref>.</t>
<t>The single-observer dominance failure mode discussed earlier in this document was first surfaced in the public dataset of that reference implementation <xref target="XAIP-CASE-2026-05"></xref>.</t>
</section>

<section anchor="adoption-path-for-agent-payment-protocols"><name>Adoption Path for Agent-Payment Protocols</name>
<t>This format is intended to be useful to agent-payment protocols (for example, agent-to-agent payment protocols, agent-mediated commerce protocols, and agent escrow systems) that need a &quot;trust precondition&quot; check before committing to a transaction. Such a protocol can:</t>

<ol spacing="compact">
<li>Require that an Agent present a set of recent receipts before being allowed to initiate a payment.</li>
<li>Define its own scoring policy over the receipt set, or consult an external scoring service.</li>
<li>Require that receipts presented for transactions above a certain value include <tt>callerSignature</tt> (that is, be co-signed).</li>
<li>Require that receipts for <tt>settlement</tt>-class tools (declared via <tt>toolMetadata</tt>, which is unsigned and therefore validated out of band per <xref target="tool-metadata"></xref>) be additionally anchored to an external ledger.</li>
</ol>
<t>Each of those is a policy decision local to the agent-payment protocol. This document only defines the receipt wire format; it does not define a payment mechanism, a settlement rail, or any value-transfer system.</t>
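<t>As an informal illustration, the first and third checks above might be combined as in the following sketch. It assumes that the receipts have already been verified and filtered for recency by the consuming protocol. The policy class, its field names, and its default values are hypothetical; only the <tt>callerSignature</tt> member name comes from this document.</t>

<sourcecode type="python"><![CDATA[
```python
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass(frozen=True)
class PreconditionPolicy:
    """Hypothetical policy knobs; names and defaults are illustrative only."""
    min_receipts: int = 3             # receipts an Agent must present
    cosign_above_value: float = 100.0  # above this, co-signature is required


def precondition_met(
    receipts: Sequence[Mapping[str, object]],
    transaction_value: float,
    policy: PreconditionPolicy = PreconditionPolicy(),
) -> bool:
    """Return True if the presented receipt set satisfies the local policy.

    Assumes every receipt was already signature-verified and is recent.
    """
    # Check 1: a minimum number of receipts must be presented.
    if len(receipts) < policy.min_receipts:
        return False
    # Check 3: high-value transactions require co-signed receipts.
    if transaction_value > policy.cosign_above_value:
        return all(bool(r.get("callerSignature")) for r in receipts)
    return True
```
]]></sourcecode>
<t>A protocol that scores receipts or anchors them to an external ledger would add those checks in the same place; they are omitted here because they depend on services outside this format.</t>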
</section>

<section anchor="relationship-to-adjacent-receipt-and-audit-formats"><name>Relationship to Adjacent Receipt and Audit Formats</name>
<t>Other efforts also record evidence about AI agent actions. They differ from this document mainly in two dimensions: the unit of evidence (a single call versus a whole session) and which party attests to the record. This appendix is informative and describes that adjacent work for context; this document does not depend on any of these formats.</t>
<t>Session-level integrity bundles record an entire agent session as a single hash-chained, signed archive, answering post-hoc audit questions about a completed workflow. A per-call receipt as defined here and a session-level bundle are composable: a bundle may embed or reference per-call receipts, and a per-call receipt may carry a reference to the session in which it occurred.</t>
<t>Receiver-attested receipts are signed by the service that received the call and produced the result; obtaining such an attestation requires the invoked service to participate in the receipt protocol. The caller co-signature model in this document (the SigningDelegate pattern, <xref target="signingdelegate-pattern"></xref>) instead lets the party that delegated the work sign the same canonical per-call record as the executing Agent. When both signatures are present, the record is attributable to both parties to the Caller-to-Agent delegation without requiring the invoked Tool or service to sign. A further pattern places the signature on a mediator at the communication boundary between the Agent and the tool. These approaches represent different observation and control boundaries: a caller co-signature, in particular, does not imply that the Caller independently observed every field, and its evidentiary value depends in part on the operational separation between the Caller and the Agent.</t>
<t>In that space, this document occupies the per-call point at which the executing Agent and the delegating Caller can co-attest the same record. It does not subsume executor-only assertion, receiver attestation, mediator attestation, or session-level bundles; a single deployment may combine more than one of them.</t>
</section>

<section anchor="test-vectors"><name>Conformance Test Vectors</name>
<t>Executable test vectors for this revision are published in the reference repository <xref target="XAIP-VECTORS"></xref>:</t>

<ul spacing="compact">
<li><tt>docs/spec/test-vectors/receipts-v1-vectors.json</tt> -- the vectors. Every hash, canonical payload byte string, and Ed25519 signature is real; test keys are embedded (and marked as such).</li>
<li><tt>docs/spec/test-vectors/check.mjs</tt> -- a dependency-free checker (Node.js 18 or later, standard library only) that re-derives every value from the vectors file and verifies all signatures, failing loudly on any drift.</li>
</ul>
<t>The vectors pin, as executable bytes:</t>

<ul spacing="compact">
<li>the preimage profile of <xref target="preimage-profile"></xref> (raw-UTF-8 text hashing with the JSON-form value shown as the explicit wrong answer; JCS member-order invariance; the empty-input sentinel);</li>
<li>canonical payload construction with and without <tt>formatVersion</tt>, including the byte-identity of payloads with and without <tt>toolMetadata</tt>;</li>
<li>a complete co-signed receipt (the example in <xref target="receipt-structure"></xref>), a failure receipt using the sentinel, and a legacy receipt;</li>
<li>a tamper case whose signatures must fail; and</li>
<li>the fail-closed rejections of <xref target="verification"></xref> (truncated hash, uppercase hash, <tt>failureType</tt> inconsistent with <tt>success</tt>).</li>
</ul>
<t>The vectors are the executable form of this document's processing rules: an implementation that reproduces them byte-for-byte conforms to the hashing and canonicalization layers of this revision, without needing to interpret prose. The example in <xref target="receipt-structure"></xref> and the vectors are generated from the same values, so the example cannot silently drift from the specification.</t>
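<t>As an informal illustration of the kind of check the vectors exercise, the following sketch tests the hash and signature encodings (exactly 64 and exactly 128 lowercase hexadecimal characters, respectively) and contrasts raw-UTF-8 text hashing with the JSON-string form that the vectors mark as the wrong answer. The function names are hypothetical, the sketch covers none of the other processing rules, and it is not a substitute for the published checker; <xref target="preimage-profile"></xref> and <xref target="verification"></xref> remain the normative definitions.</t>

<sourcecode type="python"><![CDATA[
```python
import hashlib
import json
import re

# Encodings pinned by this revision: lowercase hex, exact lengths.
HASH_HEX = re.compile(r"[0-9a-f]{64}")
SIGNATURE_HEX = re.compile(r"[0-9a-f]{128}")


def is_valid_hash(value: object) -> bool:
    """Fail closed: anything but 64 lowercase hex characters is rejected."""
    return isinstance(value, str) and HASH_HEX.fullmatch(value) is not None


def is_valid_signature(value: object) -> bool:
    """Fail closed: anything but 128 lowercase hex characters is rejected."""
    return isinstance(value, str) and SIGNATURE_HEX.fullmatch(value) is not None


def hash_text(text: str) -> str:
    """Hash scalar text content as its raw UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def hash_json_string_form(text: str) -> str:
    """The wrong answer: hash the JSON-string serialization, quotes included."""
    return hashlib.sha256(json.dumps(text).encode("utf-8")).hexdigest()
```
]]></sourcecode>
<t>For the same input text, <tt>hash_text</tt> and <tt>hash_json_string_form</tt> produce different digests, which is why the preimage profile has to be pinned for hashes to be comparable across producers.</t>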
</section>

<section anchor="change-log"><name>Change Log</name>

<ul spacing="compact">
<li><tt>-03</tt> (2026-07-02): Versioned the wire format and made it mechanically checkable. Added the <tt>formatVersion</tt> member (value <tt>&quot;1&quot;</tt>), carried inside the signed payload, and defined legacy-receipt handling. Pinned the hash preimage profile -- scalar content hashes its content bytes, structured JSON hashes its JCS form, absent values hash the empty-input sentinel -- narrowing the -02 known limitation on preimage encoding. Made verification of <tt>formatVersion</tt> <tt>&quot;1&quot;</tt> receipts fail-closed for hash format and <tt>failureType</tt> consistency. Pinned: no Unicode normalization at verification, NFKC/NFKD prohibited at any layer, NFC recommended for producers; signature encoding exactly 128 lowercase hex characters; <tt>latencyMs</tt> range [0, 2^53 - 1]; a single RECOMMENDED RFC 3339 rendering for <tt>timestamp</tt>; unknown top-level members are unauthenticated. Resolved the -02 contradiction between the closed <tt>failureType</tt> value set and deployment extensions, and required rejection of <tt>success</tt>/<tt>failureType</tt>-inconsistent receipts. Required the empty-input sentinel for failure receipts committing to no output. Added a SHOULD-level minimum check list for Callers before co-signing. Corrected the replay-section uniqueness claim and added the receipt-uniqueness (no nonce) known limitation; removed the format-versioning known limitation (resolved by <tt>formatVersion</tt>). Replaced the example receipt -- whose hash values did not conform to this document's own length requirement in -00 through -02 -- with a complete receipt that verifies against published test keys. Added executable conformance test vectors and a dependency-free checker. No changes to the signing algorithm, canonicalization scheme, or trust-boundary language of -02.</li>
<li><tt>-02</tt> (2026-06-19): Scoped to define precisely what a caller co-signature establishes and what it does not. The Abstract no longer states that co-signature prevents unilateral fabrication outright; it now states that two valid signatures bind the identified Caller and Agent to the same canonical record, and that this does not establish independent observation of every field, execution correctness, or absence of collusion. Clarified caller-signature semantics, separated attribution from operational independence in the Security Considerations, scoped the adjacent-formats appendix to the Caller-to-Agent delegation, and named the executor / delegator / receiver / mediator / session-bundle attestation axis. Softened the privacy design principle to a content-minimizing claim. Normative clarification: <tt>taskHash</tt> and <tt>resultHash</tt> MUST be computed with SHA-256 and encoded as exactly 64 lowercase hexadecimal characters; this narrows the -00/-01 wording, which only RECOMMENDED SHA-256. Added a Known Limitations section. No wire fields were added or changed.</li>
<li><tt>-01</tt> (2026-06-19): Foregrounded caller co-signature (mutual attestation) as a distinguishing property of the format in the Abstract, and added an informative appendix relating per-call, caller-co-signed receipts to session-level integrity bundles and receiver-attested receipts. No wire-format or normative changes; receipts valid under -00 remain valid under -01.</li>
<li><tt>-00</tt> (2026-05-22): Initial individual draft. Split out from the XAIP reference implementation specification, focused on the receipt wire format only. Removed aggregator, scoring, and decision-engine content; left those to deployment policy.</li>
</ul>
</section>

</back>

</rfc>
