Internet-Draft F. Batum Intended status: Standards Track April 6, 2026 Expires: October 8, 2026 AI Discovery and Retrieval Endpoint (AIDRE) draft-batum-aidre-00 Abstract This document specifies the AI Discovery and Retrieval Endpoint (AIDRE), a protocol for publishing machine-oriented, canonical, and semantically retrievable content on the web. AIDRE defines a discovery document, collection metadata, retrieval interfaces, optional vector-native query support, and content representation rules for AI systems. AIDRE aims to reduce redundant crawling, parsing, tokenization, and embedding of the same origin content while improving freshness, provenance, and interoperability for AI systems. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on October 8, 2026. Copyright Notice Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Table of Contents 1. Introduction 2. Conventions and Terminology 3. Discovery 4. Collections 5. Query Model 6. Search Endpoint 7. Results 8. Chunk Retrieval 9. Caching and Freshness 10. Error Handling 11. Security Considerations 12. Privacy Considerations 13. IANA Considerations 14. Media Types 15. HTTP Usage 16. Request and Response Envelope 17. Error Model 18. Pagination and Limits 19. Embedding Compatibility Rules 20. Representation Negotiation 21. Rate Limiting 22. Schema and OpenAPI Conformance 23. Versioning and Extensibility 24. HTTP Exchange Examples 25. ABNF Summary 26. References 27. Additional Design Considerations Author's Address 1. Introduction Modern AI retrieval systems repeatedly process the same web content by crawling, extracting, chunking, and embedding text. This results in redundant computation, increased costs, and inconsistent interpretations. AIDRE introduces a standardized mechanism for origins to expose canonical, machine-oriented retrieval interfaces, enabling direct semantic access to content. 2. Conventions and Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here. 2.1. Terms Origin: The web origin publishing AIDRE metadata. Discovery Document: A JSON document describing endpoints and capabilities. Collection: A logical grouping of content. Chunk: A retrievable unit of canonical content. Embedding Space: A defined vector representation space. 3. Discovery 3.1. Well-Known URI An origin implementing AIDRE MUST expose the discovery document at /.well-known/ai-discovery over HTTPS using a well-known URI as defined in [RFC8615]. 3.2. Media Type The discovery document MUST be served as application/json. 3.3. DNS Considerations Clients MUST NOT require DNS SRV lookup for discovery. DNS HTTPS (SVCB) records MAY be used as optimization hints, as described in [RFC9460]. DNS SRV records are NOT RECOMMENDED for public interoperability. 3.4. Example { "version": "1", "service": "AIDRE", "endpoints": { "search": "https://ai.example.com/search", "collections": "https://ai.example.com/collections", "chunk": "https://ai.example.com/chunks/{id}" }, "capabilities": { "query_text": true, "query_vector": true } } 4. Collections Collections group retrievable content. Each collection SHOULD define: * name * description * visibility * updated_at 5. Query Model 5.1. General Rule A request MUST include exactly one of: * query * query_vector 5.2. Text Query Servers MUST support text queries. 5.3. Vector Query Servers SHOULD support vector queries. If used, clients MUST specify embedding_space. 6. Search Endpoint 6.1. Method Clients MUST use HTTP POST. 6.2. Request Example { "query": "sso setup", "collection": "docs", "top_k": 5 } 6.3. Vector Example { "query_vector": [0.01, -0.02], "embedding_space": "example-space", "collection": "docs" } 7. Results 7.1. Structure Responses MUST include a results array. 7.2. Example { "results": [ { "id": "doc_1#chunk_1", "score": 0.91, "metadata": { "updated_at": "2026-04-01T00:00:00Z", "canonical": true } } ] } 7.3. Optional Fields Servers MAY return: * text * semantic payload * vectors Servers SHOULD NOT return vectors by default. 8. Chunk Retrieval Servers SHOULD support: GET /chunks/{id} 9. Caching and Freshness Servers SHOULD expose: * content_hash * updated_at * etag 10. Error Handling Errors MUST be JSON objects. Example: { "error": "unsupported_embedding_space", "message": "Embedding space not supported" } 11. Security Considerations AIDRE can make machine-oriented retrieval of content more efficient. That efficiency introduces risks, including accelerated scraping, exfiltration of structured content, abuse of vector-return paths, stale or malicious publication, and misleading provenance. 11.1. Retrieval Amplification AIDRE can reduce the cost of large-scale content extraction for both legitimate and illegitimate clients. Deployments SHOULD consider rate limiting, authentication, authorization, abuse detection, and staged disclosure of higher-value representations. 11.2. Vector Disclosure and Leakage Returning vectors can create additional disclosure risk beyond returning text alone. Depending on the embedding model, vectors MAY expose distributional or structural properties of source content and MAY enable downstream correlation, approximate membership inference, or other forms of analysis not intended by the publisher. Accordingly: * Servers SHOULD NOT return vectors by default. * Servers SHOULD evaluate whether vector return is necessary for a given deployment. * Sensitive or access-controlled collections SHOULD disable vector return unless there is a specific operational need. * Deployments MAY use distinct policies for public and authenticated collections. This specification does not require or assume that vector inversion is practical in all settings, but deployments SHOULD treat vector disclosure as a potentially sensitive operation. 11.3. Malicious or Stale Canonical Content AIDRE gives publishers a machine-oriented canonical surface. If that surface is compromised, stale, poisoned, or maliciously altered, downstream systems MAY ingest or trust incorrect content at scale. Deployments SHOULD therefore consider content review workflows, publication approvals, deprecation semantics, rollback procedures, and freshness validation. 11.4. Signed Provenance Deployments MAY attach signed provenance metadata to responses, collections, or chunk resources. Signed provenance can help clients verify origin authenticity, content integrity, and publication scope. If signed provenance is used, deployments SHOULD define: * what is signed, * which keys are authoritative, * how keys are discovered, * signature lifetime and rotation policy, * the failure behavior when signature validation does not succeed. A deployment using signed provenance SHOULD ensure that key discovery is bound to the publisher's trust model, for example via HTTPS on the publisher origin or another integrity-protected mechanism. 11.5. Query Abuse Query inputs, including query vectors, can be adversarial. Servers SHOULD validate input size, dimensionality, and request shape, and SHOULD protect retrieval infrastructure against resource exhaustion. 11.6. Access-Controlled Collections AIDRE does not imply that all collections are public. Private or tenant-scoped collections MUST be protected by appropriate authorization controls. Discovery documents SHOULD avoid revealing unnecessary detail about protected collections. 12. Privacy Considerations Sensitive data SHOULD NOT be exposed unintentionally. 13. IANA Considerations This document does not itself register a new media type. However, this specification defines and uses the label application/aidre+json as the preferred representation for AIDRE request and response bodies. If AIDRE proceeds toward broader standardization and deployment, a future revision of this document or a companion specification SHOULD request IANA registration for that media type in accordance with the procedures applicable at that time. Until such registration occurs, implementations SHOULD treat application/aidre+json as a provisional media type for experimental and pre-standard deployment. 14. Media Types 14.1. AIDRE JSON Media Type AIDRE request and response bodies SHOULD use the media type application/aidre+json. Clients MUST send a Content-Type header of application/aidre+json when sending AIDRE request bodies. Clients SHOULD send an Accept header of application/aidre+json. Servers MAY accept application/json for compatibility, but a conformant implementation SHOULD prefer application/aidre+json. 14.2. Character Encoding AIDRE JSON payloads MUST use UTF-8 encoding. 15. HTTP Usage 15.1. Methods The discovery and collection resources MUST support HTTP GET. The search resource MUST support HTTP POST. The chunk dereference resource MUST support HTTP GET. 15.2. Request Headers Clients SHOULD send: * Accept: application/aidre+json Clients sending request bodies MUST send: * Content-Type: application/aidre+json 15.3. Response Headers Servers SHOULD include a Content-Type response header set to application/aidre+json for successful AIDRE responses. Servers MAY include ETag, Last-Modified, Cache-Control, and RateLimit fields where appropriate. 15.4. Status Codes Servers SHOULD use the following HTTP status codes consistently: * 200 OK: successful request. * 201 Created: resource created, if an extension defines creation. * 400 Bad Request: syntactically invalid request. * 401 Unauthorized: authentication required or failed. * 403 Forbidden: authenticated but not permitted. * 404 Not Found: resource identifier not found. * 409 Conflict: request conflicts with resource state. * 415 Unsupported Media Type: unsupported Content-Type. * 422 Unprocessable Entity: semantically invalid request, such as unsupported embedding dimensionality. * 429 Too Many Requests: rate limit exceeded. * 500 Internal Server Error: unexpected server error. * 503 Service Unavailable: temporary overload or maintenance. 16. Request and Response Envelope 16.1. General Response Shape A successful search response MUST be a JSON object containing a results member whose value is an array. A successful response SHOULD also contain: * request_id * collection * meta 16.2. Response Envelope Example { "request_id": "req_7f4c1c", "collection": "docs", "results": [ { "id": "doc_123#chunk_7", "score": 0.92, "metadata": { "updated_at": "2026-04-01T10:00:00Z", "canonical": true, "visibility": "public" } } ], "meta": { "returned": 1, "top_k": 3 } } 16.3. Request Identifiers Servers SHOULD generate a request identifier for each request. A request identifier SHOULD be unique within operationally relevant scope and SHOULD be suitable for debugging and audit correlation. 17. Error Model 17.1. Error Envelope Errors MUST be JSON objects. Error responses SHOULD contain: * error * message * details * request_id 17.2. Error Example { "error": "unsupported_embedding_space", "message": "The requested embedding space is not supported.", "details": { "embedding_space": "example:unsupported-space" }, "request_id": "req_7f4c1c" } 17.3. Error Codes Servers SHOULD use stable, machine-readable error values. Suggested values include: * invalid_request * unsupported_embedding_space * invalid_embedding_dimension * unsupported_return_field * unauthorized * forbidden * not_found * rate_limited * internal_error 18. Pagination and Limits 18.1. top_k Semantics The top_k member requests the maximum number of results returned. If top_k is omitted, the server MUST apply a documented default. Servers MUST reject non-positive top_k values. Servers MAY enforce an implementation-defined maximum top_k. If the client requests a top_k larger than the server maximum, the server MUST either reject the request with an error or clamp the value according to documented behavior. 18.2. Pagination Search endpoints MAY support pagination. If pagination is supported, the server SHOULD use an opaque cursor scheme. The client MUST treat cursor values as opaque. 18.3. Pagination Envelope Example { "request_id": "req_9aa1", "collection": "docs", "results": [], "meta": { "returned": 0, "top_k": 10, "next_cursor": "eyJvZmZzZXQiOjEwMH0" } } 19. Embedding Compatibility Rules 19.1. Embedding Space Identification Each embedding space advertised by the server MUST include: * id * dimensions * distance It SHOULD also include: * normalized * provider * model * revision 19.2. Dimensionality If a query_vector length does not match the dimensions declared for the requested embedding space, the server MUST reject the request with 422 Unprocessable Entity. 19.3. Distance Function The server MUST evaluate query vectors according to the distance function declared by the embedding space. Clients MUST NOT assume cosine similarity unless the embedding space declaration says so. 19.4. Normalization If an embedding space declares normalized=true, clients SHOULD send normalized vectors. A server MAY normalize vectors on receipt, but such behavior SHOULD be documented. 19.5. Compatibility Failure Example { "error": "invalid_embedding_dimension", "message": "The supplied query vector has length 1024, but the embedding space requires 3072.", "details": { "embedding_space": "openai:text-embedding-3-large:3072:cosine:v1", "expected_dimensions": 3072, "actual_dimensions": 1024 } } 20. Representation Negotiation 20.1. Return Object The return object requests optional fields in result members. Supported fields MAY include: * ids * metadata * text * semantic_payload * vectors 20.2. Server Behavior If a client requests an unsupported return field, the server MUST either: * ignore the field, or * reject the request with a machine-readable error. The chosen behavior MUST be documented by the deployment. 20.3. Unknown Fields Clients MUST ignore unknown response fields. Servers SHOULD ignore unknown request fields unless doing so would create ambiguous or unsafe behavior. 21. Rate Limiting 21.1. General Servers MAY apply rate limiting. If rate limiting is applied, servers SHOULD expose limit state using HTTP rate limit fields defined in [RFC9333]. 21.2. Suggested Fields Deployments SHOULD consider exposing: * RateLimit-Limit * RateLimit-Remaining * RateLimit-Reset 21.3. Related Specifications Deployments are encouraged to align with existing HTTP rate limit field specifications rather than inventing deployment-specific headers when interoperable fields are sufficient. 22. Schema and OpenAPI Conformance 22.1. Companion Artifacts This specification MAY be accompanied by JSON Schema, OpenAPI, or similar machine-readable descriptions of AIDRE messages and resources. 22.2. Normative Authority Such artifacts are useful for tooling, validation, code generation, testing, and documentation. However, unless explicitly stated otherwise, this document remains the normative definition of the protocol. If an OpenAPI description, JSON Schema, or other companion artifact conflicts with this document, this document takes precedence. 22.3. Conformance Guidance Deployments SHOULD ensure that published companion artifacts are kept consistent with the protocol version advertised in the discovery document. 23. Versioning and Extensibility 23.1. Discovery Version The discovery document MUST contain a version member. 23.2. Compatibility Rule Backward-compatible additions SHOULD be made by adding new fields. Clients MUST ignore unknown fields. Backward-incompatible changes SHOULD be introduced through a new protocol version. 23.3. Extension Members Deployments MAY define extension members. Extension members SHOULD use names that minimize collision risk. 24. HTTP Exchange Examples 24.1. Discovery Request Example Request: GET /.well-known/ai-discovery HTTP/1.1 Host: example.com Accept: application/aidre+json, application/json Response: HTTP/1.1 200 OK Content-Type: application/aidre+json Cache-Control: max-age=300 ETag: "disc-v1-9f2a" { "version": "1", "service": "AIDRE", "organization": "Example Corp", "endpoints": { "collections": "https://ai.example.com/collections", "search": "https://ai.example.com/search", "chunk": "https://ai.example.com/chunks/{id}" }, "capabilities": { "query_text": true, "query_vector": true, "return_text": true, "return_semantic_payload": true, "return_vectors": false, "delta_sync": false }, "embedding_spaces": [ { "id": "openai:text-embedding-3-large:3072:cosine:v1", "dimensions": 3072, "distance": "cosine", "normalized": true } ], "auth": { "type": "none" } } 24.2. Search Request Example Request: POST /search HTTP/1.1 Host: ai.example.com Accept: application/aidre+json Content-Type: application/aidre+json { "query_vector": [0.013, -0.028, 0.442], "embedding_space": "openai:text-embedding-3-large:3072:cosine:v1", "collection": "docs", "top_k": 3, "return": { "ids": true, "metadata": true, "text": false, "semantic_payload": true, "vectors": false } } Response: HTTP/1.1 200 OK Content-Type: application/aidre+json RateLimit-Limit: 100 RateLimit-Remaining: 99 RateLimit-Reset: 60 { "request_id": "req_7f4c1c", "collection": "docs", "results": [ { "id": "doc_123#chunk_7", "score": 0.92, "source": { "url": "https://example.com/docs/sso/setup", "title": "SSO Setup", "section": "Prerequisites" }, "metadata": { "updated_at": "2026-04-01T10:00:00Z", "canonical": true, "visibility": "public", "content_hash": "sha256:abcd..." }, "semantic_payload": { "type": "application/semantic+json", "claims": [ { "subject": "SAML SSO", "predicate": "requires", "object": "domain verification" } ] } } ], "meta": { "returned": 1, "top_k": 3 } } 24.3. Compatibility Error Example Request: POST /search HTTP/1.1 Host: ai.example.com Accept: application/aidre+json Content-Type: application/aidre+json { "query_vector": [0.1, 0.2], "embedding_space": "openai:text-embedding-3-large:3072:cosine:v1", "collection": "docs" } Response: HTTP/1.1 422 Unprocessable Entity Content-Type: application/aidre+json { "error": "invalid_embedding_dimension", "message": "The supplied query vector has length 2, but the embedding space requires 3072.", "details": { "embedding_space": "openai:text-embedding-3-large:3072:cosine:v1", "expected_dimensions": 3072, "actual_dimensions": 2 }, "request_id": "req_err_11" } 25. ABNF Summary The following ABNF, using the notation from [RFC5234], summarizes the principal AIDRE JSON member names. It is descriptive of member names and presence expectations and is not a complete JSON grammar. query-member = %s"query" query-vector-member = %s"query_vector" embedding-space-member = %s"embedding_space" collection-member = %s"collection" topk-member = %s"top_k" return-member = %s"return" results-member = %s"results" metadata-member = %s"metadata" request-id-member = %s"request_id" next-cursor-member = %s"next_cursor" 26. References 26.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119. [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", RFC 8174. [RFC5234] Crocker, D., and P. Overell, "Augmented BNF for Syntax Specifications", RFC 5234. [RFC8615] Nottingham, M., "Well-Known Uniform Resource Identifiers (URIs)", RFC 8615. 26.2. Informative References [RFC9460] Schwartz, B., "Service Binding and Parameter Specification via the DNS", RFC 9460. [RFC9333] Polli, R. and M. Martinez, "The RateLimit Fields for HTTP", RFC 9333. 27. Additional Design Considerations 27.1. Why Text Query Remains Mandatory Text query support remains mandatory for baseline interoperability. Not all clients and deployments will share an embedding space, and a text path allows a client to interact with an AIDRE deployment even when vector compatibility has not been negotiated. 27.2. Why Vector Return Is Optional Vector-native querying and vector disclosure are distinct concerns. A deployment may wish to support vector-native query semantics while refusing to disclose stored vectors. This separation allows the protocol to reduce redundant text processing without forcing a single disclosure model. Author's Address Fatih Batum Istanbul, Turkiye Email: fatih@batum.gen.tr