| Internet-Draft | ai-txt | June 2026 |
| Cardillo | Expires 14 December 2026 | [Page] |
This document requests registration of two Well-Known URIs under the "/.well-known/" path: "ai.txt" and "ai.json". These URIs define a structured, machine-readable file in which a site operator can declare AI usage preferences (training, scraping, indexing, caching), licensing terms, required attribution, and per-agent rules.¶
"ai.txt" is positioned as a structured attachment surface for AI usage preferences, in addition to the robots.txt and HTTP-header carriage mechanisms proposed by the IETF AIPREF working group. As the AIPREF vocabulary stabilizes, "ai.txt" can carry those preferences in a typed, single-file form alongside the broader licensing, attribution, and policy declarations defined in this document.¶
This format is complementary to "robots.txt" [ROBOTS]. Where "robots.txt" can block crawling entirely, "ai.txt" expresses nuanced policies such as "you may crawl but not train on this content" -- a distinction that "robots.txt" alone cannot express.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 14 December 2026.¶
Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
AI systems increasingly interact with website content in ways that go beyond traditional crawling: training language models on web content, indexing content for retrieval-augmented generation, caching content for future reference, and scraping data for analysis. Website operators currently have no standard, machine-readable mechanism to communicate their policies regarding these AI-specific uses.¶
"robots.txt" [ROBOTS] can block crawling entirely, but it cannot express nuanced policies. A newspaper may wish to allow crawling (for search indexing) while prohibiting training (for model development). A blog may wish to allow training under a specific license. A corporation may wish to allow some AI agents while blocking others.¶
"ai.txt" addresses this gap. It is a policy declaration file, served at a well-known location, that communicates to AI systems:¶
Whether content may be used for AI model training¶
Whether content may be scraped, indexed, or cached¶
Under what license terms AI training is permitted¶
Which AI agents are permitted and under what conditions¶
What attribution and disclosure requirements apply¶
What compliance and audit expectations exist¶
"ai.txt" is complementary to, and does not replace, existing standards:¶
"robots.txt" [ROBOTS]: Declares crawling restrictions. "ai.txt" adds training, licensing, and per-agent policy declarations that "robots.txt" cannot express. Both files may coexist.¶
"agents.txt": Declares AI agent capabilities (endpoints, protocols, auth). "ai.txt" declares policy. A site may use both: "agents.txt" to declare what agents can DO, and "ai.txt" to declare what is ALLOWED.¶
"security.txt" [RFC9116]: Declares security vulnerability disclosure contacts. It follows a similar well-known file pattern but covers a different domain.¶
The IETF AIPREF working group is developing a vocabulary [AIPREF-VOCAB] for expressing AI usage preferences and an attachment specification [AIPREF-ATTACH] for carrying those preferences via robots.txt directives and HTTP response headers.¶
"ai.txt" complements that work; it does not replace it. AIPREF defines the vocabulary (the set of preference terms and their semantics) and two carriage mechanisms (robots.txt and HTTP headers). "ai.txt" is a third carriage mechanism -- a single, structured, typed file -- that provides three properties not addressed by robots.txt attachment or per-response headers:¶
Carriage of preferences for an entire site, independent of any individual response or robots.txt path block.¶
A single audit surface -- one file at one URL -- that can be fetched once and cached for site-wide preference resolution.¶
A place to declare preferences alongside related declarations (licensing, attribution, per-agent rate limits) that fall outside AIPREF's scope.¶
When the AIPREF vocabulary stabilizes, "ai.txt" implementations SHOULD use AIPREF preference names where they apply. Implementations SHOULD treat the preferences carried in "ai.txt" as equivalent in authority to the same preferences carried via the AIPREF robots.txt or HTTP-header mechanisms. Where multiple carriers disagree for the same site and resource, conflict resolution is out of scope for this document and may be addressed by future AIPREF output.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
The "ai.txt" file MUST be served at:¶
https://example.com/.well-known/ai.txt¶
The file MUST be served over HTTPS in production deployments. HTTP is permitted only in development or testing environments.¶
The file MUST be served with Content-Type "text/plain; charset=utf-8".¶
The "ai.txt" file uses a block-based key-value format inspired by "robots.txt". Each line contains a key, a colon, and a value. Lines beginning with "#" are comments. Indented lines (two or more spaces, or one or more tabs) belong to the preceding block.¶
A minimal "ai.txt" file:¶
# ai.txt
Spec-Version: 1.0
Site-Name: My Blog
Site-URL: https://myblog.com
Training: deny
¶
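The line rules described above (key, colon, value; "#" comments; indented lines attaching to the preceding block) can be sketched as a small parser. This is an illustrative sketch, not a reference implementation. The function name `parse_ai_txt` and the returned structure are hypothetical, and the sketch assumes that only "Agent" lines open a block and that an indented "#" line is also a comment; neither point is stated in this document.

```python
from typing import Dict, List, Tuple

Fields = Dict[str, List[str]]


def parse_ai_txt(text: str) -> Tuple[Fields, List[dict]]:
    """Parse ai.txt text into (top-level fields, list of Agent blocks)."""
    top: Fields = {}
    blocks: List[dict] = []
    current = None  # the block that indented lines attach to, if any

    for raw in text.splitlines():
        stripped = raw.strip()
        # Skip blank lines and comments (assumption: leading whitespace
        # before "#" is tolerated).
        if not stripped or stripped.startswith("#"):
            continue
        # Split on the first colon only, so URL values keep their colons.
        key, sep, value = stripped.partition(":")
        if not sep:
            continue  # not a key-value line; ignored in this sketch
        key, value = key.strip(), value.strip()

        # Two or more spaces, or one or more tabs, mark a block member.
        indented = raw.startswith(("\t", "  "))
        if indented and current is not None:
            current["fields"].setdefault(key, []).append(value)
        elif key.lower() == "agent":
            current = {"name": value, "fields": {}}
            blocks.append(current)
        else:
            current = None  # an unindented field ends the open block
            top.setdefault(key, []).append(value)
    return top, blocks
```

Values are kept as lists because some fields may legitimately repeat; a consumer that expects a single value can take the last entry.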
Site-Name: Human-readable name of the site or service.¶
Site-URL: Canonical HTTPS URL of the site.¶
Spec-Version: Version of the "ai.txt" specification the file conforms to (e.g., "1.0"). This is a regular field, not a comment.¶
ISO 8601 timestamp of when the file was generated. This is a regular field, not a comment.¶
Brief description of the site.¶
Contact: Contact email for AI policy inquiries.¶
Policy-URL: URL to a human-readable AI policy page.¶
These fields declare site-wide defaults. Each accepts "allow" or "deny". The value "conditional" is valid only for the Training field, where it activates the per-path rules defined in the Training Path Fields section; implementations encountering "conditional" on any other field SHOULD treat it as "deny".¶
Training: Whether AI systems may use content for model training.¶
Scraping: Whether AI agents may scrape or read content.¶
Indexing: Whether AI systems may index content for retrieval.¶
Caching: Whether AI systems may cache content.¶
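The handling of "conditional" described above can be sketched as a small normalization step. The helper name `effective_policy` is hypothetical. Two choices in it are assumptions rather than requirements of this document: values are compared case-insensitively, and unrecognized values fall back to "deny".

```python
def effective_policy(field: str, value: str) -> str:
    """Normalize a site-wide policy value to allow, deny, or conditional."""
    normalized = value.strip().lower()  # assumption: case-insensitive values
    if normalized in ("allow", "deny"):
        return normalized
    # "conditional" is only meaningful on the Training field.
    if normalized == "conditional" and field.strip().lower() == "training":
        return "conditional"
    # "conditional" on any other field SHOULD be treated as "deny";
    # extending that fallback to unknown values is this sketch's assumption.
    return "deny"
```

For example, `effective_policy("Caching", "conditional")` yields `"deny"`, while the same value on `"Training"` is preserved so that the per-path rules can be consulted.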
When Training is "conditional", these fields specify per-path rules:¶
Training-Allow: Glob pattern for paths where training is permitted.¶
Training-Deny: Glob pattern for paths where training is denied.¶
Multiple Training-Allow and Training-Deny lines MAY appear. More specific patterns take precedence.¶
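One way to resolve overlapping Training-Allow and Training-Deny patterns is sketched below. This document does not define how specificity is measured, so the sketch makes three assumptions: a longer pattern counts as more specific, a tie between an allow and a deny pattern resolves to deny, and "*" may match across "/" (the behavior of Python's `fnmatch`). The function name `training_decision` is hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Optional


def training_decision(
    path: str,
    allow_patterns: Iterable[str],
    deny_patterns: Iterable[str],
) -> Optional[str]:
    """Return "allow", "deny", or None when no pattern matches the path."""
    candidates = [(p, "allow") for p in allow_patterns]
    candidates += [(p, "deny") for p in deny_patterns]

    best_rank = None
    best_verdict = None
    for pattern, verdict in candidates:
        if not fnmatchcase(path, pattern):
            continue
        # Rank by pattern length first; on equal length, deny outranks allow.
        rank = (len(pattern), verdict == "deny")
        if best_rank is None or rank > best_rank:
            best_rank, best_verdict = rank, verdict
    return best_verdict
```

A `None` result means no per-path rule applies; what a consumer does in that case is not specified here and is left to the caller.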
Agent blocks declare per-agent policy overrides. The wildcard "*" sets the default for all agents.¶
Agent: *
  Rate-Limit: 60/minute
Agent: ClaudeBot
  Training: allow
  Rate-Limit: 200/minute
Agent: GPTBot
  Training: deny
  Scraping: deny
¶
Agent identifiers SHOULD match the first token of the agent's User-Agent header (case-insensitive).¶
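A minimal sketch of agent-block selection follows. It assumes the first whitespace-delimited token of the User-Agent header has the form "product/version" and compares only the product part, and it assumes that a named block is layered over the "*" block so that unset fields inherit the wildcard default. Both are interpretations, not rules stated in this document. The name `resolve_agent_fields` and the dictionary layout are hypothetical.

```python
from typing import Dict


def resolve_agent_fields(
    user_agent: str,
    blocks: Dict[str, Dict[str, str]],
) -> Dict[str, str]:
    """Merge the wildcard block with the block matching the User-Agent."""
    tokens = user_agent.split()
    # First token, with any "/version" suffix removed, lowercased.
    product = tokens[0].split("/", 1)[0].lower() if tokens else ""

    resolved = dict(blocks.get("*", {}))  # start from the default block
    for name, fields in blocks.items():
        if name != "*" and name.lower() == product:
            resolved.update(fields)  # per-agent values override defaults
            break
    return resolved
```

With the example blocks above, a client sending `ClaudeBot/1.0` would resolve to `Training: allow` with a rate limit of `200/minute`, while an unlisted client would receive only the wildcard rate limit.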
Fields within an Agent block:¶
The JSON companion file MUST be served at:¶
https://example.com/.well-known/ai.json¶
The file MUST be served with Content-Type "application/json; charset=utf-8".¶
The JSON format carries the same information as "ai.txt" in a typed JSON structure suitable for direct consumption by programmatic clients. The "ai.txt" file MAY reference the JSON file via:¶
AI-JSON: https://example.com/.well-known/ai.json¶
A minimal "ai.json" document:¶
{
  "specVersion": "1.0",
  "site": {
    "name": "My Blog",
    "url": "https://myblog.com"
  },
  "policies": {
    "training": "deny",
    "scraping": "allow",
    "indexing": "allow",
    "caching": "allow"
  },
  "agents": {
    "*": {}
  }
}
¶
Field semantics are identical to those defined in Section 2 for the text format, with one structural difference: in the JSON form, the "specVersion" member, the "policies" member (with all four of its "training", "scraping", "indexing", and "caching" members), and the "agents" member are REQUIRED. Defaults that the text format applies implicitly MUST be stated explicitly in JSON documents.¶
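The REQUIRED members listed above lend themselves to a simple structural check. The following sketch only reports which required members are absent; it does not validate value types or allowed values. The function name `missing_required_members` and the dotted-path reporting format are hypothetical.

```python
import json
from typing import List

REQUIRED_TOP_LEVEL = ("specVersion", "policies", "agents")
REQUIRED_POLICIES = ("training", "scraping", "indexing", "caching")


def missing_required_members(document: str) -> List[str]:
    """Return the names of REQUIRED ai.json members that are absent."""
    data = json.loads(document)
    if not isinstance(data, dict):
        return ["<root object>"]

    missing = [name for name in REQUIRED_TOP_LEVEL if name not in data]

    if "policies" in data:
        policies = data["policies"]
        if not isinstance(policies, dict):
            policies = {}  # treat a malformed member as having no entries
        missing += [
            f"policies.{name}" for name in REQUIRED_POLICIES if name not in policies
        ]
    return missing
```

An empty list means the document carries every member this section marks as REQUIRED.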
AI agents and crawlers SHOULD fetch "/.well-known/ai.txt" and/or "/.well-known/ai.json" before interacting with an unfamiliar site.¶
Agents SHOULD prefer the JSON format when both are available.¶
Agents SHOULD cache the policy for the duration declared by the HTTP Cache-Control header, with a minimum TTL of 60 seconds.¶
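The caching rule above can be sketched as a function that derives a time-to-live from the Cache-Control header. The sketch reads only the `max-age` directive and applies the 60-second floor; falling back to 60 seconds when no `max-age` is present is an assumption, and directives such as `no-store` are deliberately not handled. The name `policy_ttl_seconds` is hypothetical.

```python
import re
from typing import Optional

MIN_TTL_SECONDS = 60

# Match "max-age=N" at the start of the header or after a comma or space,
# so that "s-maxage" is not mistaken for it.
_MAX_AGE = re.compile(r'(?:^|[,\s])max-age\s*=\s*"?(\d+)', re.IGNORECASE)


def policy_ttl_seconds(cache_control: Optional[str]) -> int:
    """Return how long a fetched policy may be cached, in seconds."""
    if cache_control:
        match = _MAX_AGE.search(cache_control)
        if match:
            return max(MIN_TTL_SECONDS, int(match.group(1)))
    # Assumption: without a usable max-age, cache for the minimum TTL.
    return MIN_TTL_SECONDS
```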
"ai.txt" is advisory. It declares the site owner's policy. Compliance is expected in good faith but is not enforced by the file itself.¶
Agents SHOULD respect Training declarations by not using content for model training when Training is "deny".¶
Agents SHOULD respect rate limit declarations.¶
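An agent honoring a rate limit needs to turn a declared value into a pacing interval. The examples in this document only show the form "N/minute"; the sketch below additionally accepts "second" and "hour" as units, which is an assumption, and returns `None` for anything it cannot interpret. The name `min_interval_seconds` is hypothetical.

```python
from typing import Optional

# Only "minute" appears in this document; the other units are assumptions.
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def min_interval_seconds(rate_limit: str) -> Optional[float]:
    """Convert a value such as "30/minute" into seconds between requests."""
    count_text, sep, unit = rate_limit.strip().partition("/")
    count_text = count_text.strip()
    unit = unit.strip().lower()
    if not sep or not count_text.isdecimal() or unit not in _UNIT_SECONDS:
        return None  # unrecognized syntax
    count = int(count_text)
    if count == 0:
        return None  # a zero rate gives no usable interval
    return _UNIT_SECONDS[unit] / count
```

An agent could sleep for at least the returned interval between requests to the same site.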
Servers MUST enforce rate limits and access control independently of the declarations in "ai.txt".¶
Policy declarations MUST NOT include actual credentials, tokens, or secrets of any kind.¶
"ai.txt" is advisory; servers MUST enforce policies independently.¶
Agents MUST validate that referenced URLs use HTTPS before following them.¶
Site owners SHOULD review their "ai.txt" periodically to ensure it accurately reflects current policy.¶
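The requirement that agents validate referenced URLs before following them can be sketched with the standard library URL parser. The check below accepts a URL only if its scheme is "https" and it names a host; the function name `is_followable` is hypothetical, and further checks (for example against private address ranges) are out of scope for this sketch.

```python
from urllib.parse import urlsplit


def is_followable(url: str) -> bool:
    """Return True only for absolute HTTPS URLs that name a host."""
    try:
        parts = urlsplit(url.strip())
        # urlsplit lowercases the scheme, so "HTTPS://" is accepted.
        return parts.scheme == "https" and bool(parts.hostname)
    except ValueError:
        return False  # malformed URLs are never followed
```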
This document requests registration of the following Well-Known URI in the "Well-Known URIs" registry established by [RFC8615]:¶
URI suffix: ai.txt¶
Change controller: Kayla Cardillo¶
Specification document(s): This document.¶
Related information: Text-format AI policy declaration file. Allows website operators to declare their AI content policy: training permissions, licensing terms, per-agent rules, and compliance requirements.¶
This document requests registration of the following Well-Known URI in the "Well-Known URIs" registry established by [RFC8615]:¶
# ai.txt - AI Policy Declaration
Spec-Version: 1.0
Site-Name: News Daily
Site-URL: https://newsdaily.com
Contact: ai@newsdaily.com
Policy-URL: https://newsdaily.com/ai-policy
Training: conditional
Scraping: allow
Indexing: allow
Caching: allow
Training-Allow: /articles/free/*
Training-Deny: /articles/premium/*
Training-License: CC-BY-4.0
Training-Fee: https://newsdaily.com/ai-licensing
Agent: *
  Rate-Limit: 30/minute
Agent: ClaudeBot
  Training: allow
  Rate-Limit: 120/minute
Agent: GPTBot
  Training: deny
Attribution: required
AI-Disclosure: required
¶
The "ai.txt" format draws on the design of "robots.txt" [ROBOTS] and "security.txt" [RFC9116] for structural inspiration. The SPDX license identifiers referenced in Training-License are maintained by the Linux Foundation [SPDX].¶