AI.TXT: A Declaration File for AI Usage Preferences, Licensing, and Policy

Internet-Draft	ai-txt	June 2026
Cardillo	Expires 14 December 2026

Abstract

This document requests registration of two Well-Known URIs under the "/.well-known/" path: "ai.txt" and "ai.json". These URIs identify structured, machine-readable files in which a site operator can declare AI usage preferences (training, scraping, indexing, caching), licensing terms, required attribution, and per-agent rules.

"ai.txt" is positioned as a structured attachment surface for AI usage preferences, in addition to the robots.txt and HTTP-header carriage mechanisms proposed by the IETF AIPREF working group. As the AIPREF vocabulary stabilizes, "ai.txt" can carry those preferences in a typed, single-file form alongside the broader licensing, attribution, and policy declarations defined in this document.

This format is complementary to "robots.txt" [ROBOTS]. Where "robots.txt" can block crawling entirely, "ai.txt" expresses nuanced policies such as "you may crawl but not train on this content" -- a distinction that "robots.txt" alone cannot express.

1. Introduction

AI systems increasingly interact with website content in ways that go beyond traditional crawling: training language models on web content, indexing content for retrieval-augmented generation, caching content for future reference, and scraping data for analysis. Website operators currently have no standard, machine-readable mechanism to communicate their policies regarding these AI-specific uses.

"robots.txt" [ROBOTS] can block crawling entirely, but it cannot express nuanced policies. A newspaper may wish to allow crawling (for search indexing) while prohibiting training (for model development). A blog may wish to allow training under a specific license. A corporation may wish to allow some AI agents while blocking others.

"ai.txt" addresses this gap. It is a policy declaration file, served at a well-known location, that communicates to AI systems:

Whether content may be used for AI model training
Whether content may be scraped, indexed, or cached
Under what license terms AI training is permitted
Which AI agents are permitted and under what conditions
What attribution and disclosure requirements apply
What compliance and audit expectations exist

1.1. Relationship to Existing Standards

"ai.txt" is complementary to, and does not replace, existing standards:

robots.txt [ROBOTS]:: Declares crawling restrictions. "ai.txt" adds training, licensing, and per-agent policy declarations that "robots.txt" cannot express. Both files may coexist.
agents.txt:: Declares AI agent capabilities (endpoints, protocols, auth). "ai.txt" declares policy. A site may use both: "agents.txt" to declare what agents can DO, and "ai.txt" to declare what is ALLOWED.
security.txt [RFC9116]:: Declares security vulnerability disclosure contacts. Similar well-known file pattern; different domain.

1.2. Relationship to AIPREF

The IETF AIPREF working group is developing a vocabulary [AIPREF-VOCAB] for expressing AI usage preferences and an attachment specification [AIPREF-ATTACH] for carrying those preferences via robots.txt directives and HTTP response headers.

"ai.txt" complements that work; it does not replace it. AIPREF defines the vocabulary (the set of preference terms and their semantics) and two carriage mechanisms (robots.txt and HTTP headers). "ai.txt" is a third carriage mechanism -- a single, structured, typed file -- that provides three properties not addressed by robots.txt attachment or per-response headers:

Carriage of preferences for an entire site, independent of any individual response or robots.txt path block.
A single audit surface -- one file at one URL -- that can be fetched once and cached for site-wide preference resolution.
A place to declare preferences alongside related declarations (licensing, attribution, per-agent rate limits) that fall outside AIPREF's scope.

When the AIPREF vocabulary stabilizes, "ai.txt" implementations SHOULD use AIPREF preference names where they apply. Implementations SHOULD treat the preferences carried in "ai.txt" as equivalent in authority to the same preferences carried via the AIPREF robots.txt or HTTP-header mechanisms. Where multiple carriers disagree for the same site and resource, conflict resolution is out of scope for this document and may be addressed by future AIPREF output.

1.4. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2. The "ai.txt" Well-Known URI

2.1. Location

The "ai.txt" file MUST be served at the path "/.well-known/ai.txt" on the site's origin, for example:

https://example.com/.well-known/ai.txt

The file MUST be served over HTTPS in production deployments. HTTP is permitted only in development or testing environments.

The file MUST be served with Content-Type "text/plain; charset=utf-8".
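The location rules above can be sketched on the client side as follows. This is an illustration only: the helper names "well_known_ai_txt_url" and "content_type_ok" are hypothetical, the "allow_http" flag is an assumption standing in for "development or testing environments", and comparing the media type case-insensitively is an assumption based on general HTTP practice rather than a rule stated in this document.

```python
from urllib.parse import urlsplit, urlunsplit


def well_known_ai_txt_url(site_url: str, allow_http: bool = False) -> str:
    """Build the ai.txt URL for a site, requiring HTTPS unless allow_http is set."""
    parts = urlsplit(site_url)
    if parts.scheme != "https" and not (allow_http and parts.scheme == "http"):
        raise ValueError(f"unsupported scheme for ai.txt: {parts.scheme!r}")
    if not parts.netloc:
        raise ValueError("site URL has no host")
    # Discard any path, query, or fragment: the file lives at a fixed path.
    return urlunsplit((parts.scheme, parts.netloc, "/.well-known/ai.txt", "", ""))


def content_type_ok(header_value: str) -> bool:
    """Check for text/plain with charset=utf-8 (assumption: case-insensitive)."""
    media_type, _, params = header_value.partition(";")
    if media_type.strip().lower() != "text/plain":
        return False
    for param in params.split(";"):
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset":
            return value.strip().strip('"').lower() == "utf-8"
    return False  # no charset parameter present
```

A crawler could call "well_known_ai_txt_url" once per site and reject responses for which "content_type_ok" returns False before attempting to parse the body.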

2.2. Format

The "ai.txt" file uses a block-based key-value format inspired by "robots.txt". Each line contains a key, a colon, and a value. Lines beginning with "#" are comments. Indented lines (two or more spaces, or one or more tabs) belong to the preceding block.

A minimal "ai.txt" file:

# ai.txt
Spec-Version: 1.0
Site-Name: My Blog
Site-URL: https://myblog.com
Training: deny
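
The format rules above could be read by a parser along the following lines. This is a minimal sketch, not a reference implementation: the function name "parse_ai_txt" and the shape of the returned dictionary are hypothetical, lines without a colon are silently skipped (an assumption, since this section does not say how to handle them), and site-level values are collected into lists so that fields allowed to repeat are not lost.

```python
def parse_ai_txt(text: str) -> dict:
    """Parse ai.txt text into {"site": {key: [values]}, "agents": {name: {key: value}}}."""
    result = {"site": {}, "agents": {}}
    current_block = None  # fields of the Agent block being filled, if any
    for raw_line in text.splitlines():
        stripped = raw_line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment
        key, sep, value = stripped.partition(":")
        if not sep:
            continue  # assumption: skip lines that are not "key: value"
        key, value = key.strip(), value.strip()
        # Two or more spaces, or at least one tab, marks a block member.
        indented = raw_line.startswith(("  ", "\t"))
        if indented and current_block is not None:
            current_block[key] = value
        elif key.lower() == "agent":
            current_block = result["agents"].setdefault(value, {})
        else:
            current_block = None  # an unindented field ends the block
            result["site"].setdefault(key, []).append(value)
    return result
```

Splitting on the first colon only keeps URL values such as "https://myblog.com" intact.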

2.3. Site Fields

Site-Name (REQUIRED):: Human-readable name of the site or service.
Site-URL (REQUIRED):: Canonical HTTPS URL of the site.
Spec-Version (OPTIONAL):: Version of the "ai.txt" specification the file conforms to (e.g., "1.0"). This is a regular field, not a comment.
Generated-At (OPTIONAL):: ISO 8601 timestamp of when the file was generated. This is a regular field, not a comment.
Description (OPTIONAL):: Brief description of the site.
Contact (OPTIONAL):: Contact email for AI policy inquiries.
Policy-URL (OPTIONAL):: URL to a human-readable AI policy page.

2.4. Content Policy Fields

These fields declare site-wide defaults. Each accepts "allow" or "deny". The value "conditional" is valid only for the Training field, where it activates the per-path rules defined in the Training Path Fields section; implementations encountering "conditional" on any other field SHOULD treat it as "deny".

Training (OPTIONAL, default "deny"):: Whether AI systems may use content for model training.
Scraping (OPTIONAL, default "allow"):: Whether AI agents may scrape or read content.
Indexing (OPTIONAL, default "allow"):: Whether AI systems may index content for retrieval.
Caching (OPTIONAL, default "allow"):: Whether AI systems may cache content.
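
The defaults and the "conditional" rule above might be applied as in the sketch below. The names "DEFAULTS" and "resolve_site_policy" are hypothetical, and raising an error on an unrecognized value is an assumption: this section does not say how such values should be handled.

```python
DEFAULTS = {
    "Training": "deny",
    "Scraping": "allow",
    "Indexing": "allow",
    "Caching": "allow",
}


def resolve_site_policy(fields: dict) -> dict:
    """Fill in site-wide defaults and normalize misplaced "conditional" values."""
    policy = {}
    for name, default in DEFAULTS.items():
        value = fields.get(name, default).strip()
        if value == "conditional" and name != "Training":
            value = "deny"  # "conditional" is only meaningful for Training
        elif value not in ("allow", "deny", "conditional"):
            # Assumption: reject values the section does not define.
            raise ValueError(f"unrecognized value for {name}: {value!r}")
        policy[name] = value
    return policy
```

An empty input yields the four documented defaults, so a file that sets only Site-Name and Site-URL still resolves to a complete policy.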

2.5. Training Path Fields

When Training is "conditional", these fields specify per-path rules:

Training-Allow (OPTIONAL):: Glob pattern for paths where training is permitted.
Training-Deny (OPTIONAL):: Glob pattern for paths where training is denied.

Multiple Training-Allow and Training-Deny lines MAY appear. More specific patterns take precedence.
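
One possible reading of the per-path rules is sketched below. Several choices here are assumptions, because this section does not define them: glob matching follows Python's "fnmatch" semantics (where "*" also matches "/"), "more specific" is measured as the number of non-wildcard characters in a pattern, a tie goes to deny, and a path matching no pattern is treated as denied, mirroring the Training default. The function names are hypothetical.

```python
from fnmatch import fnmatchcase


def _specificity(pattern: str) -> int:
    """Assumed specificity measure: count of literal (non-wildcard) characters."""
    return sum(1 for ch in pattern if ch not in "*?[]")


def training_allowed(path: str, allow_patterns: list, deny_patterns: list) -> bool:
    """Decide training permission for a path when Training is "conditional"."""
    best = None  # (specificity, is_deny) of the most specific match so far
    for is_deny, patterns in ((False, allow_patterns), (True, deny_patterns)):
        for pattern in patterns:
            if fnmatchcase(path, pattern):
                candidate = (_specificity(pattern), is_deny)
                # Tuple comparison: higher specificity wins; on a tie,
                # is_deny=True sorts above False, so deny wins.
                if best is None or candidate > best:
                    best = candidate
    if best is None:
        return False  # assumption: unmatched paths fall back to deny
    return not best[1]
```

With "Training-Allow: /blog/*" and "Training-Deny: /blog/private/*", the path "/blog/post-1" is allowed while "/blog/private/notes" is denied, because the deny pattern has more literal characters.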

2.6. Licensing Fields

Training-License (OPTIONAL):: SPDX license identifier [SPDX] for AI training use (e.g., "CC-BY-4.0").
Training-Fee (OPTIONAL):: URL to commercial licensing or pricing page.

2.7. Agent Blocks

Agent blocks declare per-agent policy overrides. The wildcard "*" sets the default for all agents.

Agent: *
  Rate-Limit: 60/minute

Agent: ClaudeBot
  Training: allow
  Rate-Limit: 200/minute

Agent: GPTBot
  Training: deny
  Scraping: deny

Agent identifiers SHOULD match the first token of the agent's User-Agent header (case-insensitive).
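
Agent matching and override layering could look like the following sketch. It assumes that the "first token" is the text before the first whitespace with any "/version" suffix removed, and that policy is layered as site-wide values, then the "*" block, then the matching named block. Neither detail is spelled out in this section, and the function names are hypothetical.

```python
def agent_token(user_agent: str) -> str:
    """Return the lowercased first token of a User-Agent value, without its version."""
    if not user_agent.strip():
        return ""
    first = user_agent.split(None, 1)[0]
    # Assumption: "ClaudeBot/1.0" should match the identifier "ClaudeBot".
    return first.split("/", 1)[0].lower()


def effective_policy(user_agent: str, site_policy: dict, agent_blocks: dict) -> dict:
    """Layer site policy, the "*" block, and the matching agent block, in that order."""
    effective = dict(site_policy)
    effective.update(agent_blocks.get("*", {}))
    token = agent_token(user_agent)
    for name, overrides in agent_blocks.items():
        if name != "*" and name.lower() == token:
            effective.update(overrides)
    return effective
```

Applied to the example blocks above, a request from "ClaudeBot/1.0" resolves to "Training: allow" with a rate limit of "200/minute", while an unlisted agent inherits the "*" rate limit of "60/minute".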

Fields within an Agent block:

Training, Scraping, Indexing, Caching: Override site-wide policy
Rate-Limit: Advisory rate limit in "N/window" format, where "window" is one of "second", "minute", "hour", or "day" (e.g., "60/minute")
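
A Rate-Limit value in the "N/window" form could be parsed as sketched here. The function names are hypothetical, and treating N as a positive integer is an assumption; the section gives only examples such as "60/minute".

```python
import re

_WINDOW_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}
# Assumption: N is a positive integer with no leading zeros.
_RATE_LIMIT = re.compile(r"^([1-9]\d*)/(second|minute|hour|day)$")


def parse_rate_limit(value: str) -> tuple:
    """Return (count, window_in_seconds) for a value such as "60/minute"."""
    match = _RATE_LIMIT.match(value.strip())
    if match is None:
        raise ValueError(f"malformed Rate-Limit value: {value!r}")
    return int(match.group(1)), _WINDOW_SECONDS[match.group(2)]


def min_interval_seconds(value: str) -> float:
    """Average spacing between requests that stays within the advisory limit."""
    count, window = parse_rate_limit(value)
    return window / count
```

Because the limit is advisory, a well-behaved agent might simply sleep for "min_interval_seconds" between requests rather than track a sliding window.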

2.8. Content Requirement Fields

Attribution (OPTIONAL):: Whether AI outputs must attribute the source. One of: "required", "recommended", "none".
AI-Disclosure (OPTIONAL):: Whether AI-generated content derived from this site must be disclosed as AI-generated. One of: "required", "recommended", "none".

2.9. Compliance Fields

Audit (OPTIONAL):: Whether AI agents must provide audit receipts. One of: "required", "optional", "none".
Audit-Format (OPTIONAL):: Expected audit format identifier (e.g., "rer-artifact/0.1").
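
The enumerated values in the Content Requirement and Compliance fields can be checked with a small table, as in this sketch. The names "ALLOWED_VALUES" and "invalid_requirement_fields" are hypothetical, and what a consumer should do with an invalid value is not defined here, so the function only reports it. Note that Audit accepts "optional" where the other two fields accept "recommended".

```python
ALLOWED_VALUES = {
    "Attribution": {"required", "recommended", "none"},
    "AI-Disclosure": {"required", "recommended", "none"},
    "Audit": {"required", "optional", "none"},
}


def invalid_requirement_fields(fields: dict) -> list:
    """Return the sorted names of present fields whose value is not in the enumeration."""
    return sorted(
        name
        for name, allowed in ALLOWED_VALUES.items()
        if name in fields and fields[name] not in allowed
    )
```

Fields that are absent are not reported, since all three are OPTIONAL.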

3. The "ai.json" Well-Known URI

3.1. Location

The JSON companion file MUST be served at the path "/.well-known/ai.json" on the site's origin, for example:

https://example.com/.well-known/ai.json

The file MUST be served with Content-Type "application/json; charset=utf-8".

3.2. Format

The JSON format contains equivalent information to "ai.txt" in a typed JSON structure suitable for direct consumption by programmatic clients. The "ai.txt" file MAY reference the JSON file via:

AI-JSON: https://example.com/.well-known/ai.json

A minimal "ai.json" document:

{
  "specVersion": "1.0",
  "site": {
    "name": "My Blog",
    "url": "https://myblog.com"
  },
  "policies": {
    "training": "deny",
    "scraping": "allow",
    "indexing": "allow",
    "caching": "allow"
  },
  "agents": {
    "*": {}
  }
}

Field semantics are identical to those defined in Section 2 for the text format, with one structural difference: in the JSON form, the "specVersion" member, the "policies" member (with all four of its "training", "scraping", "indexing", and "caching" members), and the "agents" member are REQUIRED. Defaults that the text format applies implicitly MUST be stated explicitly in JSON documents.
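
The REQUIRED members listed above could be checked as in the following sketch. The function name "missing_required_members" and the dotted notation for nested members are hypothetical; the check covers only the members this section names as REQUIRED and does not validate their values.

```python
import json

REQUIRED_POLICIES = ("training", "scraping", "indexing", "caching")


def missing_required_members(document: dict) -> list:
    """List REQUIRED ai.json members that are absent from a parsed document."""
    missing = [
        name for name in ("specVersion", "policies", "agents") if name not in document
    ]
    policies = document.get("policies")
    if isinstance(policies, dict):
        missing += [
            f"policies.{name}" for name in REQUIRED_POLICIES if name not in policies
        ]
    return missing


def check_ai_json(text: str) -> list:
    """Parse ai.json text and report missing REQUIRED members."""
    document = json.loads(text)
    if not isinstance(document, dict):
        raise ValueError("ai.json must contain a JSON object")
    return missing_required_members(document)
```

The minimal document shown above passes with an empty result, whereas a document that relies on the text format's implicit defaults is reported as incomplete.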

6. IANA Considerations

6.1. Well-Known URI Registration: "ai.txt"

This document requests registration of the following Well-Known URI in the "Well-Known URIs" registry established by [RFC8615]:

URI suffix:: ai.txt
Change controller:: Kayla Cardillo
Specification document(s):: This document.
Related information:: Text-format AI policy declaration file. Allows website operators to declare their AI content policy: training permissions, licensing terms, per-agent rules, and compliance requirements.

6.2. Well-Known URI Registration: "ai.json"

This document requests registration of the following Well-Known URI in the "Well-Known URIs" registry established by [RFC8615]:

URI suffix:: ai.json
Change controller:: Kayla Cardillo
Specification document(s):: This document.
Related information:: JSON-format AI policy declaration file. Companion format to ai.txt.

7. References

7.1. Normative References

[RFC2119]: Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC8174]: Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.
[RFC8615]: Nottingham, M., "Well-Known Uniform Resource Identifiers (URIs)", RFC 8615, DOI 10.17487/RFC8615, May 2019, <https://www.rfc-editor.org/rfc/rfc8615>.
[RFC9110]: Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, June 2022, <https://www.rfc-editor.org/rfc/rfc9110>.

7.2. Informative References

[AIPREF-ATTACH]: "Attaching AI Usage Preferences to Content", Work in Progress, Internet-Draft, draft-ietf-aipref-attach, 2026, <https://datatracker.ietf.org/doc/draft-ietf-aipref-attach/>.
[AIPREF-VOCAB]: "A Vocabulary for Expressing AI Usage Preferences", Work in Progress, Internet-Draft, draft-ietf-aipref-vocab, 2026, <https://datatracker.ietf.org/doc/draft-ietf-aipref-vocab/>.
[CF-CONTENT-SIGNALS]: "Cloudflare Content Signals Policy", 2025, <https://blog.cloudflare.com/content-signals-policy/>.
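[RFC9116]: Foudil, E. and Y. Shafranovich, "A File Format to Aid in Security Vulnerability Disclosure", RFC 9116, DOI 10.17487/RFC9116, April 2022, <https://www.rfc-editor.org/rfc/rfc9116>.
[ROBOTS]: Koster, M., Illyes, G., Zeller, H., and L. Sassman, "Robots Exclusion Protocol", RFC 9309, DOI 10.17487/RFC9309, September 2022, <https://www.rfc-editor.org/rfc/rfc9309>.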
[SPAWNING-AITXT]: "ai.txt -- Generate ai.txt files for your website", 2023, <https://site.spawning.ai/spawning-ai-txt>.
[SPDX]: "SPDX License List", 2024, <https://spdx.org/licenses/>.
[TDMREP]: "TDM Reservation Protocol", 2022, <https://www.w3.org/community/tdmrep/>.