Design Considerations and Profile for HTTP APIs Consumed by AI Agents

Design Considerations and Profile for HTTP APIs Consumed by AI Agents Independent

Bengaluru India gaikwad.madhav@gmail.com

General AI agents API design tool calling Model Context Protocol OpenAPI HTTP AI agents are a fast-growing kind of HTTP API client. A human developer reads an API's documentation once and then writes code that calls it. An AI agent works from the machine-readable description each time it plans a step. It works within a limited amount of text it can hold at once. It retries often. It picks operations by matching their descriptions to a goal. An API built mainly for human developers can lead an agent into failures that are easy to avoid, such as repeating a write, running out of room for the task, or getting stuck on an error it cannot recover from. This document lists the properties that make an HTTP API easy for AI agents to use, including APIs reached through tool-calling layers such as the Model Context Protocol. It gathers these properties into a profile of existing HTTP and API-description mechanisms, with a checklist at the end. It covers the API and its description. It does not define new ways to identify, authenticate, or authorize an agent. That work is happening elsewhere and is referenced here for context.

Introduction Autonomous software agents built on large language models are starting to call HTTP APIs. Sometimes the agent forms the HTTP request itself. More often it goes through a tool-calling layer that presents each API operation as a callable "tool". The Model Context Protocol is one common example of such a layer. Many of its servers are generated automatically from an existing API description such as an OpenAPI document . One property matters most. The description of the API, and the data the API returns, both become input to the agent's own reasoning. Several single points in this document are also ordinary good API practice: stable operation names, bounded pages, structured errors, idempotency. This document explains why each one matters more when the client is an agent. It adds the guidance that is specific to agents. It collects the result as a profile that a provider can apply and check against. This document does not assume that an agent is reliable or well-behaved. It treats the agent as a client whose behavior is shaped by the shape of the API.

Two Layers There are two layers to keep separate. The API layer is the HTTP API and its machine-readable description. The tool layer is the surface that a tool-calling protocol or generator builds from that description. This document is about the API layer. A good API layer is the most reusable way to get a good tool layer. The tool layer is increasingly handled by its own protocols.

API Styles The advice here is written for resource-oriented HTTP APIs, where each operation acts on an identified resource. Most of it also applies to query-based APIs such as GraphQL and to remote-procedure-call styles such as gRPC. A few points change. A query-based API lets the client ask for exactly the fields it needs, which handles the over-fetching concern in on its own. In return it adds query-cost and query-depth concerns that an agent-facing deployment SHOULD limit. A reader using this document with another style should map each point to the matching idea in that style.

Scope and Non-Goals This document is informational. It lists considerations, gives non-normative recommendations, and assembles a profile and a checklist. It does not define a protocol or a wire format. These topics are out of scope:

How an agent proves its identity, authenticates, or is granted authority by a user. That work is active in the IETF, including OAuth 2.0 Token Exchange , on-behalf-of authorization for agents , and wider frameworks . explains the connection.
The internal design of tool-calling protocols, including how those protocols define tools, resources, and prompts.
Agent-side topics such as planning, reasoning traces, and model evaluation.

Requirements Language The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 when, and only when, they appear in all capitals, as shown here. This document is informational. These words give guidance to API authors. They are not protocol requirements.

Terminology

Agent:: An autonomous software system, usually built on a large language model, that picks and calls API operations to reach a goal.
Tool:: An API operation as it is shown to an agent by a tool-calling layer. It has a name, a short description, and an input schema.
API description:: A machine-readable definition of an API, such as an OpenAPI document , that a tool layer can be built from.
Context window:: The limited amount of text the agent's model can hold and read at one time. Operation descriptions and response bodies use up this space.
Affordance:: A machine-readable part of a request or response that tells the agent what it can or should do, in a form it can act on without reading prose.

How an Agent Uses an API The advice below follows from a few facts about how an agent uses an API. Stating them shows where the advice applies.

The description is input. The agent is shown operation names, descriptions, and schemas while it plans and picks a tool. Their wording decides which operation it picks and how it fills in the arguments.
Selection is by similarity. The agent picks an operation because its description reads close to the goal. It does not follow fixed logic. Two operations that read alike cause wrong-operation and wrong-argument mistakes.
Space is limited. Descriptions and responses share one budget of text with the task itself. Large or low-value payloads crowd out the task and make later choices worse.
Retries are common. An agent retries after a timeout or an unclear result. It may also send a request again as its plan changes.
Writes are easy to trigger. An agent can call an operation that changes state while it is exploring. So an unintended or repeated write costs more than it would for a human-driven client.

These facts apply most directly when operation descriptions, schemas, error messages, and response bodies reach a language model or a model-guided planner. When something in between filters, checks, or shortens the data before the model sees it, the space and selection effects shrink but do not disappear. The retry and side-effect effects usually remain.

Prefer Signals the Agent Can Act On This is the main idea of the document. An agent-facing API SHOULD give the agent machine-readable signals it can act on. It SHOULD avoid relying on prose that the agent has to read and interpret. A sentence such as "call this carefully" or "try again later" has to be understood by the model, and the model follows such sentences unevenly. The same intent stated as a field or a flag can be acted on directly. This swap shows up throughout the profile:

an error carries a retryable flag the agent can read ();
a response carries links to the operations that are valid next steps ();
a write accepts a dry-run flag that previews its effect ();
a schema lists allowed values as a fixed set ();
a high-risk operation carries metadata that says it needs confirmation ().

The API Description Is the Contract When an API is reached through a generated tool layer, its machine-readable description is the single source the tools are built from. Anything missing from the description is missing from the tool. So treat the description as a main deliverable. Do not treat it as documentation written after the code.

Give every operation a stable, meaningful identifier, for example the OpenAPI operationId. Identifiers generated from the URL path produce tool names the agent picks badly.
Give each operation a short summary that sets it apart from similar ones. Give it a longer description that says when to use it and when not to. Wording such as "Use this when ..." and "Do not use this when ..." works well. It heads off the wrong-operation mistake from . shows this.
Put the most important information first in the description. The agent may not read all of it.
State side effects in the description. Do not leave them implied by the HTTP method.
Type and describe every parameter and response field. List allowed values as an enumeration, which is a fixed set of permitted values. Set additionalProperties to false on objects, or use the equivalent, so the agent cannot add invented parameters. Include a real example.
Put idempotency support, pagination limits, field-selection or verbosity controls, and deprecation status in the description itself. The agent acts on the description. Guidance kept only in human documentation does not reach it.
Keep the description short and free of repeated prose. Its text uses up the agent's space budget.
Test the description by running an agent against the API and reading where it picks the wrong operation or fills in a bad argument. These wording problems are hard to predict any other way.

Be Consistent and Predictable An agent learns patterns and reuses them. Anything inconsistent across the API makes it guess wrong. Keep naming, identifier formats, pagination, error shape, and authentication the same across the whole API. For example, a status value that is pending on one operation, IN_PROGRESS on another, and done on a third will cause errors. When an operation only works in certain states of a resource, document the states and the moves between them. Then the agent can work out which operations are valid without guessing. Keep enumerated values stable. The agent may have seen and remembered them in an earlier call.

The Set of Operations You Expose A complete resource-oriented API does not become a good tool surface on its own. Turning every endpoint into its own tool gives the agent a surface that is large, low-level, and hard to work through.

Offer composite operations that finish a whole unit of work in one call, where the steps behind them are common, safe, and well-bounded. This saves the agent from chaining several calls, where each call is a separate reasoning step. Keep resource meaning intact where you can. Keep authorization, audit, and partial-failure behavior visible.
Curate the operations you expose. A small set of well-described operations serves the agent better than a large set it has to sort through.
Some operations need extra care before an agent should reach them: the ones that take opaque binary input, change infrastructure, or cause effects that cannot be undone. Expose these to a general-purpose agent only with extra controls, such as a clear schema, a preview, narrow authorization, a confirmation step, or human approval. Do not expose deprecated operations at all.
Offer batch operations that do several related actions in one call. A batch operation SHOULD report one result per item, so the agent can see which items succeeded and which failed.
When a value can be found by a human-meaningful name, accept that name or offer a lookup operation. Do not force the agent to supply an opaque identifier it has no way to get.

Keep Responses Small A response uses up the agent's space budget on the turn it arrives, and on every later turn it stays in view. Asking for too much costs more each turn. A response that is too large is also a denial-of-service and cost problem (see ).

Return the fields that carry signal. Drop low-level details the agent will not use. When clients need different amounts of data, support field selection or a verbosity setting. shows this.
Support conditional requests on reads, for example an entity tag (an ETag value) with the matching conditional header. Then an agent that asks for the same data again can be told it has not changed. It does not have to receive and re-read an identical payload.
Set a bounded default page size on collections. Document how the agent can narrow the results before it fetches them. A large or open-ended default makes the agent spend its space budget on low-value items, and it can make the agent's next tool choice worse.
Use cursor-based pagination. A cursor is an opaque token the server returns that marks where the next page starts. Return a ready-to-use continuation, either that cursor or a complete next_page link, so the agent does not have to compute offsets or build URLs. It does both of those unreliably.
Give collections a stable, documented order. Results that change order between calls confuse the agent.
Make values self-describing when the type alone is unclear. For example, return a money amount together with its currency, not as a bare number. Return a human-readable label next to an identifier where you can.
Include links to the operations that are valid next steps on a returned resource. Then the agent can see the available moves and does not have to guess them.

Make Errors Recoverable An error response is not the end of the story. It is the input to the agent's next decision. An error that is unstructured or unclear leads the agent to retry something that cannot work, or to give up on something a retry would have fixed. Use a structured, machine-readable error format such as Problem Details for HTTP APIs , which is a standard JSON shape for errors. Include a stable error code that is separate from the HTTP status code. Problem Details lets you add your own fields. For an agent-facing API, say plainly whether a failed call can be retried, using a field. Do not leave the agent to read this off the status code. Report every validation problem at once, with the field name for each one. A small, consistent set of fields across the API does the job: a boolean retryable, an optional retry-delay hint, and a list of field errors. With these the agent can decide to retry, to change the request, or to stop, without guessing. The field names here are only examples. This document does not register them. That would be the job of a later specification (see ). Here is a retryable rate-limit error:

Error response with an explicit retry indication A validation error sets retryable to false and lists the fields that were wrong. That tells the agent to change the request before it sends it again:

Validation error that should not be retried unchanged

Make Writes Safe to Repeat, Preview, and Undo Agents retry after timeouts, after unclear results, and as their plan changes. So safe writing for agents covers more than one idea. It covers repeating a write safely, previewing it, confirming it, cancelling it, and undoing it. As a rule, prefer writes that can be previewed, cancelled, and reversed over writes that take effect at once and cannot be undone.

Support a client-supplied idempotency key on operations that have side effects. Idempotency means that sending the same request twice has the same effect as sending it once. With a key, a retry that carries the same key returns the first result and does not run the operation again. The Idempotency-Key HTTP header field is one way to do this. Document the mechanism, the length of time a key is honored, and the scope in which a key must be unique. These decide whether a retry is safe.
Tell a settled result apart from a temporary failure in what you store for a key. A settled result is safe to return again. A temporary server error is not, and returning it again as if it were final is a known trap. Reject a request that reuses a key with a clearly different body. Use a conflict status.
For operations with large or irreversible effects, offer a preview, or dry-run, mode. It reports what the operation would do without doing it. In a resource-oriented API a request header (for example, X-Dry-Run: true) or a query parameter (for example, ?dry_run=true) works well. The response lists the predicted changes. shows this.
Mark high-risk operations with explicit metadata, for example a flag that the operation cannot be undone or that it needs confirmation. You can also require a separate confirmation step outside the request. The point is that the signal is machine-readable, not buried in warning text.

Signal When to Slow Down Agent traffic can come in bursts. It often lacks the natural pacing of a person at a keyboard. Signal clearly when the client should slow down, so a well-behaved agent can hold back on its own.

When you reject or delay a request for rate limiting or a temporary outage, include a retry-delay hint such as the Retry-After header field. A client should follow it ahead of its own backoff timer.
You MAY measure limits in units that fit agent traffic, not a flat request count. One agent task can turn into many calls and large responses. Counting by response size, or by a cost unit you define, can match real load better. Document the unit so the agent can track what it has left.

Long-Running Operations An operation that takes more than a few seconds does not fit a single blocking request. The connection may not stay open. An agent that waits for it cannot do anything else in the meantime.

Acknowledge a long-running operation at once with the right HTTP status (often 202 Accepted). Return an identifier for the operation and a link to a status resource the client can check.
Have the status resource report a clear state, and on completion point to the finished resource. Include a retry-delay hint and say how often to poll. Allow the client to cancel an operation that has not finished.
When the client can receive callbacks, you MAY send a notification when the operation finishes, in place of polling. Authenticate the notification, protect it against replay, and give it an identifier the receiver can use to drop duplicate deliveries.
When an operation produces output piece by piece, you MAY stream the partial output, for example over server-sent events. Then the agent can start acting on early results before the operation finishes.

Changing the API Over Time The agent's behavior comes from the API description. So a change to the description is a change to the agent's tools. Manage change carefully.

Prefer changes that keep working for existing clients, such as adding an optional parameter or a new operation. Save a new version for a change that would break an existing client, such as removing a field, changing a type, or making an optional parameter required.
Do not change what an operation means while keeping its identifier the same. An agent that learned the operation once will use it again and expect the old behavior.
Signal deprecation in machine-readable form, for example the Deprecation and Sunset header fields and a link to the replacement. Show it in the API description too. Release notes on their own do not reach the agent.
Detect breaking changes automatically by comparing each new version of the API description against the last one.
Choose one way to express versions and use it across the whole API. Mixing several versioning styles makes the API harder for the agent and for people to follow.

Making the API Easy to Discover An agent does best with one machine-readable, low-noise place to learn what the API can do. A complete API description is the main artifact for this. Some providers also publish a short, plain-text index of their documentation aimed at models. The llms.txt convention is one community example, and it is not a standard. Generate any such file from the same source as the human documentation, so the two do not drift apart. Keep it focused on parameters, the meaning of errors, and authentication.

Observability After an agent runs, an operator often needs to work out what it did. To let the agent's steps line up with the API's own logs, accept and pass along a standard correlation identifier. The W3C Trace Context traceparent header field is one such identifier, and a generic correlation header is another. Record it in the API's logs next to the identity that acted. This helps with debugging, with working out cost, and with the audit trail in .

How This Relates to Identity and Authorization This document does not say how an agent authenticates to an API, or how a user grants it authority. A single static credential that covers the whole API is a poor fit for an agent. If it leaks, it reaches everything. The direction the field is taking is to issue narrow, short-lived credentials that can be revoked, and to record delegation openly. The user the agent acts for is recorded separately from the agent. That keeps actions traceable. Authors building for agents should follow the relevant work, including OAuth 2.0 Token Exchange , protected resource metadata , resource indicators , dynamic client registration , on-behalf-of authorization for agents , and the wider analysis in .

Security Considerations Building an API for agents changes its security picture. The changes connect to the points above. Injected instructions in text. The agent treats the text it receives as input to its reasoning. So both operation descriptions and response bodies can carry instructions that an attacker planted. This is called indirect prompt injection. Keep trusted control fields apart from untrusted content. Mark natural-language content that came from users or third parties as data, not as instruction. One way is to put it in its own named field and keep a note of where it came from. Do not put externally supplied text into operation descriptions or API documentation unless it is clearly marked as untrusted. Enforce access decisions at the API. Do not rely on instructions in the agent's prompt, because injected text can override them. Running out of room and money. An API can return a payload far larger than expected, and so can a downstream system that has been compromised. A huge payload can fill the agent's space budget and make the task fail. Where use is billed, it can also run up a large, unexpected cost. People often treat the limits on response size and the cursor-based pagination in as performance tuning. For an agent they are also a security control. Enforce them on the server. Do not count on the client to ask for less. Unsafe retries. The idempotency, preview, and undo guidance in is also a security matter. Agents retry more than clients driven by a person, and an unsafe retry can repeat an effect that changes state. Apply least privilege to operations that have real side effects. Keep an audit log that records who acted and under what delegation. Consider requiring a confirmation step for high-risk actions. Wider reach when access is broad. The discovery artifacts in make the API easier for any automated client to read, including a hostile one. They do not replace authorization or rate limiting. Authorization is out of scope here. Still, broad access together with easy discovery widens how much a tricked or compromised agent can reach. Scope access narrowly to the task. That limits the damage an injected instruction can do.

Agent-Friendly Profile Summary An HTTP API meant for agents can be called agent-friendly, in the sense of this document, when:

operation identifiers are stable and reveal intent;
descriptions say when to use and when not to use each operation, and state side effects;
input objects are strictly typed, use fixed value sets, and reject unknown properties;
responses are small by default, with field selection or verbosity controls;
pagination uses cursors and returns a ready-to-use continuation;
reads support conditional requests, so unchanged data need not be sent again;
collections have a stable, documented order;
errors are structured, carry stable codes, and say whether a retry is safe;
state-changing operations support idempotency, with a documented window and scope;
high-risk writes can be previewed, cancelled, or confirmed, and are marked as such;
rate limits, retry delays, and polling advice are machine-readable;
deprecation and replacement operations are machine-readable and shown in the description;
a correlation identifier is accepted, passed along, and logged;
untrusted returned content is kept apart from trusted control fields and marked as data.

IANA Considerations This document has no IANA actions. It does not define any protocol element, header field, media type, or registry entry. The field names used in the error examples in are only examples. If such Problem Details fields are ever standardized, registering them would be the job of that later specification.

References Normative References Key words for use in RFCs to Indicate Requirement Levels Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words HTTP Semantics Problem Details for HTTP APIs Informative References OAuth 2.0 Dynamic Client Registration Protocol OAuth 2.0 Token Exchange Resource Indicators for OAuth 2.0 OAuth 2.0 Protected Resource Metadata OAuth 2.0 Extension: On-Behalf-Of User Authorization for AI Agents IETF Work in Progress, Internet-Draft, draft-oauth-ai-agents-on-behalf-of-user AI Agent Authentication and Authorization Work in Progress, Internet-Draft, draft-klrc-aiagent-auth The Idempotency-Key HTTP Header Field IETF HTTP API Working Group Work in Progress, Internet-Draft, draft-ietf-httpapi-idempotency-key-header Model Context Protocol Specification Model Context Protocol OpenAPI Specification OpenAPI Initiative The /llms.txt file Trace Context W3C W3C Recommendation

Example: MCP Tools for a Project Management System This appendix applies the profile to a Model Context Protocol server for a project management system. The system has projects, tasks, and assignees. Each tool uses a strict JSON Schema, so the model can find and call it without inventing parameters. Every input schema sets additionalProperties to false, uses a fixed value set for closed choices, marks the required fields, and bounds the size of any collection. An agent is often told a project by name, not by identifier. A lookup tool turns the name into an identifier the other tools need. This follows the rule in not to demand an identifier the agent cannot get: A creation tool keeps a strict input schema. Its description, kept short in the example, should also say that the operation notifies the assignee, point the agent to find_projects when it has only a name, and explain that reusing the idempotency_key returns the original task instead of making a duplicate: A search tool bounds its result size and returns an opaque cursor, so the agent never computes an offset. The status filter lets it narrow the results before it fetches them: A result is bounded. It carries human-readable labels and a next action, and it gives back a cursor for the following page: To read the next page, the agent calls the same tool again and passes the cursor straight back. It does no arithmetic: A batch tool creates several tasks in one call. It bounds the array size, and its result reports one outcome per item, so the agent can see which tasks were created and which were rejected: The batch result shows the per-item outcomes. One task was created and one was rejected, and the agent can tell exactly which is which: A state-changing tool uses a fixed value set for the closed list of states and accepts an idempotency key. Its description should note the side effect of notifying the assignee: Each choice maps back to the profile. The intent-revealing names and the use-when guidance reduce wrong-operation mistakes (). The fixed value sets and additionalProperties: false stop the agent from inventing values and parameters. The bounded limit and the opaque cursor follow . The idempotency key and the stated side effects follow . The batch result follows the partial-failure rule in .

Example: A Weak and a Strong Operation Description An operation described only well enough for someone who already knows the system leaves the agent to guess its scope and its side effects: The same operation, described for an agent, names its exact purpose, says when not to use it, and states the side effect: Use this only to change an existing user's email address. Do not use it for other profile fields. Use updateUserProfile for those. Side effect: sends a verification email. The new address stays unverified until the user confirms it. ]]>

Example: A Small Response with Field Selection The request selects three fields and a bounded page, so the response stays small. The response gives back a ready-to-use link for the next page, so the agent does no offset arithmetic:

Example: Previewing a Destructive Write A dry-run request reports what an operation would do, without doing it. The agent can check the predicted effects and decide whether to go ahead: Sending the same request without the X-Dry-Run header performs the operation. The predicted-effects shape is the same, so the agent can use it to decide whether to go ahead or to ask for confirmation first.

Acknowledgments This document gathers considerations discussed across the API design and AI agent engineering communities. The author thanks the authors of the referenced specifications and drafts, whose work covers the identity and authorization topics this document leaves to them, and the reviewers whose feedback shaped the profile structure, the worked examples, and the treatment of context exhaustion and observability.