Autonomy

Can agents recover, adapt, and keep going without human intervention?

An agent that can find, understand, and execute against your API still needs to operate reliably over time without a human stepping in to fix things. Autonomy covers the signals, patterns, and design decisions that let agents detect problems, recover from errors, manage their own context, and maintain stable long-running operations.

Key principles

  • Return structured error responses with remediation hints agents can act on
  • Support idempotent operations so agents can safely retry
  • Provide listing, metadata, and status endpoints for state introspection
  • Offer bulk operations to reduce repetitive calls and context overhead
  • Maintain stable, consistent interfaces agents can depend on

Repairability

When something goes wrong, an agent needs enough information to decide what to do next. Structured error responses allow agents to programmatically determine whether to retry, adjust parameters, or escalate to a human.

Error response design

  • Machine-readable error codes that agents can switch on programmatically
  • Remediation hints that describe what the agent can do differently
  • Retry semantics indicating whether a retry is likely to succeed, and after how long

Better error messages

A 422 response that says "the email field must be a valid email address" is far more useful to an agent than one that says "validation error."
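As a minimal sketch of how the three elements above might fit together, the following shows one possible error shape and the decision an agent could derive from it. The field names (code, hint, retryable, retry_after_seconds) and the next_action helper are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    """Hypothetical structured error body; all field names are assumptions."""
    code: str                                    # machine-readable, e.g. "invalid_email"
    message: str                                 # human-readable description
    hint: Optional[str] = None                   # what the caller can do differently
    retryable: bool = False                      # whether a retry is likely to succeed
    retry_after_seconds: Optional[float] = None  # how long to wait before retrying


def next_action(error: ApiError) -> str:
    """Decide whether to retry, adjust the request, or escalate to a human."""
    if error.retryable:
        return "retry"
    if error.hint is not None:
        return "adjust"
    return "escalate"


validation_error = ApiError(
    code="invalid_email",
    message="The email field must be a valid email address.",
    hint="Resend the request with a well-formed address in the email field.",
)
print(next_action(validation_error))
```

An error with no hint and no retry signal leaves the agent with nothing to act on, so in this sketch it falls through to escalation.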

Safe retries

  • Idempotent operations let agents retry without causing duplicate side effects
  • Partial success responses for batch operations tell agents exactly which items succeeded and which failed
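One common way to make a create operation safe to retry is an idempotency key supplied by the caller. The sketch below uses an in-memory store to illustrate the idea, together with a batch helper that reports per-item outcomes. IdempotentStore, create_batch, and the response fields are hypothetical names chosen for illustration, and the email check is deliberately simplistic.

```python
from typing import Any, Dict, List


class IdempotentStore:
    """In-memory stand-in for a server that honors idempotency keys."""

    def __init__(self) -> None:
        self._results: Dict[str, Dict[str, Any]] = {}
        self.created: List[Dict[str, Any]] = []

    def create(self, idempotency_key: str, payload: Dict[str, Any]) -> Dict[str, Any]:
        # A repeated key returns the original result instead of creating a duplicate.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        record = {"id": len(self.created) + 1, **payload}
        self.created.append(record)
        self._results[idempotency_key] = record
        return record


def create_batch(store: IdempotentStore, items: List[Dict[str, str]]) -> Dict[str, Any]:
    """Process every item and report exactly which ones succeeded or failed."""
    succeeded: List[Dict[str, Any]] = []
    failed: List[Dict[str, Any]] = []
    for index, item in enumerate(items):
        if "@" not in item.get("email", ""):
            failed.append({
                "index": index,
                "code": "invalid_email",
                "hint": "Provide a valid email address for this item.",
            })
            continue
        record = store.create(item["key"], {"email": item["email"]})
        succeeded.append({"index": index, "record": record})
    return {"succeeded": succeeded, "failed": failed}
```

With a response like this, an agent can fix and resubmit only the failed items, and resubmitting an already successful item with the same key has no additional effect.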

State introspection

Agents need to understand the current state of the system they are operating on.

Without state introspection, agents must maintain their own model of system state, which inevitably drifts from reality.

  • Resource listing endpoints - enumerate what exists
  • Metadata endpoints - details about individual resources
  • Usage and quota inspection - remaining capacity
  • Status endpoints - health and availability of the system
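To illustrate why a listing endpoint matters, here is a small sketch of an agent comparing the resources it believes exist against what a listing call reports. The detect_drift function and the resource IDs are hypothetical; the listed set stands in for the result of a real listing request.

```python
from typing import Dict, List, Set


def detect_drift(assumed_ids: Set[str], listed_ids: Set[str]) -> Dict[str, List[str]]:
    """Compare the agent's internal model with the server's reported state."""
    return {
        # Resources the agent thinks exist but the server does not report.
        "missing": sorted(assumed_ids - listed_ids),
        # Resources the server reports that the agent did not know about.
        "unexpected": sorted(listed_ids - assumed_ids),
    }


# Assumed example: the agent created three resources, but one was deleted
# out of band and another was added by a different client.
assumed = {"res_1", "res_2", "res_3"}
listed = {"res_1", "res_3", "res_4"}
print(detect_drift(assumed, listed))
```

Without the listing call, the agent has no way to notice either discrepancy and would keep acting on a stale model.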

Context efficiency

Agents operate within context windows and reasoning budgets. Every API call that returns a large response consumes context that could be used for reasoning.

  • Bulk operations - accomplish in one call what would otherwise take ten
  • Selective field retrieval - let agents request only the data they need
  • Pagination with sensible defaults - avoid overwhelming responses
  • Streaming responses - handle large datasets incrementally

Consider what the agent needs to remember across reasoning loops and minimize the information it must carry forward.
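Selective field retrieval and pagination can be sketched together as follows. This is an in-memory illustration under assumed conventions: the fields, cursor, and limit parameters, the default page size of 20, and the next_cursor field are all hypothetical choices, not a required design.

```python
from typing import Any, Dict, List, Optional


def select_fields(resource: Dict[str, Any], fields: List[str]) -> Dict[str, Any]:
    """Keep only the requested fields, ignoring names the resource lacks."""
    return {name: resource[name] for name in fields if name in resource}


def list_page(
    resources: List[Dict[str, Any]],
    fields: Optional[List[str]] = None,
    cursor: int = 0,
    limit: int = 20,  # assumed default page size
) -> Dict[str, Any]:
    """Return one page of results plus a cursor for the next page, if any."""
    page = resources[cursor:cursor + limit]
    if fields:
        page = [select_fields(resource, fields) for resource in page]
    has_more = cursor + limit < len(resources)
    return {"items": page, "next_cursor": cursor + limit if has_more else None}


# 45 resources, each with a long description the agent may not need.
resources = [{"id": i, "name": f"r{i}", "description": "x" * 500} for i in range(45)]
first_page = list_page(resources, fields=["id", "name"])
```

Here the agent receives 20 compact items rather than 45 full records, and carries forward only a single cursor value between calls.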

Reliability

Agents depend on consistent behavior. If an endpoint returns a different response shape depending on conditions the agent cannot predict, the agent cannot write reliable code against your API.

  • Consistent schemas - same structure every time
  • Predictable response structures - no surprises in edge cases
  • Low latency - keep agents responsive
  • Predictable cost structures - let agents budget their operations
  • Batching and streaming - options for managing performance tradeoffs
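A minimal sketch of what "same structure every time" can mean in practice: a list response that uses an identical envelope whether it holds zero, one, or many results. The envelope function and its keys (items, count, next_cursor) are assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional


def envelope(items: List[Dict[str, Any]], next_cursor: Optional[int] = None) -> Dict[str, Any]:
    """Wrap results in the same structure regardless of how many there are."""
    return {"items": list(items), "count": len(items), "next_cursor": next_cursor}


empty = envelope([])
single = envelope([{"id": 1}])
many = envelope([{"id": 1}, {"id": 2}], next_cursor=2)

# The agent can use the same parsing code for every case.
print(empty.keys() == single.keys() == many.keys())
```

The alternative, such as returning a bare object for one result, an array for many, and null for none, forces the agent to branch on shapes it cannot predict in advance.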