RFC 9457: Structured Errors Cut AI Agent Costs 98%
ai-engineering

Cloudflare now returns RFC 9457 structured errors to AI agents, replacing HTML with machine-readable JSON. Token costs drop 98% with zero config.

Hamza Abdagic

Publisher

March 13, 2026

5 min read

The Hidden Tax on AI Agents

Every time an AI agent encounters an error on the web, it pays a tax that most teams never measure. A standard Cloudflare error page — the kind returned for rate limits, DNS failures, or access denials — contains roughly 18,000 tokens of HTML when processed by a language model. The agent parses navigation elements, footer links, styling markup, and decorative content to extract a single piece of information: what went wrong and whether to retry.

Multiply this across the hundreds or thousands of HTTP requests an autonomous agent can make per session, and the cost adds up: at roughly 18,000 tokens per error page, about 56 failed requests consume a million input tokens. At current LLM pricing, error handling alone can take a noticeable share of an agent's per-task budget. The problem is not that errors occur; it is that error responses were designed for humans reading browsers, not machines making decisions.

Cloudflare's implementation of RFC 9457-compliant error responses addresses this directly, and the implications extend well beyond cost savings.

How RFC 9457 Changes Error Handling

RFC 9457 ("Problem Details for HTTP APIs", which obsoletes RFC 7807) defines a standard format for machine-readable HTTP error responses, typically JSON served as application/problem+json. When an AI agent sends a request with Accept: application/json or Accept: application/problem+json, Cloudflare now returns a structured payload instead of an HTML page. The response includes:

  • error_code and error_category — Machine-parseable classification of the failure type
  • retryable and retry_after — Explicit signals for backoff logic, eliminating guesswork
  • owner_action_required — Indicates whether the error requires human intervention or can be resolved programmatically
  • error_name — Human-readable description for logging and debugging
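As a rough sketch of how an agent might consume these fields, the Python below builds a request that asks for the structured format and parses a response body into a typed record. The field names come from the list above; the `EdgeError` class, the `build_request` and `parse_edge_error` helpers, and every sample value are hypothetical, and the real payload may carry additional members or different value types.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Optional

ACCEPT_STRUCTURED = "application/problem+json, application/json"


def build_request(url: str) -> urllib.request.Request:
    """Build (but do not send) a request that asks for structured errors."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_STRUCTURED})


@dataclass(frozen=True)
class EdgeError:
    """Typed view of the structured error fields listed above."""
    error_code: Optional[int]
    error_name: Optional[str]
    error_category: Optional[str]
    retryable: bool
    retry_after: Optional[float]  # assumed to be a number of seconds
    owner_action_required: bool


def parse_edge_error(body: str) -> EdgeError:
    """Parse a JSON error body, tolerating missing fields."""
    data = json.loads(body)
    retry_after = data.get("retry_after")
    return EdgeError(
        error_code=data.get("error_code"),
        error_name=data.get("error_name"),
        error_category=data.get("error_category"),
        retryable=bool(data.get("retryable", False)),
        retry_after=float(retry_after) if retry_after is not None else None,
        owner_action_required=bool(data.get("owner_action_required", False)),
    )


# Hypothetical sample payload; names and values are illustrative only.
SAMPLE_BODY = json.dumps({
    "error_code": 1015,
    "error_name": "rate_limited",
    "error_category": "rate_limit",
    "retryable": True,
    "retry_after": 30,
    "owner_action_required": False,
})

if __name__ == "__main__":
    print(parse_edge_error(SAMPLE_BODY))
```

Missing fields fall back to conservative defaults (not retryable, no owner action flagged), so a partial payload does not trigger retries by accident.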

The token reduction is dramatic: over 98 percent fewer tokens than the equivalent HTML error page. An 18,000-token HTML page becomes a structured response of fewer than 100 tokens.
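The arithmetic behind that figure can be checked in a few lines. This sketch uses only the two numbers quoted above (18,000 and 100 tokens); the function names and the 50-errors-per-session figure are hypothetical and chosen purely for illustration.

```python
HTML_TOKENS = 18_000      # per-error figure quoted above
STRUCTURED_TOKENS = 100   # upper bound quoted above


def token_reduction(html_tokens: int, structured_tokens: int) -> float:
    """Fraction of tokens saved by replacing the HTML page."""
    if html_tokens <= 0:
        raise ValueError("html_tokens must be positive")
    return 1.0 - structured_tokens / html_tokens


def tokens_saved(errors: int,
                 html_tokens: int = HTML_TOKENS,
                 structured_tokens: int = STRUCTURED_TOKENS) -> int:
    """Total tokens saved across a number of error responses."""
    return errors * (html_tokens - structured_tokens)


if __name__ == "__main__":
    print(f"{token_reduction(HTML_TOKENS, STRUCTURED_TOKENS):.1%}")  # 99.4%
    # 50 errors per session is a hypothetical workload, not a measured one.
    print(tokens_saved(50))  # 895000
```

With a 100-token response the reduction is about 99.4 percent, which is consistent with the "over 98 percent" claim.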

Why This Matters Beyond Cost

The cost savings are the most cited benefit, but the architectural implications are more important. Structured error responses enable fundamentally better agent behavior:

Deterministic retry logic. When an error response explicitly states retryable: true with a retry_after value, the agent does not need to guess whether retrying is appropriate. This eliminates a category of agent failures where models incorrectly interpret HTML error pages and either retry when they should not or abandon requests that would succeed on retry.
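A minimal sketch of such a retry decision follows. It assumes the error has already been parsed into a dictionary and that retry_after is a number of seconds; the `retry_delay` helper, the attempt limit, and the delay cap are hypothetical choices, not part of Cloudflare's format.

```python
from typing import Mapping, Optional

MAX_DELAY_SECONDS = 300.0  # hypothetical cap so one error cannot stall a task


def retry_delay(error: Mapping[str, object],
                attempt: int,
                max_attempts: int = 3) -> Optional[float]:
    """Return seconds to wait before retrying, or None to stop retrying.

    `attempt` is zero-based: 0 means the first request has just failed.
    """
    if attempt >= max_attempts:
        return None
    if error.get("owner_action_required"):
        return None  # a human has to fix this; retrying cannot help
    if not error.get("retryable"):
        return None
    retry_after = error.get("retry_after")
    is_number = isinstance(retry_after, (int, float)) and not isinstance(retry_after, bool)
    if is_number and retry_after >= 0:
        return min(float(retry_after), MAX_DELAY_SECONDS)
    # No usable hint from the server: fall back to exponential backoff.
    return min(2.0 ** attempt, MAX_DELAY_SECONDS)


if __name__ == "__main__":
    print(retry_delay({"retryable": True, "retry_after": 30}, attempt=0))  # 30.0
    print(retry_delay({"retryable": False}, attempt=0))                    # None
```

The decision is plain branching on two fields, so no model call is needed to choose between retrying and giving up.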

Faster error classification. Instead of asking the language model to interpret an HTML page and determine the error type, the agent reads a structured field. This removes an inference call from the error handling path entirely, reducing both latency and cost.
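One way to picture this is a lookup table in place of a model call. In the sketch below the category strings, the `Action` enum, and the `classify` helper are all hypothetical; the actual error_category values should be taken from Cloudflare's documentation.

```python
from enum import Enum
from typing import Mapping


class Action(Enum):
    RETRY = "retry"
    ESCALATE = "escalate"    # hand off to a human or the site owner
    ABORT = "abort"
    ASK_MODEL = "ask_model"  # unknown case: fall back to the language model


# Hypothetical category names, used here only for illustration.
CATEGORY_ACTIONS = {
    "rate_limit": Action.RETRY,
    "dns": Action.ESCALATE,
    "access_denied": Action.ABORT,
}


def classify(error: Mapping[str, object]) -> Action:
    """Map a structured error to an action without an inference call."""
    if error.get("owner_action_required"):
        return Action.ESCALATE
    category = error.get("error_category")
    if isinstance(category, str):
        return CATEGORY_ACTIONS.get(category, Action.ASK_MODEL)
    return Action.ASK_MODEL


if __name__ == "__main__":
    print(classify({"error_category": "rate_limit"}))  # Action.RETRY
```

Only categories missing from the table reach the model, so the common cases cost a dictionary lookup.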

Reduced hallucination risk. When agents process large HTML error pages, the irrelevant content can influence subsequent reasoning. A structured 100-token response keeps the agent's context focused on the actual problem.

Implementation and Backward Compatibility

Cloudflare's implementation covers all 1xxx-class error paths — edge-side failures including DNS resolution issues, access denials, and rate limits. The same structured format will extend to Cloudflare-generated 4xx and 5xx errors next.

The deployment requires zero configuration from site owners. Browsers continue receiving HTML unless the client explicitly requests JSON or Markdown via the Accept header. This means existing web traffic is unaffected while agents automatically receive optimized responses.
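The negotiation described here can be approximated as follows. This is a simplified sketch, not Cloudflare's implementation: it assumes Markdown is requested as text/markdown, treats wildcards such as */* as "no explicit preference", and the `parse_accept` and `choose_error_format` helpers are hypothetical.

```python
from typing import List, Tuple

# Media types treated as an explicit request for a machine-readable error.
MACHINE_TYPES = ("application/problem+json", "application/json", "text/markdown")


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # an unparseable weight is treated as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so header order breaks ties between equal weights.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_error_format(accept_header: str) -> str:
    """Return the media type to use for the error body; HTML is the default."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "text/html":
            return "text/html"
        if media_type in MACHINE_TYPES:
            return media_type
    return "text/html"


if __name__ == "__main__":
    browser = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
    print(choose_error_format(browser))             # text/html
    print(choose_error_format("application/json"))  # application/json
```

Because HTML remains the fallback, a client that sends no Accept header or only */* sees the same page as before.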

For teams building AI agents that interact with web services, this sets a precedent. The pattern is simple and replicable:

  1. Send appropriate Accept headers. Agents should request application/json or application/problem+json to signal machine consumption.
  2. Implement RFC 9457 in your own APIs. If you operate services that AI agents consume, returning structured errors with retry guidance reduces the burden on every agent that interacts with your endpoints.
  3. Measure token consumption on error paths. Most teams optimize happy-path token usage and ignore error handling. With agents making hundreds of requests per task, error path efficiency compounds.
  4. Design machine-first, human-second. As AI agent traffic grows, APIs that serve structured responses by default and HTML as an enhancement will serve agents better than those designed exclusively for browser consumption.
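For step 2, a framework-agnostic sketch of an RFC 9457 response might look like the following. The type, title, status, and detail members and the application/problem+json media type come from the RFC; retryable and retry_after are added as extension members that mirror the fields described earlier. The `problem_response` helper and the example.com URI are hypothetical.

```python
import json
from typing import Dict, Optional, Tuple

PROBLEM_CONTENT_TYPE = "application/problem+json"


def problem_response(status: int,
                     title: str,
                     detail: str,
                     *,
                     type_uri: str = "about:blank",  # RFC 9457 default for "type"
                     retryable: bool = False,
                     retry_after: Optional[int] = None,
                     ) -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, body) for a problem details response."""
    body = {
        "type": type_uri,
        "title": title,
        "status": status,
        "detail": detail,
        "retryable": retryable,  # extension member, not defined by the RFC
    }
    headers = {"Content-Type": PROBLEM_CONTENT_TYPE}
    if retry_after is not None:
        body["retry_after"] = retry_after          # extension member (seconds)
        headers["Retry-After"] = str(retry_after)  # standard HTTP header
    return status, headers, json.dumps(body)


if __name__ == "__main__":
    status, headers, body = problem_response(
        429,
        "Too Many Requests",
        "Quota exceeded for this API key.",
        type_uri="https://api.example.com/problems/rate-limited",  # hypothetical
        retryable=True,
        retry_after=30,
    )
    print(status, headers)
    print(body)
```

Returning a plain tuple keeps the sketch independent of any web framework; the three values map directly onto most frameworks' response objects.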

RFC 9457 adoption by a provider at Cloudflare's scale signals that the web's error handling infrastructure is being rebuilt for an agent-driven future. Teams that adopt the standard now will build agents that are cheaper, faster, and more reliable than those still parsing HTML.

Tags

ai-agents, llm-ops, cloudflare, api-design, performance