The Signal
AI-specific crawling protocols remain effectively unused across the vast majority of websites. With 97% of llms.txt files receiving zero requests, the current attempt to establish a standard “opt-in” for AI data usage is failing to gain traction.
What Happened
Ahrefs analyzed 137,000 domains to evaluate the current state of llms.txt implementation. The data confirms that llms.txt, despite being proposed as an industry standard for AI indexing, remains largely ignored by both site owners and AI crawlers. AI-specific retrieval bots account for a negligible 1% of total request volume to these files, indicating a breakdown in the expected handshake between publishers and model developers.
Why It Matters
First-order: The llms.txt standard currently lacks the “teeth” or utility to justify the technical overhead for most site operators. If your team is prioritizing this as a primary defense or engagement layer for AI traffic, you are likely misallocating engineering resources.
Second-order: This mirrors the early days of robots.txt, where lack of universal bot compliance led to a period of chaotic scraping. Until major LLM providers (OpenAI, Anthropic, Google) commit to honoring llms.txt files, or are incentivized to do so, the files remain a theoretical exercise rather than a functional governance tool.
Third-order: The failure of a voluntary standard will likely accelerate the push for legal and regulatory frameworks for web data access. As publishers realize that “polite” requests are ignored, the industry will shift from technical implementation to contractual and legal barriers to keep training data private.
The Numbers
- 97% of llms.txt files receive zero requests (Source: Ahrefs)
- 1% of total site requests come from AI retrieval bots (Source: Ahrefs)
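To see where your own site falls relative to these figures, you can count llms.txt requests in your server logs. The following is a minimal sketch, not Ahrefs' methodology: it assumes logs in the common "combined" access-log format, and the function name `summarize_llms_txt_hits` and the list of user-agent markers are illustrative assumptions that you should adjust to the bots you care about.

```python
import re
from collections import Counter

# Assumption: user-agent substrings used to flag AI crawlers.
# Illustrative only, not an exhaustive or authoritative list.
AI_BOT_MARKERS = ("GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot")

# Assumption: logs use the "combined" format, i.e.
# ... "METHOD /path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def summarize_llms_txt_hits(lines):
    """Count requests to /llms.txt and how many came from flagged AI bots."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that do not look like combined-format entries
        # Drop any query string before comparing the path.
        path = match.group("path").split("?", 1)[0]
        if path != "/llms.txt":
            continue
        counts["total"] += 1
        agent = match.group("agent").lower()
        if any(marker.lower() in agent for marker in AI_BOT_MARKERS):
            counts["ai_bot"] += 1
    return {"total": counts["total"], "ai_bot": counts["ai_bot"]}
```

Passing an open log file works because the function only iterates over lines, for example `summarize_llms_txt_hits(open("access.log"))`, where `access.log` is a placeholder path. A total of zero over a long window would put your file in the 97% group described above.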
What To Watch
- Platform Mandates: Watch for major model builders to mandate llms.txt in their terms of service to avoid future litigation, which would instantly force 100% adoption.
- Tooling Consolidation: Expect CMS plugins (WordPress, Shopify) to automate llms.txt generation; adoption will only scale once it is a “set and forget” feature rather than a manual task.
- Traffic Shift: Monitor if AI-referral traffic increases as these models move from “training” (ingestion) to “reasoning” (live agentic browsing), which may force better bot compliance with site directives.