As search engines gradually evolve into “answer engines,” users can now obtain information without ever visiting the original website. For publishers, news outlets, and creators, this shift translates into declining traffic, shrinking revenue, and a growing loss of control over how their content is used. In response to this challenge, Cloudflare has introduced a new Content Signals Policy—a framework that allows website owners and creators to clearly express their preferences regarding how AI companies may access and use their material. Can AI use your content? How can it use it? Can it train models with it? This policy aims to answer those questions.
In essence, Cloudflare will help users update their robots.txt files, the small text files that tell web crawlers which areas of a website can and cannot be accessed. Traditionally, robots.txt governed only access, not how retrieved data could be used. Under the new Content Signals Policy, websites can now communicate their wishes to AI systems in a machine-readable format: "yes" to allow a given use, "no" to deny it, and no signal to indicate no explicit preference.
The policy also distinguishes between different AI use cases: search (surfacing content in user-facing results), input (feeding content into an AI system at query time), and training (using content to train or fine-tune models). A website owner can, for example, allow content to appear in user-facing results while prohibiting its use for model training, or opt out of all AI access entirely.
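In Cloudflare's published format, these preferences are expressed as a Content-Signal line inside robots.txt. A minimal sketch of such a file (the directive name and signal keys follow Cloudflare's published examples; the specific combination shown is illustrative):

```txt
# Allow search indexing and user-facing AI answers,
# but refuse use of this content for model training.
User-Agent: *
Content-Signal: search=yes, ai-input=yes, ai-train=no
Allow: /
```

Crawlers that do not understand the Content-Signal line simply ignore it, so the file remains a valid robots.txt for every existing bot.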
Cloudflare co-founder and CEO Matthew Prince stated, “The internet cannot afford to wait for a solution to appear. Creators have the right to decide who uses their content, and there must be a clear way to communicate that choice to AI companies.” He added that the enhanced robots.txt is more than a technical update—it is “a way to send a definitive signal to the industry: creators’ intentions must not be ignored.”
For website operators, the system is designed to be straightforward. Consider a news outlet that once attracted hundreds of thousands of daily clicks but now sees traffic shrink as AI tools answer readers' queries directly. With the Content Signals Policy, that outlet can state in its robots.txt file that AI may not train on its content. Even if crawlers still retrieve the data, the policy provides a clear, and potentially legally enforceable, framework governing how that data may be used.
Currently, more than 3.8 million domains use Cloudflare’s robots.txt management service to declare that their content should not be used for AI training. With this policy, users can now set additional preferences and issue explicit instructions to all forms of automated access, including AI crawlers. Cloudflare has also released tools and sample templates for those who wish to customize their robots.txt files.
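For a crawler operator, honoring these signals starts with reading them. A minimal Python sketch of a parser (the `Content-Signal: key=value, ...` line format is an assumption based on Cloudflare's published examples; this is not an official tool):

```python
def parse_content_signals(robots_txt: str) -> dict:
    """Extract content-signal preferences from a robots.txt body.

    Returns a mapping like {"search": "yes", "ai-train": "no"}.
    Lines that are not Content-Signal directives are ignored.
    """
    signals = {}
    for line in robots_txt.splitlines():
        line = line.strip()
        # Expected shape: "Content-Signal: search=yes, ai-train=no"
        if line.lower().startswith("content-signal:"):
            _, _, value = line.partition(":")
            for pair in value.split(","):
                key, sep, pref = pair.partition("=")
                if sep:  # skip malformed pairs with no "="
                    signals[key.strip().lower()] = pref.strip().lower()
    return signals


example = """\
User-Agent: *
Content-Signal: search=yes, ai-input=yes, ai-train=no
Allow: /
"""

print(parse_content_signals(example))
# {'search': 'yes', 'ai-input': 'yes', 'ai-train': 'no'}
```

A well-behaved AI crawler would check the `ai-train` value before adding a page to a training corpus, falling back to "no explicit preference" when the key is absent.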
Industry response has been largely positive. Danielle Coffey, President of the News Media Alliance, said the policy empowers publishers to reclaim control of their intellectual property and continue funding quality journalism. Quora and Reddit praised Cloudflare’s efforts to establish transparency and respect between AI companies and content creators, while RSL Collective and Stack Overflow noted that the policy not only safeguards creators’ rights but also promotes a sustainable and equitable web ecosystem where creators and platforms can coexist in the age of AI.
For individual creators and small websites, the policy offers equally practical benefits. A tech blogger, for instance, can specify that “AI may display summaries but not train on full text,” while a handmade craft store might block AI from scraping product images to train visual recognition models. These measures protect intellectual property while giving creators greater confidence to keep producing original content.
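The blogger's and craft store's preferences above could be combined in a single robots.txt (directive syntax follows Cloudflare's published examples; the `/images/` path is hypothetical):

```txt
User-Agent: *
# Permit search and user-facing AI summaries, refuse model training
Content-Signal: search=yes, ai-input=yes, ai-train=no
# Keep product images out of crawlers' reach entirely
Disallow: /images/
```

Note the two mechanisms compose: Disallow blocks access outright, while Content-Signal states how content that is accessed may be used.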
Cloudflare emphasized that all customers using its robots.txt management service will be automatically updated to include the new policy language. Those wishing to implement custom configurations can access comprehensive tools and examples to ensure their “content usage rights” remain firmly in their own hands.
As AI becomes ever more intertwined with the fabric of the internet, Cloudflare’s Content Signals Policy represents a clear, machine-readable language of consent—a vital step toward establishing balance and mutual respect between creators, platforms, and AI developers.