AI's New Attack Vector: How Real-Time Bots Are Straining Websites • Daily CyberSecurity

Ddos August 21, 2025 3 minutes read

The prominent U.S. CDN provider Fastly has released its Q2 2025 Threat Defense Report, which finds that AI-driven bots are reshaping web traffic patterns. The most significant risk comes not from crawlers that gather training data, but from the real-time fetches that AI platforms perform while answering user prompts.

According to the report, nearly 80% of AI-related traffic originates from crawlers harvesting training data. While this volume is substantial, the true threat lies in inference-phase scraping, where AI platforms, in the course of responding to user prompts, issue live queries across the internet to retrieve information.

At peak load, these real-time queries can hit a single website with up to 39,000 requests per minute (about 650 per second), far exceeding the roughly 1,000 requests per minute typically generated by training-data crawlers. The finished response may embed only a handful of links for user verification, even though the bot may have queried hundreds of sites to construct the answer.

If websites lack proper concurrency controls or defensive measures, such bursts of requests can mimic the effects of a Distributed Denial of Service (DDoS) attack, overwhelming servers and leading to congestion or outright outages.
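
One common form of such a defensive measure is a per-client rate limit. The sketch below is illustrative only and is not taken from the Fastly report: it assumes a token-bucket limiter, and the names `TokenBucket` and `simulate_burst`, along with the limits of 10 requests per second and a burst size of 20, are hypothetical choices. It replays one second of the reported 39,000-requests-per-minute peak against that limit.

```python
class TokenBucket:
    """Token bucket: permits short bursts but caps the sustained request rate."""

    def __init__(self, rate_per_sec: float, capacity: float, now: float = 0.0):
        self.rate_per_sec = rate_per_sec  # tokens added per second (assumed limit)
        self.capacity = capacity          # maximum burst size (assumed limit)
        self.tokens = capacity            # start with a full bucket
        self.updated = now

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (seconds) may proceed."""
        elapsed = max(0.0, now - self.updated)
        # Refill in proportion to elapsed time, never beyond the burst capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_sec)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # a server would typically answer HTTP 429 here


def simulate_burst(requests_per_minute, seconds, rate_per_sec, capacity):
    """Replay evenly spaced requests and count how many the bucket admits."""
    total = int(requests_per_minute / 60 * seconds)
    spacing = 60.0 / requests_per_minute  # seconds between requests
    bucket = TokenBucket(rate_per_sec, capacity)
    allowed = sum(bucket.allow(i * spacing) for i in range(total))
    return allowed, total - allowed


if __name__ == "__main__":
    allowed, rejected = simulate_burst(39_000, 1.0, rate_per_sec=10.0, capacity=20.0)
    print(f"allowed={allowed} rejected={rejected}")
```

Under these assumed limits, only a few dozen of the 650 requests in that second reach the origin server, and the rest are rejected cheaply. In practice the bucket would be keyed per client (for example by IP address or verified bot identity), which this sketch omits.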

As for traffic sources, the report notes that the overwhelming majority of AI crawler activity originates from Meta, Google, and OpenAI, which together account for 95% of observed crawler traffic: Meta at 52%, Google at 23%, and OpenAI at 20%.

In real-time inference scraping, however, OpenAI dominates: its ChatGPT-User and OAI-SearchBot agents are responsible for 98% of this traffic. Unlike training crawlers, these bots act as live agents that retrieve web content on behalf of user queries.
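
To make the training-versus-inference distinction concrete, here is a minimal sketch of how a site operator might label incoming requests by user agent. The bot names come from the article (GPTBot is identified in the report as OpenAI's training crawler); the function name `classify_ai_bot`, the substring-matching approach, and the sample user-agent strings are assumptions for illustration. User-agent headers can be spoofed, so a real deployment would likely need additional verification.

```python
# Bot names as given in the article; the grouping mirrors its description.
INFERENCE_FETCHERS = ("ChatGPT-User", "OAI-SearchBot")  # live, per-prompt fetches
TRAINING_CRAWLERS = ("GPTBot",)                          # training-data collection


def classify_ai_bot(user_agent: str) -> str:
    """Label a User-Agent header as 'inference', 'training', or 'other'.

    Uses a case-insensitive substring match, which is an assumption of this
    sketch rather than a documented matching rule.
    """
    ua = user_agent.lower()
    if any(name.lower() in ua for name in INFERENCE_FETCHERS):
        return "inference"
    if any(name.lower() in ua for name in TRAINING_CRAWLERS):
        return "training"
    return "other"


if __name__ == "__main__":
    # Simplified, hypothetical header values for demonstration only.
    samples = [
        "Mozilla/5.0 (compatible; ChatGPT-User/1.0)",
        "Mozilla/5.0 (compatible; GPTBot/1.0)",
        "Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0",
    ]
    for ua in samples:
        print(classify_ai_bot(ua), "<-", ua)
```

Separating the two classes lets an operator apply different policies, for instance a stricter burst limit for inference fetchers, whose traffic the report describes as far spikier.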

Regionally, North American websites see 90% of AI traffic from training crawlers, while in Europe, the Middle East, and Africa, the balance tilts the other way, with 59% stemming from real-time inference queries. The Asia-Pacific and Latin American regions remain dominated by training-data crawlers.

In terms of content sourcing, OpenAI’s GPTBot (dedicated to training data collection) has the widest reach, with coverage extending to 95% of unique websites in the dataset. OpenAI’s strategy appears to favor maximum breadth, crawling as many sites as possible, while Meta pursues depth, indexing fewer domains but attempting to exhaustively capture their content.

Tags: AI Bots cybersecurity ddos Fastly large language models scraping web traffic