Why Reddit Blocked Bing and Other Search Engines

Do Son July 31, 2024 3 minutes read

Reddit recently blocked every search engine except Google from crawling its site, leaving Google as the only one still able to index Reddit’s content. The reason is straightforward: Google pays Reddit $60 million a year for a content license, which also lets it use that content to train artificial intelligence models.

Other search engine operators, unwilling to pay, have been shut out. At first, Bing’s search director noted that since September 2023 Bing has given all websites crawl controls for managing Bing’s crawling activity.

However, the Bing director later confirmed that Reddit has indeed blocked Bing’s crawler along with other data scrapers. The block affects not only Bing’s ability to retrieve Reddit content but also other search engines that rely on Bing, such as DuckDuckGo.

As a result, users can no longer find Reddit content through Bing or DuckDuckGo; anyone looking for useful or recent Reddit posts and comments via a search engine has to switch to Google.
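One common way for a site to admit a single crawler and turn the rest away is a robots.txt file. The article does not say which mechanism Reddit actually uses, so the following is only a minimal sketch: the rules below are hypothetical and are not Reddit’s real robots.txt, the URL is a placeholder, and `can_crawl` is an illustrative helper built on Python’s standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules (NOT Reddit's actual robots.txt):
# one named crawler is allowed, every other crawler is disallowed.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URL used purely for illustration.
    url = "https://example.com/r/some-thread"
    for agent in ("Googlebot", "bingbot", "DuckDuckBot"):
        print(agent, can_crawl(agent, url))
```

Note that robots.txt only works if the crawler chooses to honor it; a site that wants to enforce a block against scrapers that ignore the file has to do so on the server side.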

According to Reddit CEO Steve Huffman, Microsoft has been acquiring Reddit data to train its artificial intelligence models and indexing Reddit’s content in Bing search “without telling us.”

Two other AI developers, Anthropic (creator of Claude) and Perplexity (developer of the AI search engine of the same name), have likewise been training their systems on Reddit data.

Huffman argues that all three companies treat everything on the internet as free for the taking. “We’ve had Microsoft, Anthropic, and Perplexity act as though all of the content on the internet is free for them to use,” he said. “That’s their real position.”

Huffman also said that blocking these companies has been a real hassle. In his view, the long-standing arrangement in which search engines take content from websites without paying for it is changing, because what sites get back in traffic for being scraped is increasingly unclear.

In the traditional model, search engines index a site’s content and show it in search results, sending visitors to the site, which the site can then turn into revenue. Now search engines also scrape that data to train models, which Huffman feels is no longer a fair exchange.

Reports also indicate that companies such as Microsoft have refused to negotiate content licensing with Reddit. Even with their search engines blocked, they are not willing to pay Reddit to lift the block or to license its data.

Via: The Verge

Do Son is the Founder and Editor of SecurityOnline.info. Working in cybersecurity since 2013, he reports on vulnerabilities, malware, and emerging threats, providing timely analysis to help organizations and individuals stay ahead of evolving risks.
