• About WordPress
    • WordPress.org
    • Documentation
    • Learn WordPress
    • Support
    • Feedback
Skip to content
May 25, 2026
  • Linkedin
  • Twitter
  • Facebook
  • Youtube

Daily CyberSecurity

Zero-hour alerts. Unmatched analysis.

Primary Menu
  • Home
  • CVE Watchtower
  • Cyber Criminals
  • Data Leak
  • Linux
  • Malware
  • Vulnerability
  • Submit Press Release
  • Vulnerability Report
Light/Dark Button
  • Home
  • Technique
  • The robots.txt file explained
  • Technique

The robots.txt file explained

Ddos May 24, 2017 3 minutes read

What is a robots.txt file?

Search engine through a program robot (also known as spider), automatically access the Internet page and access to web information.
You can create a plain text file, robots.txt, in your website that declares that the site does not want to be accessed by the robot so that part or all of the site’s content can not be included in the search engine, or Specifies that the search engine only includes the specified content.

Where is the robots.txt file?

The robots.txt file should be placed in the root directory of the site. For example, when a robots visit a website (such as http://www.abc.com ), it will first check whether the site exists http://www.abc.com/robots.txt this file, if the robot to find this file, it will be based on the contents of this file to determine the scope of its access.

The format of the robots.txt file

The “robots.txt” file contains one or more records, separated by blank lines (CR, CR / NL, or NL as the end), and the format of each record is as follows:

"<field>:<optionalspace><value><optionalspace>"

 

In the file can be used # for annotations, the specific use of the same practice and UNIX. The records in this file usually begin with one or more lines of User-agent, followed by a number of Disallow lines, as follows:

  • User-agent:
    The value of this item is used to describe the name of the search engine robot. In the “robots.txt” file, if there are multiple User-agent records that have multiple robots that are limited by the protocol, Say, at least one User-agent record. If the value is set to *, the protocol is valid for any robot. In the “robots.txt” file, there is only one record of “User-agent: *”.
  • Disallow:
    the value of the item used to describe the URL you do not want to visit, the URL can be a complete path, it can be part of any Disallow at the beginning of the URL will not be access to the robot. For example, “Disallow: /help” does not allow search engine access to /help.html and /help/index.html, and “Disallow: /help/” allows the robot to access /help.html without access to /help/index .html. Any Disallow record is empty, indicating that all parts of the site are allowed to be accessed, in the “/robots.txt” file, at least one Disallow record. If “/robots.txt” is an empty file, then for all the search engine robot, the site is open.

Share this article:

Facebook Post LinkedIn Telegram

Related posts:

  1. Reddit Sues Anthropic: Battling Unauthorized AI Data Scraping!
  2. Cloudflare Unveils AI Crawler Leaderboard: ByteDance Ranks Last
  3. Cloudflare Unveils New Policy to Let Creators Control How AI Bots Use Their Content
  4. Reddit Sues Perplexity and Data Brokers for ‘Industrial-Scale’ Scraping
  5. The Robots.txt Myth: Does a Missing File Really Delete Your Site from Google?
Tags: robots.txt

Search

Translation

CVE WATCHTOWER
🚨

Receive alerts for vulnerabilities being exploited in the wild.

⚑

Get notified instantly when a Proof of Concept (PoC) exploit is published.

πŸ”

Access critical info on vulnerabilities even when marked as "RESERVED".

🧠

Insights powered by decades of expertise and global intelligence sources.

🎯

Customize alerts with up to 10 keywords for your specific tech stack.

πŸ“Š

Export the raw CVE database for SIEM integration and reporting.

Upgrade Package

πŸ”΄ Live Critical Threats

  • CVE-2026-9458CVSS 9.8
    A vulnerability was identified in Totolink A8000RU 7.1cu.643_b20200521. The impacted element is...
  • CVE-2026-9457CVSS 9.8
    A vulnerability was determined in Totolink A8000RU 7.1cu.643_b20200521. The affected element is...
  • CVE-2026-9456CVSS 9.8
    A vulnerability was found in Totolink A8000RU 7.1cu.643_b20200521. Impacted is the function...
  • CVE-2026-9455CVSS 9.8
    A vulnerability has been found in Totolink A8000RU 7.1cu.643_b20200521. This issue affects...
  • CVE-2026-9454CVSS 9.8
    A flaw has been found in Totolink A8000RU 7.1cu.643_b20200521. This vulnerability affects...
  • CVE-2026-9436CVSS 9.8
    A flaw has been found in Totolink A8000RU 7.1cu.643_b20200521. The impacted element...
  • CVE-2026-9435CVSS 9.8
    A vulnerability was detected in Totolink A8000RU 7.1cu.643_b20200521. The affected element is...
  • CVE-2026-9434CVSS 9.8
    A security vulnerability has been detected in Totolink A8000RU 7.1cu.643_b20200521. Impacted is...
  • CVE-2026-9433CVSS 9.8
    A weakness has been identified in Totolink A8000RU 7.1cu.643_b20200521. This issue affects...
  • CVE-2026-2651CVSS 9.0
    A vulnerability in MLflow versions
Powered by CVE WATCHTOWER

Recent Zero-Day Vulnerabilities

  • Exploited in the Wild: Critical OWA Spoofing Flaw (CVE-2026-42897) Hits On-Premises Exchange Servers
  • Exploited in the Wild: Maximum CVSS 10 SD-WAN Flaw (CVE-2026-20182) Grants Admin Control
  • Exploited in the Wild: Critical 9.8 CVSS RCE Hits Canon GUARDIANWALL MailSuite
  • Exploit Code Released: Public PoC Dumps for Windows BitLocker Bypass and SYSTEM Elevation Zero-Days
  • Exploited in the Wild: “Dirty Frag” Linux Vulnerability Grants Instant Root Access
  • Under Active Attack: Ivanti EPMM Zero-Day Exploited in the Wild via Harvested Admin Credentials
Our Websites
  • Penetration Testing Tools
  • The Daily Information Technology
  • Daily CyberSecurity

    • About SecurityOnline.info
    • Advertise with us
    • Announcement
    • Contact
    • Contributor Register
    • Login
    • About SecurityOnline.info
    • Advertise on SecurityOnline.info
    • Contact Us

    When you purchase through links on our site, we may earn an affiliate commission. Here’s how it works

    • Disclaimer
    • Privacy Policy
    • DMCA NOTICE
    • Linkedin
    • Twitter
    • Facebook
    • Youtube
    Copyright Daily CyberSecurity Β© All rights reserved.