Daily CyberSecurity

The robots.txt file explained

Do Son, May 24, 2017

What is a robots.txt file?

Search engines use programs called robots (also known as spiders) that automatically visit pages on the Internet and collect the information they contain.
You can create a plain text file named robots.txt on your website to declare which parts of the site robots should not access. In this way, part or all of the site's content can be excluded from search engines, or a search engine can be restricted to the content you specify.
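
As a rough illustration of the idea, the sketch below feeds a small robots.txt to Python's standard urllib.robotparser module and asks which URLs a robot may fetch. The rules and the robot names (BadBot, GoodBot) are hypothetical and chosen only for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; the robot names and paths are made up.
SAMPLE_ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# BadBot is excluded from the whole site.
print(parser.can_fetch("BadBot", "http://www.abc.com/index.html"))       # False
# Any other robot may fetch everything except /private/.
print(parser.can_fetch("GoodBot", "http://www.abc.com/index.html"))      # True
print(parser.can_fetch("GoodBot", "http://www.abc.com/private/a.html"))  # False
```

Note that robots.txt only states the site owner's wishes; it is up to each robot to read the file and follow it.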

Where is the robots.txt file?

The robots.txt file should be placed in the root directory of the site. For example, when a robot visits a website (such as http://www.abc.com), it first checks whether the file http://www.abc.com/robots.txt exists. If the robot finds this file, it uses the contents of the file to determine which parts of the site it may access.
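
A minimal sketch of how a robot might work out where to look: given any page URL, keep the scheme and host and replace the rest with /robots.txt. The helper name robots_txt_url is an assumption made for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt location for the site that hosts page_url.

    Hypothetical helper: keeps the scheme and host, and drops the
    original path, query string, and fragment.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.abc.com/help/index.html?x=1"))
# http://www.abc.com/robots.txt
```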

The format of the robots.txt file

The “robots.txt” file contains one or more records, separated by blank lines (each line is terminated by CR, CR/NL, or NL). Each record consists of lines in the following format:

"<field>:<optionalspace><value><optionalspace>"
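
The line format above can be sketched as a small parsing function. The name parse_line is hypothetical, and lowercasing the field name is a normalization chosen for this example so that "Disallow" and "disallow" compare equal.

```python
from typing import Optional, Tuple


def parse_line(line: str) -> Optional[Tuple[str, str]]:
    """Split one line into (field, value), or return None if it has no colon.

    The optional spaces around the value are stripped, and the field name
    is lowercased (an assumption of this sketch).
    """
    field, sep, value = line.partition(":")
    if not sep:
        return None
    return field.strip().lower(), value.strip()


print(parse_line("Disallow:   /help  "))  # ('disallow', '/help')
print(parse_line("User-agent:*"))         # ('user-agent', '*')
print(parse_line("Disallow:"))            # ('disallow', '')
```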

 

Comments can be added to the file with #, following the same convention as in UNIX: everything from the # to the end of the line is ignored. The records in this file usually begin with one or more User-agent lines, followed by a number of Disallow lines, as follows:

  • User-agent:
    The value of this field is the name of the search engine robot that the record applies to. If a record contains multiple User-agent lines, the same restrictions apply to each of the robots named. Every record must contain at least one User-agent line. If the value is set to *, the record applies to any robot that is not matched by another record. The “robots.txt” file may contain only one “User-agent: *” record.
  • Disallow:
    The value of this field describes the URL that you do not want robots to visit. It can be a complete path or only the beginning of one: any URL that starts with the Disallow value will not be accessed by the robot. For example, “Disallow: /help” blocks search engine access to both /help.html and /help/index.html, while “Disallow: /help/” allows the robot to access /help.html but not /help/index.html. An empty Disallow value indicates that all parts of the site may be accessed. Each record in the “/robots.txt” file must contain at least one Disallow line. If “/robots.txt” is an empty file, the site is open to all search engine robots.
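
The rules above can be sketched as a small checker. This is a simplified illustration, not a complete implementation: the function names, the sample rules, and the robot names (AnyBot, DocsBot) are hypothetical, and robot names are compared by exact case-insensitive match, which is a simplification chosen for this example.

```python
from typing import Dict, List

Record = Dict[str, List[str]]


def parse_records(text: str) -> List[Record]:
    """Group User-agent and Disallow lines into records split by blank lines."""
    records: List[Record] = []
    current: Record = {"user-agent": [], "disallow": []}
    for raw in text.splitlines():
        if not raw.strip():
            # A blank line closes the current record.
            if current["user-agent"]:
                records.append(current)
            current = {"user-agent": [], "disallow": []}
            continue
        line = raw.split("#", 1)[0].strip()  # drop the comment, if any
        field, sep, value = line.partition(":")
        field = field.strip().lower()
        if sep and field in current:
            current[field].append(value.strip())
    if current["user-agent"]:
        records.append(current)
    return records


def is_allowed(records: List[Record], robot: str, path: str) -> bool:
    """Return True if the robot may access the path under these records."""
    name = robot.lower()
    chosen = None
    for record in records:
        agents = [agent.lower() for agent in record["user-agent"]]
        if name in agents:
            chosen = record  # a record naming this robot takes priority
            break
        if "*" in agents and chosen is None:
            chosen = record  # fallback that applies to any robot
    if chosen is None:
        return True  # no applicable record: the site is open
    # An empty Disallow value blocks nothing; otherwise match by prefix.
    return not any(rule and path.startswith(rule) for rule in chosen["disallow"])


# Hypothetical rules for illustration only.
SAMPLE = """\
# Default rules for every robot
User-agent: *
Disallow: /help     # blocks /help.html and /help/index.html

User-agent: DocsBot
Disallow: /help/
"""

records = parse_records(SAMPLE)
print(is_allowed(records, "AnyBot", "/help.html"))         # False
print(is_allowed(records, "AnyBot", "/help/index.html"))   # False
print(is_allowed(records, "DocsBot", "/help.html"))        # True
print(is_allowed(records, "DocsBot", "/help/index.html"))  # False
print(is_allowed(parse_records(""), "AnyBot", "/x.html"))  # True
```

The last call shows the empty-file case: with no records at all, every path is allowed.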
