Robots.txt Parser

Parse and analyze robots.txt files to understand crawler directives. Validate syntax, check rules for specific user agents, and test whether URLs are blocked or allowed by your robots.txt configuration.


About robots.txt

The robots.txt file tells search engine crawlers which URLs they can access on your site. This tool parses and validates robots.txt files and lets you test specific URLs.

Common Directives

  • User-agent: - Specifies which crawler the rules apply to
  • Disallow: - Blocks access to specified paths
  • Allow: - Explicitly permits access to a path, even inside a disallowed directory (most crawlers apply the most specific matching rule)
  • Sitemap: - Points to your XML sitemap
  • Crawl-delay: - Sets a delay between requests in seconds (not part of the original standard; Google ignores it, while Bing and some other crawlers honor it)
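Put together, the directives above might appear in a file like this (the paths and sitemap URL are purely illustrative):

```text
User-agent: *
Disallow: /admin/
Allow: /admin/help/
Crawl-delay: 10

Sitemap: https://www.example.com/sitemap.xml
```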

FAQ

How do I read a robots.txt file?

A robots.txt file contains rules grouped by user-agent (crawler name). Each group lists Allow and Disallow directives that define which paths that crawler may or may not access. A crawler obeys the most specific group that matches its name and falls back to the wildcard group (User-agent: *), which applies to any crawler without a more specific match.

What does the Disallow directive do?

The Disallow directive tells crawlers not to access specified paths. For example, Disallow: /private/ blocks access to all URLs starting with /private/. An empty Disallow means nothing is blocked for that user agent.
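This prefix matching can be checked programmatically. Here is a minimal sketch using Python's standard-library urllib.robotparser with a made-up rule set (note that robotparser evaluates rules in file order rather than by longest match, so its Allow/Disallow precedence can differ from Google's):

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content: block everything under /private/.
rules = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Disallow: /private/ blocks every URL whose path starts with /private/ ...
print(parser.can_fetch("*", "/private/reports/2024.html"))  # False
# ... but leaves all other paths reachable.
print(parser.can_fetch("*", "/public/index.html"))          # True
```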

Can I test if a URL is blocked?

Yes. Paste a robots.txt file into this tool and test specific URLs to see whether they would be blocked or allowed for different user agents. This helps verify that your rules work as intended.
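The same kind of check can also be scripted offline. The sketch below uses urllib.robotparser with invented rules and URLs; the parser picks the first group whose User-agent matches the crawler name and falls back to the wildcard group otherwise:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: one group for Googlebot, one fallback group.
rules = """\
User-agent: Googlebot
Disallow: /tmp/

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

for agent, path in [
    ("Googlebot", "/tmp/cache"),     # matched by the Googlebot group
    ("Googlebot", "/private/data"),  # Googlebot group has no rule for this path
    ("OtherBot", "/private/data"),   # no Googlebot match, falls back to *
]:
    verdict = "allowed" if parser.can_fetch(agent, path) else "blocked"
    print(f"{agent} {path}: {verdict}")
```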
