Robots.txt: Control How Search Engines Crawl Your Site

Learn how to use robots.txt to manage search engine crawling behavior.

What is Robots.txt?

Robots.txt is a text file at your website's root that tells search engine crawlers which pages they can or cannot access.

How It Works

Crawlers check /robots.txt before crawling a site:

  1. The bot requests yoursite.com/robots.txt.
  2. It reads the allow/disallow rules in the file.
  3. It follows (or ignores) those rules, depending on how the bot is configured.
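The steps above can be sketched with Python's standard-library robots.txt parser, `urllib.robotparser`. The rules, bot name, and URLs below are illustrative; in practice `set_url()` plus `read()` would fetch the live file.

```python
from urllib import robotparser

# Parse example rules inline so the sketch runs offline;
# a real crawler would call rp.set_url(...) and rp.read() instead.
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /admin/",
])

rp.can_fetch("MyBot", "https://example.com/admin/settings")  # False: /admin/ is disallowed
rp.can_fetch("MyBot", "https://example.com/blog/post")       # True: everything else is allowed
```

`can_fetch()` answers the same question a well-behaved crawler asks before requesting a URL.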

Basic Syntax

Allow All

```
User-agent: *
Allow: /
```

Block All

```
User-agent: *
Disallow: /
```

Block Specific Directories

```
User-agent: *
Disallow: /admin/
Disallow: /private/
Disallow: /temp/
```

Specific Bot Rules

```
User-agent: Googlebot
Allow: /

User-agent: GPTBot
Disallow: /
```
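You can sanity-check per-bot rules like these with `urllib.robotparser` from the Python standard library; the bot name `SomeOtherBot` below is made up for illustration.

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: Googlebot",
    "Allow: /",
    "",
    "User-agent: GPTBot",
    "Disallow: /",
])

rp.can_fetch("Googlebot", "https://example.com/page")     # True
rp.can_fetch("GPTBot", "https://example.com/page")        # False
# A bot with no matching group, and no "*" group to fall back to,
# defaults to allowed:
rp.can_fetch("SomeOtherBot", "https://example.com/page")  # True
```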

Common Use Cases

Block Admin Areas

```
User-agent: *
Disallow: /wp-admin/
Disallow: /admin/
```

Block Search Results

```
User-agent: *
Disallow: /search
Disallow: /*?s=
```

Block AI Crawlers

```
User-agent: GPTBot
User-agent: CCBot
User-agent: anthropic-ai
Disallow: /
```
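Consecutive User-agent lines like these form a single group that shares the Disallow rule, which you can confirm with Python's `urllib.robotparser`:

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: GPTBot",
    "User-agent: CCBot",
    "User-agent: anthropic-ai",
    "Disallow: /",
])

# All three agents belong to the same group, so each is blocked everywhere:
all(not rp.can_fetch(bot, "https://example.com/")
    for bot in ("GPTBot", "CCBot", "anthropic-ai"))  # True
```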

Important Notes

Robots.txt Is Public

Anyone can view your robots.txt file, so never use it to hide sensitive URLs; listing a path there actually advertises its existence.

Not a Security Measure

Robots.txt is a suggestion, not enforcement. Well-behaved crawlers honor it, but nothing compels them to. Use authentication for real protection.

Include Sitemap Reference

```
Sitemap: https://example.com/sitemap.xml
```
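Since Python 3.8, `urllib.robotparser` also exposes Sitemap lines via `site_maps()`; a minimal sketch:

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse([
    "Sitemap: https://example.com/sitemap.xml",
    "User-agent: *",
    "Allow: /",
])

rp.site_maps()  # ['https://example.com/sitemap.xml']
```

If the file declares no sitemaps, `site_maps()` returns `None` rather than an empty list.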

Check your robots.txt with our free validation tool.