What robots.txt is for
A robots.txt file gives crawl instructions to well-behaved bots. It can allow or disallow paths and point to a sitemap. It is useful for reducing crawler noise, but compliance is voluntary: the file is not access control and should never be used to hide secrets.
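To make the disallow and sitemap roles concrete, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the example.com domain, and the paths are assumptions made up for illustration, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative example.com site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /cart/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler checks each URL against the rules before fetching it.
product_ok = parser.can_fetch("*", "https://example.com/products/widget")
search_ok = parser.can_fetch("*", "https://example.com/search?q=widget")

# site_maps() needs Python 3.8+; it returns None when no Sitemap line exists.
sitemaps = parser.site_maps()

print(product_ok, search_ok, sitemaps)
```

Nothing here stops a crawler that ignores the file, which is why the rules count as hints rather than protection.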
What to review
- The file returns a 200 response at /robots.txt.
- Important public pages are not accidentally disallowed.
- The sitemap line uses a full URL on the correct (production) domain.
- Development, admin, and query-heavy paths are each blocked or left crawlable on purpose, not by default.
- Rules are simple enough to maintain.
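Part of this review can be automated. The sketch below is one possible approach, not a complete audit: review_robots is a hypothetical helper that takes the status code and body you already fetched from /robots.txt, plus an assumed production host and a list of pages you consider important.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def review_robots(status, body, expected_host, important_urls):
    """Return a list of problems found; an empty list means no findings."""
    problems = []
    if status != 200:
        problems.append(f"/robots.txt returned {status}, expected 200")
        return problems

    parser = RobotFileParser()
    parser.parse(body.splitlines())

    # Important public pages should be fetchable by a generic crawler.
    for url in important_urls:
        if not parser.can_fetch("*", url):
            problems.append(f"important page is disallowed: {url}")

    # Every Sitemap line should point at the expected host.
    sitemaps = parser.site_maps() or []  # site_maps() needs Python 3.8+
    if not sitemaps:
        problems.append("no Sitemap line found")
    for sitemap in sitemaps:
        if urlparse(sitemap).hostname != expected_host:
            problems.append(f"sitemap points to another host: {sitemap}")
    return problems
```

Python's parser applies the first matching rule in a group, which may differ from how a given search engine resolves overlapping Allow and Disallow rules, so treat the result as a first check. Whether the rules are deliberate and maintainable still needs a human reading.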
Common mistakes
A common launch mistake is leaving a staging rule in place that blocks the whole site. Another is listing private URLs in robots.txt: anyone can read the file, so it can make sensitive-looking paths easier to discover.
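A leftover blanket block is easy to catch before launch. The following is a rough sketch under simple assumptions: blocks_whole_site is a hypothetical helper that scans the file text line by line and flags a bare "Disallow: /" in any group, even one aimed at a single bot, so a match is a prompt to look, not proof of a problem.

```python
def blocks_whole_site(body):
    """Return True if any line is a bare 'Disallow: /' rule."""
    for raw in body.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip() == "/":
            return True
    return False
```

A check like this fits naturally into a deploy script or test suite that runs against the production build of the file.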
FAQ
Does disallow remove a page from search results?
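Not reliably. Disallow stops compliant crawlers from fetching a page, but it does not remove the URL from an index. If other pages link to a blocked URL, it can still appear in results without any of its content. To keep a page out of results, use noindex and leave the page crawlable so the directive can be read, or put it behind access control where appropriate.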
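One way to send noindex is the X-Robots-Tag response header. The sketch below assumes a Python WSGI application. add_noindex and the /internal/ prefix are hypothetical names chosen for illustration, and your server or framework may offer a simpler setting for the same thing.

```python
def add_noindex(app, prefixes):
    """Wrap a WSGI app so responses under the given path prefixes carry noindex."""
    prefixes = tuple(prefixes)

    def wrapped(environ, start_response):
        def start(status, headers, exc_info=None):
            if environ.get("PATH_INFO", "").startswith(prefixes):
                # Append the header without mutating the app's own list.
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)

        return app(environ, start)

    return wrapped
```

For the header to have any effect, the matching paths must not be disallowed in robots.txt, because a crawler that never fetches the page never sees the header.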
Should robots.txt include the sitemap?
Yes. It is a simple way to help crawlers find the preferred sitemap location.