Robots.txt errors
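- Leaving a staging rule such as Disallow: / in place after launch, which blocks the whole site.
- Pointing the Sitemap directive at an old domain.
- Listing private paths in Disallow rules. robots.txt is publicly readable, so those paths should be protected with authentication instead.
- Treating robots rules as a data handling control. They are advisory to crawlers and do not restrict access.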
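The first two errors can be caught mechanically. The following is a minimal sketch, not a complete validator: it uses Python's standard urllib.robotparser, and the function name check_robots and the example.com hosts are hypothetical placeholders for illustration.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def check_robots(robots_text: str, site_origin: str) -> list[str]:
    """Flag a blocked homepage and Sitemap lines that point at another host.

    site_origin is an assumed value such as "https://www.example.com".
    """
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())

    problems = []

    # If the generic crawler ("*") cannot fetch "/", the site is
    # effectively blocked for every crawler without its own rule group.
    if not parser.can_fetch("*", site_origin + "/"):
        problems.append("homepage is blocked for all crawlers")

    # site_maps() returns None when no Sitemap line is present (Python 3.8+).
    expected_host = urlsplit(site_origin).netloc
    for sitemap_url in parser.site_maps() or []:
        if urlsplit(sitemap_url).netloc != expected_host:
            problems.append(f"sitemap points to another host: {sitemap_url}")

    return problems


if __name__ == "__main__":
    leftover = (
        "User-agent: *\n"
        "Disallow: /\n"
        "Sitemap: https://old.example.com/sitemap.xml\n"
    )
    for problem in check_robots(leftover, "https://www.example.com"):
        print(problem)
```

This check cannot tell whether a listed path is private or sensitive; that still needs a manual read of the file.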
Sitemap errors
- Including redirected or old-domain URLs.
- Including noindex or hidden pages.
- Including API, admin, or temporary file URLs.
- Using inaccurate lastmod timestamps.
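Some of these errors are visible in the sitemap text alone. The sketch below assumes a plain urlset sitemap (not a sitemap index) and flags entries on the wrong host or under non-public paths. The prefix list, the function name audit_sitemap, and the example.com hosts are illustrative assumptions. Redirects, noindex pages, and lastmod accuracy cannot be judged from the file itself and need live requests or deployment data.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical prefixes; adjust to the paths your site actually uses.
NON_PUBLIC_PREFIXES = ("/api/", "/admin/", "/tmp/")


def audit_sitemap(xml_text, expected_host):
    """Return (url, reason) pairs for entries that look out of place."""
    root = ET.fromstring(xml_text)
    findings = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        parts = urlsplit(url)
        if parts.netloc != expected_host:
            findings.append((url, "wrong host"))
        elif parts.path.startswith(NON_PUBLIC_PREFIXES):
            findings.append((url, "non-public path"))
    return findings


if __name__ == "__main__":
    sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url><loc>https://www.example.com/</loc></url>
      <url><loc>https://old.example.com/about</loc></url>
      <url><loc>https://www.example.com/admin/login</loc></url>
    </urlset>"""
    for url, reason in audit_sitemap(sample, "www.example.com"):
        print(reason, url)
```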
Good review habit
Fetch both files after deployment, parse the sitemap, and request a sample of the listed URLs to confirm each one resolves without an error or redirect. The files should reflect the current public site, not a previous version.
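The sample-testing step could be sketched roughly as follows. This is one possible shape under stated assumptions, not a prescribed tool: the names fetch_status, sample_sitemap_urls, and review_sample, the User-Agent string, and the sample size are all hypothetical. The sitemap text is passed in, so downloading the two files is left to the caller.

```python
import random
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def fetch_status(url, timeout=10.0):
    """Return (status_code, final_url). urlopen follows redirects itself."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "deploy-check/0.1"}  # hypothetical agent name
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.geturl()
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses raise; report the code against the requested URL.
        return error.code, url


def sample_sitemap_urls(xml_text, sample_size=20, seed=0):
    """Pick a repeatable random sample of <loc> values from a urlset sitemap."""
    root = ET.fromstring(xml_text)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    rng = random.Random(seed)
    return rng.sample(urls, min(sample_size, len(urls)))


def review_sample(xml_text, fetch=fetch_status, sample_size=20):
    """Return (url, status, final_url) for sampled URLs that fail or redirect."""
    failures = []
    for url in sample_sitemap_urls(xml_text, sample_size):
        status, final_url = fetch(url)
        if status != 200 or final_url != url:
            failures.append((url, status, final_url))
    return failures
```

The fetch parameter exists so the check can be exercised with a stub in tests, without touching the network. An empty result from review_sample means every sampled URL returned 200 at the address listed in the sitemap.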
FAQ
Should sitemap.xml include query URLs?
Usually no. Use clean canonical URLs.
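A quick way to spot query URLs in a list taken from the sitemap is shown below. It is a small sketch using the standard urlsplit function; find_query_urls and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def find_query_urls(urls):
    """Return the URLs that carry a query string."""
    return [url for url in urls if urlsplit(url).query]


if __name__ == "__main__":
    listed = [
        "https://www.example.com/shoes",
        "https://www.example.com/shoes?sort=price",
    ]
    print(find_query_urls(listed))
```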
Can robots.txt fix broken pages?
No. robots.txt only controls crawling; it does not repair a page. Fix the route, or remove the URL from public lists and the sitemap.