 

Can domains be used in robots.txt?

Tags:

robots.txt

We have a development server at dev.example.com that is being indexed by Google. We use AWS Lightsail to duplicate the development server to our production environment in its entirety, so the same robots.txt file is served on both dev.example.com and example.com.

Google's robots.txt documentation doesn't explicitly state whether root domains can be defined. Can I add domain-specific rules to the robots.txt file? For example, is this acceptable:

User-agent: *
Disallow: https://dev.example.com/

User-agent: *
Allow: https://example.com/

Sitemap: https://example.com/sitemap.xml

To be clear, this could be resolved with .htaccess rewrite rules, but my question is specifically about robots.txt.

Asked Nov 04 '25 by franklylately

1 Answer

No, you can't specify a domain in robots.txt, so Disallow: https://dev.example.com/ is not valid. Page 6 of the robots.txt exclusion standard says that a Disallow line must contain a "path" rather than a full URL including the domain.

Each host name (domain or subdomain) has its own robots.txt file. So to prevent Googlebot from crawling https://dev.example.com/ you would need to serve https://dev.example.com/robots.txt with the content:

User-agent: *
Disallow: /

At the same time you would need to serve a different file from https://example.com/robots.txt, perhaps with the content:

User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml

If the same code base powers both your dev and production servers, you will need to generate the content of robots.txt conditionally, based on whether the site is running in production or not.
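As a minimal sketch of that conditional approach, the helper below chooses robots.txt content from an environment name. The function name and the APP_ENV variable are assumptions; adapt them to however your deployment distinguishes production from development.

```python
import os

def robots_txt(env: str) -> str:
    """Return robots.txt content: allow crawling only in production."""
    if env == "production":
        # Production: allow everything and advertise the sitemap.
        return (
            "User-agent: *\n"
            "Disallow:\n"
            "\n"
            "Sitemap: https://example.com/sitemap.xml\n"
        )
    # Any non-production environment (dev, staging): block all crawling.
    return "User-agent: *\nDisallow: /\n"

# Example: pick the environment from an APP_ENV variable (assumed name).
content = robots_txt(os.environ.get("APP_ENV", "development"))
```

Wire this up to whatever route serves /robots.txt in your framework; the point is that a single code base emits different directives per environment.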

Alternately, you could allow Googlebot to crawl both, but include <link rel="canonical" href="..."> tags in every page that point to the corresponding URL on the live site. See How to use rel='canonical' properly.
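Sketching that alternative, the helper below builds a canonical tag pointing at the live host for a given path. The function name and the default host are hypothetical; in practice you would emit this from your templating layer.

```python
def canonical_tag(path: str, live_host: str = "example.com") -> str:
    """Build a rel=canonical tag pointing at the live site's URL.

    live_host is an assumed default; pass your real production host.
    """
    return f'<link rel="canonical" href="https://{live_host}{path}">'

# Rendered on both dev and production pages, the tag always points
# at the production URL, telling Google which copy to index.
tag = canonical_tag("/about")
# → '<link rel="canonical" href="https://example.com/about">'
```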

Answered Nov 07 '25 by Stephen Ostermiller


