I have:
I want domain.com to be crawled and indexed by search engines, but not testing.domain.com.
The testing domain and main domain share the same SVN repository, so I'm not sure if separate robots.txt files would work...
Regardless of whether it is on a subdomain or not, to prevent a page from being indexed you could: --- in the head section of the page, use a robots meta tag set to "noindex". OR: --- send an X-Robots-Tag HTTP response header set to "noindex".
Google does crawl subdomains, including ones that serve the same pages as your main site. That can put duplicate pages in the index, which may hurt how Google treats your main site. So you may need to stop spiders from crawling or indexing the subdomain.
Another way to remove your subdomain from Google search is with the X-Robots-Tag HTTP response header: X-Robots-Tag: noindex. This is usually very effective for removing every page in a domain or subdomain. Some webmasters and programmers like this method because it does not change the way you create a page.
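As a rough sketch of how that header could be sent for the test host only, assuming Apache with mod_setenvif and mod_headers enabled and testing.example.com standing in for your subdomain (the NOINDEX_HOST variable name is just a placeholder):

```apache
# Flag requests whose Host header is the testing subdomain (assumed hostname),
# with or without an explicit port.
SetEnvIfNoCase Host "^testing\.example\.com(:[0-9]+)?$" NOINDEX_HOST
# Send the header only when the flag is set; other hosts are left untouched.
Header set X-Robots-Tag "noindex, nofollow" env=NOINDEX_HOST
```

Keep in mind that a crawler has to be able to fetch a page to see this header, so it will probably not have any effect on URLs that robots.txt already blocks.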
1) Create a separate robots.txt file (name it robots_testing.txt, for example).
2) Add this rule to the .htaccess file in the website root folder (mod_rewrite must be enabled, with RewriteEngine On earlier in the file):
RewriteCond %{HTTP_HOST} =testing.example.com
RewriteRule ^robots\.txt$ /robots_testing.txt [L]
It will rewrite (internal redirect) any request for robots.txt to robots_testing.txt IF the domain name is testing.example.com.
Alternatively, do the opposite: rewrite all requests for robots.txt to robots_disabled.txt for all domains except example.com:
RewriteCond %{HTTP_HOST} !=example.com
RewriteRule ^robots\.txt$ /robots_disabled.txt [L]
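One caveat with the second rule: !=example.com is an exact string comparison, so www.example.com would get robots_disabled.txt too. A variant that treats both as the live site, assuming www is the only other production hostname, might look like this:

```apache
RewriteEngine On
# Any host other than example.com or www.example.com (case-insensitive)...
RewriteCond %{HTTP_HOST} !^(www\.)?example\.com$ [NC]
# ...is served the blocking file in place of the real robots.txt.
RewriteRule ^robots\.txt$ /robots_disabled.txt [L]
```

Here robots_disabled.txt is expected to hold the blocking rules (User-agent: * followed by Disallow: /), and the real robots.txt stays as it is in the shared repository.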
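testing.domain.com should have its own robots.txt file, as follows:
User-agent: *
Disallow: /
User-agent: Googlebot
Disallow: /
Noindex: /
located at http://testing.domain.com/robots.txt
This will disallow all bot user-agents. Googlebot follows only the group that names it, so the Disallow rule is repeated there. Google also used to look at Noindex in robots.txt, so we'll just throw it in for good measure, though it was never an official directive and Google stopped honoring it in September 2019.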
You could also add your subdomain to Webmaster Tools, block it with robots.txt, and submit a site removal (though this will be for Google only). For more info, have a look at http://googlewebmastercentral.blogspot.com/2010/03/url-removal-explained-part-i-urls.html