I have a small Magento site which consists of page URLs such as:
http://www.example.com/contact-us.html
http://www.example.com/customer/account/login/
However I also have pages which include filters (e.g. price and colour) and two such examples are:
http://www.example.com/products.html?price=1%2C1000
http://www.example.com/products/chairs.html?price=1%2C1000
The issue is that when Googlebot and the other search engine bots crawl the site, crawling essentially grinds to a halt because they get stuck in all the "filter links".
So, how can the robots.txt file be configured, e.g.:
User-agent: *
Allow:
Disallow:
To allow all pages like:
http://www.example.com/contact-us.html
http://www.example.com/customer/account/login/
to get indexed, but in the case of http://www.example.com/products.html?price=1%2C1000, index products.html and ignore all the content after the ?
The same should apply to http://www.example.com/products/chairs.html?price=1%2C1000
I also don't want to have to specify each page in turn, just a rule to ignore everything after the ? but not the main page itself.
I think this will do it:
User-agent: *
Disallow: /*?
That will disallow any URL that contains a question mark.
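Applied to the example URLs from the question, that single rule gives exactly the split you asked for:
/contact-us.html -> no question mark, still crawlable
/customer/account/login/ -> no question mark, still crawlable
/products.html?price=1%2C1000 -> contains ?, blocked
/products/chairs.html?price=1%2C1000 -> contains ?, blocked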
If you want to disallow just those that have ?price, you would write:
Disallow: /*?price
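As a complete file, that narrower variant would be just (a sketch, assuming you want every other URL crawled):
User-agent: *
Disallow: /*?price
One caveat: the * wildcard is not part of the original robots.txt standard. Googlebot and Bingbot support it, but obscure crawlers may not.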
See related questions such as:
Restrict robot access for (specific) query string (parameter) values?
How to disallow search pages from robots.txt
Additional explanation:
The syntax Disallow: /*? says, "disallow any URL that has a question mark in it." The / is the start of the path-and-query part of the URL. So if your URL is http://mysite.com/products/chairs.html?manufacturer=128&usage=165, the path-and-query part is /products/chairs.html?manufacturer=128&usage=165. The * matches any sequence of characters (including none), so Disallow: /*? matches /<anything>?<more stuff> -- anything whose path-and-query contains a question mark.
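If you want to sanity-check a pattern before deploying it, below is a minimal Python sketch of this wildcard matching. It hand-rolls the match because, as far as I know, Python's built-in urllib.robotparser only does plain prefix matching and ignores *; googlebot_match is a hypothetical helper name, and the sketch ignores the $ end-of-URL anchor that real crawlers also support.

import re
from urllib.parse import urlsplit

def googlebot_match(pattern, url):
    # Reduce the URL to its path-and-query part,
    # e.g. "/products/chairs.html?price=1%2C1000".
    parts = urlsplit(url)
    path_and_query = parts.path or "/"
    if parts.query:
        path_and_query += "?" + parts.query
    # Translate the robots.txt pattern into a regex anchored at the
    # start: "*" becomes ".*", everything else is matched literally.
    regex = "^" + "".join(".*" if ch == "*" else re.escape(ch) for ch in pattern)
    return re.match(regex, path_and_query) is not None

# The rule from the answer blocks any URL containing "?" ...
print(googlebot_match("/*?", "http://www.example.com/products/chairs.html?price=1%2C1000"))  # True
# ... but leaves plain pages crawlable.
print(googlebot_match("/*?", "http://www.example.com/contact-us.html"))  # False
# The narrower rule only catches the price filter.
print(googlebot_match("/*?price", "http://www.example.com/products.html?price=1%2C1000"))  # True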