On a custom PHP framework, I have implemented a mailing class that lets me know when a 404 occurs. It mails me the URL, referrer and UA string.
I am getting two types of unexplained 404 reports for URLs that are not linked anywhere on the site. This happens quite often. I have tested with the exact browser versions the reports originate from, and I cannot find anything wrong in either the HTML or the JavaScript. These pages generally contain only a little JavaScript, by the way.
Type1 examples:
source: http://www.example.com/articles/example-article
target (404): http://www.example.com/articles/undefined
User agents who have reported this:
- Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36
(Chrome 30.0.1599.101 on win7)
- Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0; BOIE9;ENUSMSE)
(IE9 on win vista)
- Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident/6.0;WUID=78780BB80C56415F887179239977F107;WTB=6581)
(IE10 on win 8)
Type2 examples:
source: http://www.example.com/articles/example-article
target (404): http://www.example.com/articles
User agents who have reported this:
- Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET CLR 1.1.4322; .NET4.0C; .NET4.0E)
(IE8 on win7)
- Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; SV1; .NET CLR 1.1.4325; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30707; MS-RTC LM 8)
(IE7 on windows server 2003)
- Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.5; .NET CLR 1.1.4322 ; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
(IE8 on winXP)
- Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; GTB7.5; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; SLCC1; .NET CLR 2.0.50727; .NET C LR 1.1.4322; .NET CLR 3.5.30729; .NET CLR 3.0.30618; .NET4.0C; HYVES)
(IE8 on win vista)
Could anyone help me explain these? Is there perhaps a buggy Windows browser plugin that could be the cause? I have not seen any of these reports from operating systems other than Windows, although the sites do get quite a lot of visits from other OSes as well.
Cheers!
EDIT #1 I used useragentstring.com to explain the UA strings
EDIT #2 The answers of Palec, Fabio Beltramini and Artur have helped me understand the issue further, and I feel they all contributed equally. Since I can only accept/reward one answer, I have chosen to accept Palec's answer because he answered first. Thank you all very much for thinking along. If I come across anything noteworthy during debugging, I will add it here.
Possible explanations are:
- A broken browser plugin
- A bot that makes up the requested path and referer
- Broken JavaScript
- IE's empty href bug (<a href="">)
An "undefined" in the URL is a typical sign of broken JavaScript. An unwanted reference to the (logical) folder containing the current document is often caused by IE's infamous bug: it interprets an empty path not as the current document but as . (the containing folder), so an empty link target works differently in IE than in other browsers. Bot-related errors are a story in themselves; I can only add that it is not uncommon for bots to make up both the requested path and the referer.
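Both URL patterns can be reproduced with the standard URL resolver, which may help when hunting for the cause. This is only a minimal sketch: the page URL is the example from the question, while `resolveHref`, `config` and `nextUrl` are hypothetical names made up for illustration, not anything from the actual site. It shows what a standards-compliant resolver does; it does not emulate IE itself.

```javascript
// Example page URL from the question.
const page = 'http://www.example.com/articles/example-article';

// Hypothetical helper: resolve an href against the page URL the way a
// standards-compliant browser would.
function resolveHref(href, base) {
  return new URL(String(href), base).href;
}

// Type 1: a script reads a property that does not exist. The value is
// undefined, which is coerced to the string "undefined" when used as a URL.
const config = {}; // hypothetical object without a nextUrl property
const fromBrokenScript = resolveHref(config.nextUrl, page);
console.log(fromBrokenScript); // http://www.example.com/articles/undefined

// An empty href should resolve to the current document.
const emptyHref = resolveHref('', page);
console.log(emptyHref); // http://www.example.com/articles/example-article

// Type 2: treating the empty href as "." yields the containing folder.
const dotHref = resolveHref('.', page);
console.log(dotHref); // http://www.example.com/articles/
```

The last result matches the Type 2 target apart from the trailing slash. Searching the page scripts for values that can be undefined when assigned to `href`, `src` or `location`, and for links or forms with an empty target, would be a reasonable next step.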
Two Stack Overflow questions supporting my conjecture about broken plugin:
Details on IE’s empty href bug (official resource linked there):