
How to identify ads on the website

Tags: html, ads, analysis

I'd like to programmatically analyze the content of a website and find possible spots where ads might be placed (or the ads themselves). Different websites may carry ads from different vendors in many different formats, and I'd like my solution to pick up as many of them as possible.

How would you solve this problem programmatically? So far I have found only one solution, but I'm not very happy with it (the reason is below).

The obvious solution would be to run a series of regex searches on the page source, looking for ad-engine-specific JS and/or HTML. I believe this is similar to what AdBlock does to strip ads from websites in the browser. However, since there are so many ad engines, this would be neither effective nor easy to maintain (even if we consider using AdBlock blacklists to feed the search).
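For illustration, the regex approach described above might look roughly like the sketch below. The signature list is hypothetical and deliberately tiny (two well-known ad-serving domains plus one guessed class-name pattern); the function name `find_ad_signatures` is also just an assumption for this example. A real list would be much longer, which is exactly the maintenance concern.

```python
import re
from typing import Dict

# Hypothetical, deliberately tiny signature list. A real one (for example,
# derived from AdBlock filter lists) would contain thousands of entries.
AD_PATTERNS = {
    "doubleclick": re.compile(r"doubleclick\.net", re.IGNORECASE),
    "googlesyndication": re.compile(r"googlesyndication\.com", re.IGNORECASE),
    # Guessed class-name convention: a class token "ad", "ads", "advert",
    # "ad-slot" or "ad-banner". This will produce false positives.
    "generic-ad-class": re.compile(
        r"""class=["'][^"']*\bad(?:s|vert|-slot|-banner)?\b[^"']*["']""",
        re.IGNORECASE,
    ),
}


def find_ad_signatures(html: str) -> Dict[str, int]:
    """Return the number of matches for each signature found in the source."""
    hits = {}
    for name, pattern in AD_PATTERNS.items():
        count = len(pattern.findall(html))
        if count:
            hits[name] = count
    return hits


if __name__ == "__main__":
    sample = (
        '<script src="https://pagead2.googlesyndication.com/pagead/js/'
        'adsbygoogle.js"></script><div class="header">Hello</div>'
    )
    print(find_ad_signatures(sample))
```

Every new vendor or markup convention needs another entry in `AD_PATTERNS`, so coverage depends entirely on how well the list is kept up to date.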

I'd like to find a more generic solution to this problem and I'm not necessarily looking for a final solution. Different views on the problem will be helpful.

RaYell asked Dec 06 '25 05:12


1 Answer

I don't think maintaining a list of ad vendors is that difficult, especially given that there are only a few major players who serve up 90%+ of all ads.

If you're not looking for a catch-all solution, detecting 90%+ would be an acceptable hit rate, I'd say.

Doing it 'heuristically', you could simply flag any Flash or similar media object served from a domain different from the one hosting the page.
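A minimal sketch of that heuristic, using only the Python standard library, could look like this. Which tags count as "media objects" (`object`, `embed`, `iframe`) is an assumption made for the example, as are the names `ThirdPartyMediaFinder` and `find_third_party_media`. Hostnames are compared exactly, so a site's own CDN or subdomain would also be flagged.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumption: these tags (and the attribute holding their URL) are what we
# treat as "Flash or similar media objects".
MEDIA_TAGS = {"object": "data", "embed": "src", "iframe": "src"}


class ThirdPartyMediaFinder(HTMLParser):
    """Collects media tags whose URL points at a host other than the page's."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.page_host = urlparse(page_url).hostname
        self.flagged = []  # list of (tag, host) tuples

    def handle_starttag(self, tag, attrs):
        attr_name = MEDIA_TAGS.get(tag)
        if attr_name is None:
            return
        url = dict(attrs).get(attr_name)
        if not url:
            return
        # Resolve relative and protocol-relative URLs against the page URL.
        host = urlparse(urljoin(self.page_url, url)).hostname
        if host and host != self.page_host:
            self.flagged.append((tag, host))


def find_third_party_media(html, page_url):
    """Return (tag, host) pairs for media served from a different host."""
    finder = ThirdPartyMediaFinder(page_url)
    finder.feed(html)
    return finder.flagged


if __name__ == "__main__":
    page = (
        '<embed src="http://ads.example.net/banner.swf">'
        '<iframe src="/local/frame.html"></iframe>'
        '<object data="//cdn.other.org/movie.swf"></object>'
    )
    print(find_third_party_media(page, "http://www.example.com/index.html"))
```

This only inspects static markup. Objects injected by scripts at load time would need the page to be rendered first, which is outside the scope of this sketch.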

Widor answered Dec 09 '25 19:12


