I am currently doing a research project and I am attempting to figure out a good way to identify ads given access to the html of a webpage.
I thought it might be a good idea to start with AdBlock. AdBlock is a program that prevents ads from being displayed to the user, so presumably it has a mechanism for identifying things as ads.
I downloaded the source code for AdBlockPlus, but I find myself completely lost in all of the files. I am not sure where to start looking for this detection mechanism, so I was wondering if anyone had any advice on where to start. Alternatively if you have dealt with AdBlock before and are familiar with it, I would appreciate any extra information.
For example, if the webpage needs to be rendered in a real browser to use Adblock, there are programs that will automate the loading of a webpage so this wouldn't be a problem but I am not sure how to figure out if this is what AdBlock does in the first place.
Note: AdBlock is written in Python and Perl :)
Thanks!
I would advise you to first have a look at writing adblock filter rules.
Then, once you get an idea of this, you can start parsing adblock lists available in various languages to suit your needs.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With