I want to write a Perl application that crawls some websites and collects images and links from those pages. Because most of the pages use JavaScript to generate or modify the HTML content, I need something like a client browser with JavaScript support so that I can parse the final HTML after the JavaScript has run. What are my options?
If possible, please post some implementation code or a link to some example(s).
Several options spring to mind:
You could have Perl use Selenium and let a full-blown browser do the JavaScript work for you (a minimal sketch follows after this list).
You could download and compile V8 or another open-source JavaScript engine and have Perl call an external program to evaluate the JavaScript (second sketch below).
I don't think Perl's LWP module supports JavaScript, but you might want to check that if you haven't done so already; the third sketch below shows what a plain LWP fetch gives you.
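A minimal sketch of the Selenium route, assuming the Selenium::Remote::Driver module from CPAN and a Selenium server (or a compatible WebDriver endpoint) already listening on localhost:4444; the browser name and URL are placeholders:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Selenium::Remote::Driver;

    # Connect to a Selenium server assumed to be running on localhost:4444
    my $driver = Selenium::Remote::Driver->new(
        remote_server_addr => 'localhost',
        port               => 4444,
        browser_name       => 'chrome',
    );

    # The browser loads the page and executes its JavaScript for us
    $driver->get('http://example.com/');

    # DOM as it looks after the JavaScript has run
    my $html = $driver->get_page_source();

    # Collect image sources and link targets from the live DOM
    my @imgs  = $driver->find_elements('//img', 'xpath');
    my @links = $driver->find_elements('//a',   'xpath');

    print "img:  ", $_->get_attribute('src'),  "\n" for @imgs;
    print "link: ", $_->get_attribute('href'), "\n" for @links;

    $driver->quit();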
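A rough sketch of the external-engine route, assuming a command-line JavaScript interpreter such as node (which wraps V8) is on the PATH; the snippet of JavaScript is only a placeholder for whatever you extract from a page. Note that a bare engine gives you JavaScript evaluation but no DOM, so emulating the page environment is extra work:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # JavaScript to evaluate; in practice this would come from the crawled page
    my $js = 'var links = ["a.html", "b.html"]; console.log(links.join("\n"));';

    # Run the external engine and capture its standard output
    open my $engine, '-|', 'node', '-e', $js
        or die "Cannot run node: $!";
    my @output = <$engine>;
    close $engine;

    print "Engine returned: $_" for @output;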
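For comparison, a plain LWP::UserAgent fetch, which returns only the HTML the server sends and does not execute any JavaScript; here links and images are pulled out of that static HTML with HTML::LinkExtor (also from CPAN):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;
    use HTML::LinkExtor;

    my $ua  = LWP::UserAgent->new( agent => 'MyCrawler/0.1' );
    my $res = $ua->get('http://example.com/');
    die 'Fetch failed: ' . $res->status_line unless $res->is_success;

    # Only the HTML the server sent; JavaScript-generated content is missing
    my $html = $res->decoded_content;

    # Extract <a href> and <img src> from the static markup
    my $extractor = HTML::LinkExtor->new;
    $extractor->parse($html);

    for my $link ( $extractor->links ) {
        my ( $tag, %attrs ) = @$link;
        print "link: $attrs{href}\n" if $tag eq 'a'   && $attrs{href};
        print "img:  $attrs{src}\n"  if $tag eq 'img' && $attrs{src};
    }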