
How to crowd-source my web crawling

My web application requires downloading content from a URL the user specifies. Currently this request goes through my server, which is inefficient and could get my server's IP blocked.
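Roughly, the current setup looks like the sketch below (a minimal illustration only; the Express route name and `?url=` parameter are placeholders, not my real code):

```typescript
// Minimal sketch of the current server-side proxy, using Express and the
// global fetch available in Node 18+. Every crawl goes out from this
// server's IP, which is the bottleneck and blocking risk described above.
import express from "express";

const app = express();

app.get("/proxy", async (req, res) => {
  const target = req.query.url as string | undefined;
  if (!target) {
    res.status(400).send("Missing ?url= parameter");
    return;
  }
  try {
    const upstream = await fetch(target);
    const body = await upstream.text();
    res
      .type(upstream.headers.get("content-type") ?? "text/plain")
      .send(body);
  } catch {
    res.status(502).send(`Failed to fetch ${target}`);
  }
});

app.listen(3000);
```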

Is there a way to let the user's browser download the URL content directly? The same-origin policy seems to prevent using AJAX or an iframe to fetch and reuse this content.
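For concreteness, the client-side approach that the same-origin policy rules out looks roughly like this (the target URL is purely illustrative); unless the remote site explicitly sends Access-Control-Allow-Origin headers, the browser rejects the fetch:

```typescript
// Sketch of the browser-side crawl the same-origin policy blocks.
// Runs in the user's browser; the example URL is a placeholder.
async function crawlFromBrowser(targetUrl: string): Promise<string> {
  // Blocked with a CORS error unless the target server opts in via
  // Access-Control-Allow-Origin, which an arbitrary user-supplied
  // site will almost never do.
  const response = await fetch(targetUrl);
  return await response.text();
}

crawlFromBrowser("https://example.com/page.html")
  .then((html) => console.log(html.length))
  .catch((err) => console.error("Blocked by the same-origin policy:", err));
```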

Any ideas? For example, is there a way via Flash to download and reuse URL content?

asked Dec 05 '25 14:12 by hoju

1 Answer

You could use Tor to mask your requests, but if you're having to go to such lengths to crawl a website, perhaps you shouldn't be doing it?

Also, with your approach the iframe request will include your page URL as the referrer, which makes identifying these requests at the server end pretty straightforward...
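To illustrate that point, the crawled site could filter these requests with a check as simple as the following sketch (Express middleware; "your-crawler-app.example" is a hypothetical stand-in for your app's domain):

```typescript
// Sketch of how a crawled site could spot iframe-based requests:
// the browser sends the embedding page's URL in the Referer header.
import express from "express";

const app = express();

app.use((req, res, next) => {
  const referer = req.get("referer") ?? "";
  // "your-crawler-app.example" stands in for the crawling site's domain.
  if (referer.includes("your-crawler-app.example")) {
    res.status(403).send("Crawling via iframes is not welcome here");
    return;
  }
  next();
});

app.get("/", (_req, res) => res.send("Normal visitors see the page"));

app.listen(8080);
```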

answered Dec 08 '25 05:12 by Paul Dixon

