
Experience with web crawlers on heroku

Does anybody have experience coding web crawlers with gems such as anemone and deploying them to heroku for their own personal use? Would such a continuously running program violate any of heroku's TOA/TOS?

Jackson Henley asked Oct 31 '25 00:10



1 Answer

Not any more.

The Heroku Acceptable Use Policy states in Prohibited Actions, p.21, that a crawler must

  • identify itself via a unique User Agent
  • obey robots.txt (including the crawl-delay directive)
  • not be used as an "open proxy" (this requirement stems from p.20)
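
As a rough illustration of the first two requirements, here is a minimal sketch in Python using only the standard library (it is not anemone, and it is not taken from Heroku's documentation). The user-agent string, the example.com URLs, and the helper names `build_policy`, `fetch_plan`, and `make_request` are all hypothetical placeholders. Note that the standard-library parser only recognises integer crawl-delay values.

```python
from urllib.request import Request
from urllib.robotparser import RobotFileParser

# Hypothetical identity: pick a unique name and a contact URL of your own.
USER_AGENT = "example-personal-crawler/0.1 (+https://example.com/bot-info)"


def build_policy(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file that was already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def fetch_plan(parser: RobotFileParser, url: str, default_delay: float = 1.0):
    """Return (allowed, delay_seconds) for `url` under the parsed rules.

    `default_delay` is an assumed fallback for sites that set no crawl-delay.
    """
    allowed = parser.can_fetch(USER_AGENT, url)
    delay = parser.crawl_delay(USER_AGENT)
    return allowed, (float(delay) if delay is not None else default_delay)


def make_request(url: str) -> Request:
    """Build a request that identifies the crawler via its User-Agent header."""
    return Request(url, headers={"User-Agent": USER_AGENT})
```

A crawl loop would call `fetch_plan` before each request, skip URLs that are not allowed, and sleep for the returned delay between requests to the same host.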

NB! A free instance must not run for more than 18 hours a day.

berezovskyi answered Nov 02 '25 21:11
