 

bin/nutch inject crawl/crawldb urls not working

Tags: apache, nutch

I just followed the tutorial on the NutchWiki to set up Nutch.

I downloaded the Nutch 2.x source and set up all the configuration. The problem occurs as soon as I start crawling. When I run this command:

bin/nutch inject crawl/crawldb urls

I get this error message:

Unrecognized arg urls

I followed all the steps in the tutorial: I created the directories, made the changes to the configuration files, and so on.

I also have a question: there is no crawldb directory in apache-nutch-2.x/runtime/local/. Is it generated automatically, or do I need to create it manually?

Any help with this problem will be appreciated.

Abhishek Ramachandran asked Jan 23 '26 20:01



1 Answer

I ran into the same problem. The documentation seems to be outdated: it is written for 1.x.

For 2.x, I tried the following and it worked for me:

bin/nutch inject urls
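
For completeness, here is a minimal sketch of the whole step, assuming you run it from apache-nutch-2.x/runtime/local. The seed file name (seed.txt) and the example URL are placeholders of mine, not something from the question. As far as I know, 2.x keeps its crawl data in the storage backend configured through Apache Gora (HBase, for example) rather than in a crawl/crawldb directory, which would explain why no crawldb directory shows up and why the command takes only the seed directory.

```shell
#!/bin/sh
# Sketch only: assumes the current directory is apache-nutch-2.x/runtime/local.
# The file name seed.txt and the URL below are placeholders, not from the question.
set -eu

SEED_DIR=urls    # directory handed to the inject command

# Create the seed directory with one URL per line in a plain text file.
mkdir -p "$SEED_DIR"
printf '%s\n' 'https://example.org/' > "$SEED_DIR/seed.txt"

# Nutch 2.x form from this answer: pass only the seed directory, no crawldb path.
if [ -x bin/nutch ]; then
    bin/nutch inject "$SEED_DIR"
else
    echo "bin/nutch not found; run this from the Nutch runtime/local directory" >&2
fi
```

The check for bin/nutch is only there so the script fails softly if it is started from the wrong directory.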

Hope it helps.

Mohammed Hashim answered Jan 25 '26 12:01
