> I am working with nutch-0.8.1 and I am trying to configure Hadoop, but my
> questions are:
> - the bin directory contains the files:
> hadoop, hadoop-daemon, hadoop-daemons, nutch, rcc, slaves, start-all,
> start-dfs, start-mapred, stop-all, stop-dfs, stop-mapred
> are these files all that is necessary to run Nutch with Hadoop?
> I am following
> http://wiki.apache.org/nutch/NutchHadoopTutorial
> or do I have to download Hadoop and install it separately?
Everything you need to run Hadoop with Nutch is in the
Nutch download, at least it is with Nutch 0.9. The
items you list above in the bin directory are the
same ones I used to get Hadoop going.
Make sure you follow all the directions in the tutorial.
There are also several other tutorials that say basically the
same thing, so the instructions are good.
Make sure you understand your configuration files and
what you are setting.
You need to add your public key to .ssh/authorized_keys on the master as
well as on the slave. Also, make sure that this file is not writable by
anyone but you.
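A minimal sketch of that key setup (the key filename and the nutch@slave address are assumptions, not taken from the thread; adjust for your key type and hosts):

```shell
# Append the local public key to authorized_keys on the remote node
# (key filename and remote host are assumed):
cat ~/.ssh/id_rsa.pub | ssh nutch@slave 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'

# On each node, sshd will silently refuse key auth if the permissions
# are too open, so tighten them:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```

Repeat in both directions (master to slave and slave to master) so start-all.sh can reach every node without prompting.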
On Thursday 07 February 2008, payo wrote:
> I created my SSH keys and I can log in over SSH without being prompted for a
> password on the slave node,
> but when I execute this on the master node,
> it shows me this:
> [user@emcvaalkm01 search]# ./bin/start-all.sh
> starting namenode, logging to
> user@localhost's password:
> What is the problem?
I can ssh from master to slave and from slave to master without a password, but when I execute start-all.sh:
[nutch@emcvaalkm01 search]$ ./bin/start-all.sh
namenode running as process 23240. Stop it first.
localhost: starting datanode, logging to /nutch-0.8.1/search/logs/hadoop-nutch-datanode-emcvaalkm01.estafeta.com.out
jobtracker running as process 23277. Stop it first.
localhost: starting tasktracker, logging to /nutch-0.8.1/search/logs/hadoop-nutch-tasktracker-emcvaalkm01.estafeta.com.out
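The "Stop it first" lines mean a namenode and jobtracker from an earlier run are still alive. One way to get a clean restart (a sketch; the thread does not confirm this resolves the crawl error) is:

```shell
# Stop all running Hadoop daemons on master and slaves,
# then bring the whole stack back up from a known state.
./bin/stop-all.sh
./bin/start-all.sh
```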
but now it shows me this:
[nutch@emcvaalkm01 search]$ ./bin/nutch crawl urls -dir crawled -depth 3
crawl started in: crawled
rootUrlDir = urls
threads = 10
depth = 3
Injector: crawlDb: crawled/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Exception in thread "main" org.apache.hadoop.ipc.RemoteException: java.io.IOException: failed to create file /user/nutch/$/nutch-0.8.1/filesystem/mapreduce/system/submit_byrgr6/.job.jar.crc on client emcvaalkm01.estafeta.com because target-length is 0, below MIN_REPLICATION (1)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
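The "below MIN_REPLICATION (1)" error usually means the namenode could not place the file on even one datanode. A quick check (a sketch, assuming the nutch-0.8.1 layout used above) is to ask the namenode how many datanodes have actually registered:

```shell
# Report DFS capacity and the list of live datanodes;
# zero datanodes would explain the replication failure.
./bin/hadoop dfsadmin -report
```

If no datanodes appear, look at the datanode log referenced in the start-all.sh output for the reason it failed to join.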