Regarding Nutch Hadoop Cluster Setup in Deploy Mode

Regarding Nutch Hadoop Cluster Setup in Deploy Mode

Dimanshu Parihar


Hello Sir,
I have been using Nutch 1.17 in local mode and now want to move to deploy mode. I tried following the Apache Nutch Hadoop cluster setup guide, but I am stuck at the step quoted below.

Problem:

First copy the files from the nutch build to the deploy directory using something like the following command:

cp -R /path/to/build/* /nutch/search

Then make sure that all of the shell scripts are in unix format and are executable.

dos2unix /nutch/search/bin/*.sh /nutch/search/bin/hadoop /nutch/search/bin/nutch

chmod 700 /nutch/search/bin/*.sh /nutch/search/bin/hadoop /nutch/search/bin/nutch

dos2unix /nutch/search/config/*.sh

chmod 700 /nutch/search/config/*.sh
Issue:
I ran the ant command in the Nutch folder, which created a runtime folder and a build folder. I copied the contents of build/ into a search folder that I created inside the Nutch folder. But when I run the dos2unix commands, they report that bin/hadoop and bin/nutch are not found, which makes sense because my build folder does not contain these files.
Could you please clarify how I should follow these steps?
I have a single non-root user under which I am setting up all three: Hadoop, Solr and Nutch.
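
One way to see where the launcher scripts actually ended up after the build is to search the tree for bin/nutch. The sketch below assumes a POSIX shell and a Nutch source tree built with ant; the function name find_nutch_scripts is made up for illustration and is not part of Nutch.

```shell
#!/bin/sh
# List every bin/nutch script below a directory, one path per line.
# Assumption: the directory is a Nutch source tree built with ant, so the
# launchers are expected under runtime/ rather than under build/.
find_nutch_scripts() {
    # $1: root directory to search (supplied by the caller)
    find "$1" -type f -path '*/bin/nutch' 2>/dev/null | sort
}
```

Calling `find_nutch_scripts "$NUTCH_HOME"` prints each matching path, which shows directly whether build/ or runtime/ holds the scripts that the dos2unix and chmod commands expect.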

Re: Regarding Nutch Hadoop Cluster Setup in Deploy Mode

Sebastian Nagel-2
Hi,

Nutch does not include a search component anymore. These steps are obsolete.

All you need is to set up your Hadoop cluster, then run
   $NUTCH_HOME/runtime/deploy/bin/nutch ...
(instead of .../runtime/local/bin/nutch ...)

Alternatively, you could launch a Nutch tool, e.g. Injector,
the following way:

hadoop jar $NUTCH_HOME/runtime/deploy/apache-nutch-1.15-SNAPSHOT.job \
   org.apache.nutch.crawl.Injector ...

Best,
Sebastian
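
The job file name in the hadoop jar example above carries a version number, so it will differ between builds. A minimal sketch of looking the file up instead of typing it, assuming a POSIX shell and that the deploy directory holds a file matching apache-nutch-*.job as in that example; the function names find_job_file and nutch_job_command are hypothetical helpers, not part of Nutch or Hadoop.

```shell
#!/bin/sh
# Print the path of the first Nutch job file found in a deploy directory.
# Assumption: the file name matches apache-nutch-*.job.
find_job_file() {
    # $1: deploy directory, e.g. "$NUTCH_HOME/runtime/deploy"
    for f in "$1"/apache-nutch-*.job; do
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    echo "no apache-nutch-*.job file in $1" >&2
    return 1
}

# Print (do not run) the hadoop command line for one Nutch tool class.
# $1: deploy directory, $2: fully qualified tool class, rest: tool arguments.
nutch_job_command() {
    deploy_dir=$1
    tool_class=$2
    shift 2
    job_file=$(find_job_file "$deploy_dir") || return 1
    # Rebuild the positional parameters as the full command, then join them.
    set -- hadoop jar "$job_file" "$tool_class" "$@"
    printf '%s\n' "$*"
}
```

The helper only prints the command so it can be inspected first. For example, `nutch_job_command "$NUTCH_HOME/runtime/deploy" org.apache.nutch.crawl.Injector` followed by the tool's own arguments prints the matching hadoop jar line, or fails with a message if no job file is present.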


On 8/10/20 11:31 AM, Dimanshu Parihar wrote:

> [...]


RE: Regarding Nutch Hadoop Cluster Setup in Deploy Mode

Dimanshu Parihar
Thanks Sebastian,
This helps a lot. I got the point. The documentation should be changed. A lot of people get confused because of it.

On Tuesday, August 11, 2020 4:56 PM, Sebastian Nagel wrote:

> [...]


Re: Regarding Nutch Hadoop Cluster Setup in Deploy Mode

Sebastian Nagel-3
Hi Dimanshu,

Nutch is a community project. If you can, please take the time, be part of the community
and improve the documentation. Unlike for the source code, the barrier for the wiki is low:
anybody can and *is welcome* to register and update the Nutch Wiki. As a 100% volunteer project
we rely on contributions from the community including our users.

Thanks,
Sebastian

On 9/4/20 9:17 PM, Dimanshu Parihar wrote:

> [...]