Nutch & Cluster

Nutch & Cluster

Francesc Bruguera
Hi,
I want to run Nutch in a cluster of machines. What software do I have to use?

How can I do that?




RE: Nutch & Cluster

Webmaster-271
Hi,

Follow the Nutch-Hadoop tutorials on the Nutch website. First make sure you know how to install Java properly, and then try to get Hadoop running on one machine.

If you are successful, install it on a second machine as per the tutorial and start the Hadoop cluster. From there it is very easy to repeat the process for each additional node.

Axel..
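
To make the "repeat the process for each additional node" step a little more concrete: in Hadoop releases of that era, the start scripts read the worker hostnames from a conf/slaves file, one hostname per line. The following is only a minimal Python sketch of keeping that list tidy as nodes are added. The function name render_slaves and the example.com hostnames are hypothetical and are not taken from the tutorials mentioned above.

```python
def render_slaves(hosts):
    """Return conf/slaves-style text: one unique hostname per line, input order kept."""
    seen = set()
    lines = []
    for host in hosts:
        name = host.strip()
        # Skip blank entries and hostnames that were already listed.
        if name and name not in seen:
            seen.add(name)
            lines.append(name)
    return "".join(name + "\n" for name in lines)


if __name__ == "__main__":
    # Hypothetical worker hostnames; replace with your own machines.
    workers = ["node1.example.com", "node2.example.com"]
    workers.append("node3.example.com")  # adding one more node to the cluster
    print(render_slaves(workers), end="")
```

The output would be written to conf/slaves on the master; each listed machine still needs Java and the same Hadoop installation, as described above.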

-----Original Message-----
From: Francesc Bruguera [mailto:[hidden email]]
Sent: Sunday, October 26, 2008 10:39 AM
To: [hidden email]
Subject: Nutch & Cluster

Hi,
I want to run Nutch in a cluster of machines. What software do I have to use?

How can I do that?


Re: Nutch & Cluster

brainstorm-2-2
In reply to this post by Francesc Bruguera
Hi Francesc,

There are many tutorials out there; this is the one that was most useful to me:

http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29

I use rocksclusters.org as the infrastructure.

If you happen to run Solaris, you can try out this one:

http://blogs.sun.com/george/entry/creating_a_virtual_hadoop_cluster

And if you want support in Catalan, I can give you a hand ;)

Regards,
Roman



Re: Nutch & Cluster

Francesc Bruguera
In reply to this post by Francesc Bruguera
My problem is that I have several servers: one in Spain, another in Japan, and two in the USA (in different data centers).

Can I use the clustering function?





________________________________
From: brainstorm <[hidden email]>
To: [hidden email]
CC: [hidden email]
Sent: Monday, October 27, 2008 11:25:44
Subject: Re: Nutch & Cluster

Hi Francesc,

There are many tutorials out there; this is the one that was most useful to me:

http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29

I use rocksclusters.org as the infrastructure.

If you happen to run Solaris, you can try out this one:

http://blogs.sun.com/george/entry/creating_a_virtual_hadoop_cluster

And if you want support in Catalan, I can give you a hand ;)

Regards,
Roman


Re: Nutch & Cluster

Francesc Bruguera
In reply to this post by Francesc Bruguera
----- Forwarded message -----

From: Francesc Bruguera <[hidden email]>
To: [hidden email]
Sent: Monday, October 27, 2008 18:38:06
Subject: Re: Nutch & Cluster


My problem is that I have several servers: one in Spain, another in Japan, and two in the USA (in different data centers).

Can I use the clustering function?


Nutch merging options?

Alex Basa
Hi,

Does anyone have any tips on how to speed up merges? Also, is there a way to do incremental merges? If I have 10 indexes, is the fastest way to merge them into one big index just to use mergecrawls.sh and merge them one at a time?

Thanks in advance,

Alex
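
One way to reason about the "one at a time" question is a toy cost model. The Python sketch below assumes, purely for illustration, that every merge rewrites all documents in its output index. That is an assumption, not a statement about how mergecrawls.sh or Lucene actually behave, and the function names and index sizes are hypothetical. Under that assumption, merging one index at a time rewrites the growing index again on every step, while a single pass over all inputs writes each document once.

```python
def sequential_merge_cost(sizes):
    """Documents written when indexes are merged one at a time into a growing index.

    Assumption (toy model): each merge rewrites the whole accumulated index.
    """
    total_written = 0
    merged = 0
    for position, size in enumerate(sizes):
        merged += size
        if position > 0:
            # Merging index number `position` rewrites everything merged so far.
            total_written += merged
    return total_written


def single_pass_merge_cost(sizes):
    """Documents written when all indexes are merged in one pass (toy model)."""
    return sum(sizes)


if __name__ == "__main__":
    sizes = [1000] * 10  # ten hypothetical indexes of 1000 documents each
    print("one at a time:", sequential_merge_cost(sizes))
    print("single pass:  ", single_pass_merge_cost(sizes))
```

With these made-up sizes the model gives 54000 documents written for the one-at-a-time approach against 10000 for a single pass. Real timings depend on the Nutch and Lucene versions, disk speed, and merge settings, so this is only a way to frame the question.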


     

Re: Nutch merging options?

Jianheng Qiu
Hi Alex:

As far as I know, if you delete the segment subfolders each time after
indexing, the index should be incremental.




On Mon, Nov 3, 2008 at 11:58 PM, Alex Basa <[hidden email]> wrote:

> Hi,
>
> Does anyone have any tips on how to speed up merges?  Also, is there a way
> to do incremental merges?  If I have 10 indexes, is the fastest way to merge
> them into one big index just using mergecrawls.sh and merge one at a time?
>
> Thanks in advance,
>
> Alex
>
>
>
>


--
Best Regards,
Jianheng Qiu
[hidden email]