Nutch API

Nutch API

Daniele Menozzi
Hi all, I'm interested in the Nutch project, and it seems pretty good, but there
are a few things I haven't understood:

- Can Nutch's functions (crawling, indexing, etc.) be called from an external
  program written in Java, or do I always have to use a shell script?
  If so, where can I find information on the APIs?
- How many pages can Nutch actually manage? Is there a limit?
- Nutch === Lucene + crawling?
- Why don't you use a database like MySQL to store the data?

Thank you so much :))
        Menoz

--
                      Free Software Enthusiast
                 Debian Powered Linux User #332564
                     http://menoz.homelinux.org

Re: Nutch API

Fredrik Andersson-2-2
Hello Daniele!

> Can Nutch's functions (crawling, indexing, etc.) be called from an external
> program written in Java?

Yes. Look at the bin/nutch script and you will see the entry points in the
Java classes.
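
A minimal sketch of what that could look like from your own Java code, assuming the entry points named in bin/nutch are ordinary classes with a public static main(String[]) method. NutchLauncher, runMain and EchoTool are hypothetical names made up for this illustration, not part of Nutch; EchoTool only stands in for a real tool class so the sketch runs without Nutch on the classpath. Substitute whichever class name bin/nutch lists for the command you need.

```java
import java.lang.reflect.Method;

public class NutchLauncher {

    /**
     * Looks up the named class and calls its static main(String[]),
     * which is roughly what a shell script does through the java command.
     */
    public static void runMain(String className, String... args) throws Exception {
        Class<?> cls = Class.forName(className);
        Method main = cls.getMethod("main", String[].class);
        // Cast to Object so the array is passed as one argument, not spread as varargs.
        main.invoke(null, (Object) args);
    }

    /** Hypothetical stand-in for a tool class; it only records its arguments. */
    public static class EchoTool {
        public static String lastArgs = "";

        public static void main(String[] args) {
            lastArgs = String.join(" ", args);
        }
    }

    public static void main(String[] args) throws Exception {
        // Replace the class name with the one bin/nutch uses for your command.
        runMain(EchoTool.class.getName(), "urls", "-dir", "crawl");
        System.out.println(EchoTool.lastArgs);
    }
}
```

If the class is already on your compile-time classpath you can of course call its main method directly; the reflective lookup is only useful when the class name comes from configuration.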

> If so, where can I find information on the APIs?

http://lucene.apache.org/java/docs/api/index.html
http://lucene.apache.org/nutch/apidocs/index.html

> How many pages can Nutch actually manage? Is there a limit?

Nope. The filesystem sets the limit, and it's totally pluggable.

Greets,
Fredrik


On 9/12/05, Daniele Menozzi <[hidden email]> wrote:

>
> Hi all, I'm interested in the Nutch project, and it seems pretty good, but there
> are a few things I haven't understood:
>
> - Can Nutch's functions (crawling, indexing, etc.) be called from an external
> program written in Java, or do I always have to use a shell script?
> If so, where can I find information on the APIs?
> - How many pages can Nutch actually manage? Is there a limit?
> - Nutch === Lucene + crawling?
> - Why don't you use a database like MySQL to store the data?
>
> Thank you so much :))
> Menoz
>
> --
> Free Software Enthusiast
> Debian Powered Linux User #332564
> http://menoz.homelinux.org
>

Re: Nutch API

Daniele Menozzi
On 19:29:45 12/Sep, Fredrik Andersson wrote:
> Hello Daniele!

Hi!

> Yes. Look at the bin/nutch script and you will see the entry points in the
> Java classes.

Oh, OK, thank you.

> > How many pages can Nutch actually manage? Is there a limit?
>
> Nope. The filesystem sets the limit, and it's totally pluggable.

You mean that the limit is the file size the filesystem can support?
Which FS is recommended? XFS?


Thank you so much :))
 Menoz

