Do nutch help me?


Arun Kumar Sharma
Hi all,

I want to know how Nutch fits my requirements and how I can best exploit its features.

Requirement:

Nutch is designed to crawl information systems on the Internet and on intranets. My requirement is to crawl information wherever it is present. Is Nutch suitable for me? What do I need to extend or update so that it fits my requirement?

Awaiting your response.

Thanks in advance for your early response.




Regards,
 
Arun Kumar Sharma (Tech Lead -Java/J2EE)
Mob: +91.981.529.5761




               

Re: Do nutch help me?

Stefan Groschupf-2
Yes, Nutch can crawl web pages, and you can limit the crawler
to a set of hosts.
Just try the intranet crawl tutorial to get an idea.
Stefan
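
To make "limit the crawler to a set of hosts" concrete, here is a minimal Python sketch of the idea. It is not Nutch code and does not use any Nutch API; the host names and the `in_scope` helper are hypothetical and only illustrate the check a host-restricted crawl performs on each candidate URL.

```python
from urllib.parse import urlparse

# Hypothetical allow-list; replace with the hosts you actually want crawled.
ALLOWED_HOSTS = {"intranet.example.com", "docs.example.com"}


def in_scope(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True if the URL's host is one of the allowed hosts."""
    # urlparse lowercases the hostname and strips any port number.
    host = urlparse(url).hostname
    return host is not None and host in allowed_hosts


if __name__ == "__main__":
    for candidate in (
        "http://intranet.example.com/wiki/Home",
        "http://INTRANET.example.com:8080/wiki/Home",
        "http://other.example.org/",
    ):
        print(candidate, in_scope(candidate))
```

A crawler applying this kind of check simply drops every discovered link whose host is not on the list, which keeps the crawl from wandering off the chosen sites.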
On 10.11.2005 at 10:04, Arun Kumar Sharma wrote:

> Hi all,
>
> I want to know how Nutch fits my requirements and how I can best
> exploit its features.
>
> Requirement:
>
> Nutch is designed to crawl information systems on the Internet
> and on intranets. My requirement is to crawl information wherever
> it is present. Is Nutch suitable for me? What do I need to extend
> or update so that it fits my requirement?


Re: Do nutch help me?

Arun Sharma-3
Hi,
I want to crawl local files as well as Internet and intranet documents and files. Do you think
Nutch can help me in this case?
Do I need to add or extend any Nutch functionality?

 On 11/10/05, Stefan Groschupf <[hidden email]> wrote:

>
> Yes, Nutch can crawl web pages, and you can limit the crawler
> to a set of hosts.
> Just try the intranet crawl tutorial to get an idea.
> Stefan

Re: Do nutch help me?

Paul E. Baclace
Arun Kaundal wrote:
> Hi,
> I want to crawl local files as well as Internet and intranet documents and files. Do you think
> Nutch can help me in this case?

Although the tutorial describes these separately,
conf/crawl-urlfilter.txt can allow any combination of
Internet, Intranet, and local filesystem crawling.
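
As a rough illustration of how one filter file can combine the three kinds of crawl, the Python sketch below imitates the rule style used by conf/crawl-urlfilter.txt: each rule is a `+` (include) or `-` (exclude) followed by a regular expression, and the first rule that matches a URL decides. This is only a simulation for reasoning about rules, not Nutch's own code. The specific patterns, the path `/data/docs/`, and the host names are assumptions made up for the example, and the unanchored matching is an assumption about how the real filter behaves.

```python
import re

# Hypothetical rules written in the "+regex" / "-regex" style.
# All paths and host names below are placeholders for illustration.
RULES_TEXT = r"""
# skip mail links
-^mailto:
# local filesystem directory (assumed path)
+^file:///data/docs/
# one intranet host (assumed name)
+^http://intranet\.example\.com/
# one Internet domain and its subdomains (assumed name)
+^http://([a-z0-9-]+\.)*example\.org/
# ignore everything else
-.
"""


def parse_rules(text):
    """Turn rule text into a list of (include, compiled_regex) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with + or -: " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """First matching rule decides; a URL that matches no rule is ignored."""
    for include, pattern in rules:
        if pattern.search(url):
            return include
    return False


if __name__ == "__main__":
    rules = parse_rules(RULES_TEXT)
    for url in (
        "file:///data/docs/report.pdf",
        "http://intranet.example.com/wiki",
        "http://www.example.org/index.html",
        "http://unrelated.example.net/",
    ):
        print(url, accepts(rules, url))
```

Because the first match wins, the order of the rules matters: the catch-all exclude has to come last. Depending on the Nutch release, crawling `file:` URLs may also require the file protocol plugin to be enabled in the configuration, so check the settings shipped with your version.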

Re: Do nutch help me?

Arun Sharma-3
Could you provide a link to that tutorial? How can I modify the
conf/crawl-urlfilter.txt file to allow local filesystem crawling?


 On 11/11/05, Paul Baclace <[hidden email]> wrote:

>
> Arun Kaundal wrote:
> > Hi,
> > I want to crawl local files as well as Internet and intranet documents and files. Do you think
> > Nutch can help me in this case?
>
> Although the tutorial describes these separately,
> conf/crawl-urlfilter.txt can allow any combination of
> Internet, Intranet, and local filesystem crawling.
>