plug-ins

plug-ins

discoversk
Hello,

I want to know about the plugins available with Nutch.

1. Architecture of the parser plugins (is there any documentation?)
2. How to reuse the plugins in other Java applications
3. Steps to write a new plugin

Thanks

Re: plug-ins

Guillermo Garrido
Hi,

Just refer to http://wiki.apache.org/nutch/PluginCentral

Regards,

guille

--
Guillermo Garrido Yuste

Dpto. Lenguajes y Sistemas Informáticos
E.T.S.I. Informática, UNED




Re: plug-ins

discoversk
Thanks Guillermo,

I had already checked the PluginCentral wiki, but I found some differences (I may be wrong here). For example, the HTML parser plugin gives me text data when I run it from the command line with bin/nutch plugin, whereas the PDF parser gives me an error: "main()" not found in class ...

After looking into the source, I see that all the plugins use Nutch classes such as Content, ParseData, Outlink, etc.
Is there any simple way to reuse Nutch plugins with minimal effort? I ask because I am not a strong Java developer.


Re: plug-ins

Guillermo Garrido
Hi,

I think your question would be better directed to the users mailing list: [hidden email].

Anyway, the plugin architecture is designed to work inside the Nutch search engine. The parser plugins convert different file formats into text so that the text can be indexed in a Lucene index.
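
To make that idea concrete, here is a minimal sketch of "one parser per file format, all producing plain text". It is not Nutch's actual API: every name in it (TextParser, ParserRegistry, NaiveHtmlParser and so on) is hypothetical, and the regex-based HTML stripping is only a stand-in for a real parser.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from Nutch itself.
public class ParserPluginSketch {

    /** Stand-in for a parser extension point: raw bytes in, plain text out. */
    public interface TextParser {
        String parse(byte[] raw);
    }

    /** Decodes the bytes as UTF-8 and returns the text unchanged (trimmed). */
    public static class PlainTextParser implements TextParser {
        @Override
        public String parse(byte[] raw) {
            return new String(raw, StandardCharsets.UTF_8).trim();
        }
    }

    /** Naive HTML parser that strips tags with a regex. Illustration only. */
    public static class NaiveHtmlParser implements TextParser {
        @Override
        public String parse(byte[] raw) {
            String html = new String(raw, StandardCharsets.UTF_8);
            // Replace each tag with a space, then collapse runs of whitespace.
            return html.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
        }
    }

    /** Chooses a parser by content type, roughly as a plugin system might. */
    public static class ParserRegistry {
        private final Map<String, TextParser> parsers = new HashMap<>();

        public void register(String contentType, TextParser parser) {
            parsers.put(contentType, parser);
        }

        public String extractText(String contentType, byte[] raw) {
            TextParser parser = parsers.get(contentType);
            if (parser == null) {
                throw new IllegalArgumentException("No parser for " + contentType);
            }
            return parser.parse(raw);
        }
    }

    /** Builds a registry with the two example parsers above. */
    public static ParserRegistry defaultRegistry() {
        ParserRegistry registry = new ParserRegistry();
        registry.register("text/plain", new PlainTextParser());
        registry.register("text/html", new NaiveHtmlParser());
        return registry;
    }

    public static void main(String[] args) {
        ParserRegistry registry = defaultRegistry();
        byte[] page = "<html><body><h1>Hello</h1><p>Nutch plugins</p></body></html>"
                .getBytes(StandardCharsets.UTF_8);
        // The extracted text is what would be handed on to the indexer.
        System.out.println(registry.extractText("text/html", page));
    }
}
```

In Nutch the real parser plugins return richer objects built from Nutch's own classes (the Content, ParseData and similar classes you mentioned), which is why they are awkward to call from an unrelated application.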

I'm not sure I understand your needs correctly, but you might be looking in the wrong place. If you are interested in parsing libraries rather than in searching, you may want to check out Tika at http://lucene.apache.org/tika/

Regards,

-- 
Guillermo Garrido Yuste

Dpto. Lenguajes y Sistemas Informáticos
E.T.S.I. Informática, UNED

