Using Nutch parsers/analyzers in a separate application

Stephane Nicoll
Hi there,

I am working on a Spring/Hibernate application that is meant to catalog all
kinds of data. We are already using Lucene and Hibernate Search for quite
basic use cases (only a few fields are indexed, we are not using any stemmer
or custom analyzer).

I am basically investigating side projects to understand how we can improve
this. One use case we will have soon is the ability to index PDFs, Word
documents, etc. Nutch seems to do this pretty well, but I am a bit confused. Is
there a way to reuse the indexing capability of Nutch outside of the
crawling framework? Is there a way to integrate its search capabilities into
an existing Spring/Hibernate (Search) application?

Any input is appreciated.


"Large Systems Suck: This rule is 100% transitive. If you build one, you
suck" -- S. Yegge