LIUS/Fulltext indexing

Vish D.
Does anyone have experience working with LIUS
(http://sourceforge.net/projects/lius/)? I can't seem to find any real
documentation on it, even though it seems 'active' @ sourceforge. I need a
way to index various types of fulltext, and LIUS seems very promising at
first glance. What do you guys think? Is there a similar implementation you
recommend, even something that just provides simple text extraction
for the various types? I figure I would need to do that
anyway, and then massage the text into Solr-type docs.
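
For the "massage the text into Solr-type docs" step, a minimal sketch might look like the following. It assumes the text has already been extracted by some other tool, and the field names `id` and `text` are placeholders that would have to match the fields in your own Solr schema. The helper name `to_solr_add_xml` is made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Control characters that are not allowed in XML 1.0 and that often
# show up in text extracted from binary formats. This is a partial
# cleanup, not a full validity check.
_INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def to_solr_add_xml(doc_id, text, id_field="id", text_field="text"):
    """Wrap already-extracted text in a Solr XML <add> message.

    id_field and text_field are assumptions; use whatever field
    names your schema defines.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in ((id_field, doc_id), (text_field, text)):
        field = ET.SubElement(doc, "field", {"name": name})
        # Drop characters that would make the XML invalid.
        field.text = _INVALID_XML_CHARS.sub("", value)
    # ElementTree escapes &, < and > in the text for us.
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(to_solr_add_xml("report-1", "Quarterly results: revenue < costs & rising"))
```

The resulting string is what would be sent to Solr's update handler, followed by a commit.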

Vish

Re: LIUS/Fulltext indexing

Yonik Seeley-2
On 6/11/07, Vish D. <[hidden email]> wrote:
> Anyone have experience working with LIUS (
> http://sourceforge.net/projects/lius/)? I can't seem to find any real
> documentation on it, even though it seems 'active' @ sourceforge. I need a
> way to index various types of fulltext, and LIUS seems very promising at
> first glance. What do you guys think? Is there a similar implementation you
> recommend, even something that might provide the simple text extraction
> functionality for the various types? I figure, I would need to do that
> anyways, and massage the text into Solr-type docs.

I think Tika will be the way forward (some of the code for Tika is
coming from LIUS)

-Yonik

Re: LIUS/Fulltext indexing

Bertrand Delacretaz
On 6/12/07, Yonik Seeley <[hidden email]> wrote:

>... I think Tika will be the way forward (some of the code for Tika is
> coming from LIUS)...

Work has indeed started to incorporate the LIUS code into Tika, see
https://issues.apache.org/jira/browse/TIKA-7 and
http://incubator.apache.org/projects/tika.html

-Bertrand

Re: LIUS/Fulltext indexing

Vish D.
Sounds interesting. I can't seem to find any clear dates on the project
website. Do you know of a planned V1 shipping date?

Thanks!

Re: LIUS/Fulltext indexing

Bertrand Delacretaz
On 6/12/07, Vish D. <[hidden email]> wrote:
> ...Sounds interesting. I can't seem to find any clear dates on the project
> website. Do you know? ...V1 shipping date?...

Not at the moment. Tika has just entered incubation, and it's impossible to
predict what will happen.

But help is welcome, of course ;-)

-Bertrand

Re: LIUS/Fulltext indexing

Vish D.
Wonder if TOM could be useful to integrate?
http://tom.library.upenn.edu/convert/sofar.html


Re: LIUS/Fulltext indexing

Bertrand Delacretaz
On 6/13/07, Vish D. <[hidden email]> wrote:
> ...Wonder if TOM could be useful to integrate?
> http://tom.library.upenn.edu/convert/sofar.html...

It might be interesting. As I understand it, the goal of Tika is
mostly to be a framework for plugging in various types of analyzers,
so plugging in almost any converter should hopefully be possible.

Anyway, people interested in discussing this are welcome at
[hidden email] (which is very quiet ATM, just
starting), or see http://incubator.apache.org/projects/tika.html

-Bertrand