lucene indexing

lucene indexing

trupti mulajkar

Hi,
Can anyone suggest how to split files using Lucene?
I am trying to index the TREC collection using lucene-1.4.3.
I want Lucene to read the multiple files contained in a single TREC file and
create an index accordingly.

cheers,
trupti mulajkar
MSc Advanced Computer Science




---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: lucene indexing

Grant Ingersoll
Lucene does not provide this out of the box.  You will have to write a
program to do it and feed the results to Lucene.

If I remember right, these files are in XML, so you can probably use SAX
or a pull parser.

I think a number of TREC participants, in the past, have used Lucene, so
you may be able to find someone on the web who is generous enough to
have shared their implementation.
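
To make the "write a program to do it" step concrete, here is a minimal sketch of a splitter. It is only an illustration under stated assumptions, not a tested TREC reader: it assumes each record sits between a line containing only <DOC> and a line containing only </DOC>, and that the identifier is in a <DOCNO> element. As far as I know, many TREC files are SGML-style rather than well-formed XML (there is no single root element), so this sketch scans lines instead of using SAX. The names TrecSplitter, TrecDoc and split are made up for the example.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: cuts a multi-document TREC file into single records.
class TrecSplitter {

    /** One record taken from the file: its DOCNO and everything inside <DOC>...</DOC>. */
    static final class TrecDoc {
        public final String docno;
        public final String body;

        TrecDoc(String docno, String body) {
            this.docno = docno;
            this.body = body;
        }
    }

    // Assumed layout: <DOCNO> some-id </DOCNO>, possibly with surrounding whitespace.
    private static final Pattern DOCNO =
            Pattern.compile("<DOCNO>\\s*(.*?)\\s*</DOCNO>", Pattern.DOTALL);

    /** Reads the input line by line and returns one TrecDoc per <DOC> block. */
    static List<TrecDoc> split(Reader input) {
        List<TrecDoc> docs = new ArrayList<>();
        StringBuilder current = null; // non-null only while inside a <DOC> block
        try (BufferedReader reader = new BufferedReader(input)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.equals("<DOC>")) {
                    current = new StringBuilder();
                } else if (trimmed.equals("</DOC>") && current != null) {
                    String body = current.toString();
                    Matcher m = DOCNO.matcher(body);
                    // Records without a DOCNO get an empty identifier.
                    String docno = m.find() ? m.group(1) : "";
                    docs.add(new TrecDoc(docno, body));
                    current = null;
                } else if (current != null) {
                    current.append(line).append('\n');
                }
                // Lines outside any <DOC> block are ignored.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return docs;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java TrecSplitter <trec-file>");
            return;
        }
        for (TrecDoc doc : split(new FileReader(args[0]))) {
            System.out.println(doc.docno + " (" + doc.body.length() + " chars)");
        }
    }
}
```

Each TrecDoc would then become one Lucene Document. In the 1.4 API that probably means something like Field.Keyword for the docno and Field.Text for the body, added through an IndexWriter, but the field choices depend on what you want to search and store. For a full collection you would likely hand each record to the indexer as soon as it is read, rather than collecting them all in a list first.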

--

Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
335 Hinds Hall
Syracuse, NY 13244

http://www.cnlp.org 
Voice:  315-443-5484
Fax: 315-443-6886

