Updating an index??

Updating an index??

chaiguy1337
Hi all. I'm new to Lucene, reading Lucene in Action, and using Lucene.NET, but my question is not platform specific.

I'm baffled by the "create" parameter of the IndexWriter/IndexModifier constructor. It seems the only two options are overwrite and fail. I would like to append to the index each time I open the IndexWriter, creating it first if it does not exist yet. In other words, the first time the user runs my program the index is clearly not going to exist, but on every subsequent call I want to append to the index, not overwrite it!

It seems to me the only possible way this design could work is if it were also coupled with some way to determine if the index already exists.

Am I totally missing something? Is the append option even supported? Perhaps I'm expected to create a new index each time, for every single document (my documents are indexed one at a time because they are indexed as soon as they are created) and then merge them into the main index? That seems silly when a single operation could take care of everything.

Some light shed on this would be appreciated.
A. Logan Murray
http://pihole.org/

Re: Updating an index??

Michael McCandless-2

The latest versions of Lucene (Java) have a constructor for IndexWriter that does not take a boolean create argument. It simply opens the index for append if it is already present, and otherwise creates it.

I don't remember exactly which version this was added in, but it was a good while ago.

Mike
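
As a rough analogy for the three behaviours discussed here (overwrite, fail if missing, and create-or-append), the sketch below uses plain java.nio file open options rather than Lucene itself. The class and method names are hypothetical, and a text file merely stands in for an index. In later Lucene Java releases the same choice is expressed through IndexWriterConfig.OpenMode (CREATE, APPEND, CREATE_OR_APPEND).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Analogy only: a plain text file stands in for an index. No Lucene involved.
final class OpenModeAnalogy {

    // Comparable to create=true: start from scratch, discarding existing content.
    static void overwrite(Path file, String line) throws IOException {
        Files.write(file, List.of(line), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE);
    }

    // Comparable to create=false: append, but fail if nothing exists yet.
    static void appendOnly(Path file, String line) throws IOException {
        Files.write(file, List.of(line), StandardCharsets.UTF_8,
                StandardOpenOption.APPEND);
    }

    // The behaviour asked for in the thread: create if missing, otherwise append.
    static void createOrAppend(Path file, String line) throws IOException {
        Files.write(file, List.of(line), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE,
                StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempDirectory("open-mode-analogy").resolve("notes.txt");

        try {
            appendOnly(file, "never written");
        } catch (NoSuchFileException e) {
            System.out.println("append-only failed: nothing to append to yet");
        }

        createOrAppend(file, "first run");   // creates the file
        createOrAppend(file, "second run");  // appends to it
        System.out.println(Files.readAllLines(file, StandardCharsets.UTF_8));

        overwrite(file, "fresh start");      // discards both earlier lines
        System.out.println(Files.readAllLines(file, StandardCharsets.UTF_8));
    }
}
```

The point of the analogy is that "create or append" is a third mode in its own right, so the caller does not need to test for existence first.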

> --
> View this message in context: http://www.nabble.com/Updating-an-index---tp19544691p19544691.html
> Sent from the Lucene - General mailing list archive at Nabble.com.
>

Re: Updating an index??

chaiguy1337
Ah, I see. Hopefully that will make it into Lucene.NET eventually (it's using the 2.0 codebase right now), though it hasn't been updated in a while.

As a followup, I did manage to get it working alright by checking for the presence of a "segments" file in the directory. Is this reliable? Is there a recommended way to tell if there is already an index present in the directory? I also create an empty index at startup if the segments file is not found, so that my method that creates an IndexWriter normally for updating can pass 'false' for create, and "segments" seems to be the only file created when there are no documents in the index.

Logan

A. Logan Murray
http://pihole.org/

Re: Updating an index??

Michael McCandless-2

It's better to use the static IndexReader.indexExists(Directory) method, since the file "segments" has actually changed to "segments_N" in recent Lucene releases. In general the specific naming of files in Lucene's index directory can change from release to release.

Mike
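
To make the fragility of a file-name check concrete, here is a small stdlib-only Java sketch. It is not Lucene code: the class, the method names, and the example directory listings are all hypothetical. It only shows that a test for a file named exactly "segments" stops matching once the file is called "segments_N", and that even a prefix match is still a guess about file layout. IndexReader.indexExists(Directory) remains the recommended check.

```java
import java.util.List;

// Illustration only: works on lists of file names, not on a real index.
final class SegmentsNameCheck {

    // The check described in the thread: a file named exactly "segments".
    static boolean hasExactSegmentsFile(List<String> fileNames) {
        return fileNames.contains("segments");
    }

    // A looser check that also accepts "segments_N".
    // Still an assumption about naming that a later release could break.
    static boolean hasSegmentsPrefixedFile(List<String> fileNames) {
        return fileNames.stream()
                .anyMatch(name -> name.equals("segments") || name.startsWith("segments_"));
    }

    public static void main(String[] args) {
        // Hypothetical directory listings for an older and a newer index layout.
        List<String> olderLayout = List.of("segments");
        List<String> newerLayout = List.of("segments_2", "_0.cfs");

        System.out.println("older, exact match:  " + hasExactSegmentsFile(olderLayout));
        System.out.println("newer, exact match:  " + hasExactSegmentsFile(newerLayout));
        System.out.println("newer, prefix match: " + hasSegmentsPrefixedFile(newerLayout));
    }
}
```

The exact-match check reports no index for the newer listing even though one is present, which is the failure mode the library method avoids.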
