Indexes auto creation

Stephane Bailliez
I have a very stupid question about something in the API that has puzzled
me so far. (I'm using Lucene 1.4.3)

There is a boolean flag on the creation of the Directory which basically
means: use it as is, or delete the storage area.

The same goes for the index: the IndexWriter uses a flag meaning 'use the
existing index or create a new one'.

If you create an IndexWriter with 'create' set to false, it can blow up
with an IOException because the index does not exist. But it can also blow
up with an IOException for other reasons, which does not help much in
identifying the source of the problem.


What I would like is something like: if the index does not exist, then
create one for me, otherwise use it.

I could do that with something like:

try {
    writer = new IndexWriter(directory, analyzer, false);
} catch (IOException e) {
    writer = new IndexWriter(directory, analyzer, true);
}

but this is not exactly right: I could delete an existing index if the
IOException is not due to a missing index.

Apparently a way to check there is an existing index would be (based on
the Lucene source code) to do something like:

try {
    writer = new IndexWriter(directory, analyzer, false);
} catch (IOException e) {
    if (!directory.fileExists(IndexFileNames.SEGMENTS)) {
        // the index really does not exist, so create it
        writer = new IndexWriter(directory, analyzer, true);
    } else {
        throw e;
    }
}

Is this correct, or is there something even simpler that I'm missing?

Ideally I would have liked a subclassed IOException on the IndexWriter
to differentiate the cases (like FileNotFoundException for example) but
maybe I'm missing some trivial thing.
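
The wish for a dedicated exception subclass can be illustrated outside
Lucene. What follows is only a minimal sketch built on java.nio.file, not on
Lucene's API: the class OpenOrCreate, the method openOrCreate and the
single-file "store" are hypothetical stand-ins. It shows how catching
NoSuchFileException (a subclass of IOException) on its own lets the "does
not exist" case trigger creation, while every other I/O failure propagates
to the caller untouched.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical stand-in for an index: a single file on disk (not Lucene).
public class OpenOrCreate {

    /** Reads the store if present; creates an empty one only when it is missing. */
    static String openOrCreate(Path store) throws IOException {
        try {
            return new String(Files.readAllBytes(store), StandardCharsets.UTF_8);
        } catch (NoSuchFileException e) {
            // Only the "missing" case lands here; any other IOException propagates.
            Files.write(store, new byte[0]);
            return "";
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("store-demo");
        Path store = dir.resolve("store.txt");
        System.out.println("first open:  [" + openOrCreate(store) + "]");
        Files.write(store, "doc1".getBytes(StandardCharsets.UTF_8));
        System.out.println("second open: [" + openOrCreate(store) + "]");
    }
}
```

Because the handler names the specific subclass, existing data is never
overwritten as a side effect of an unrelated I/O error.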


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


RE: Indexes auto creation

Pasha Bizhan-2
Hi,

> From: news [mailto:[hidden email]] On Behalf Of Stephane Bailliez
>
> What I would like to is something like: if the index does not
> exist, then create one for me, otherwise use it.

Look at the IndexReader.indexExists method.

Your code would look like this:

boolean createIndex = !IndexReader.indexExists(directory);
writer = new IndexWriter(directory, analyzer, createIndex);

Pasha Bizhan
http://lucenedotnet.com 
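
The same check-then-open idea can be written out as a self-contained sketch.
It deliberately does not use Lucene: the class TinyIndex, its methods and
the marker file name "segments" are hypothetical stand-ins that only mimic
the 'create' flag semantics discussed in this thread (true wipes and
recreates, false requires an existing index). The point is that the
existence check happens up front, so no exception handler has to guess why
opening failed.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical model of an index directory; this is not Lucene's API.
public class TinyIndex {
    static final String MARKER = "segments"; // assumed marker file name

    /** Analogue of an index-exists check: is the marker file present? */
    static boolean indexExists(Path dir) {
        return Files.isRegularFile(dir.resolve(MARKER));
    }

    /** Opens the index; create=true discards whatever is already there. */
    static void open(Path dir, boolean create) throws IOException {
        if (create) {
            Files.createDirectories(dir);
            try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
                for (Path f : files) {
                    Files.delete(f); // existing content is deleted
                }
            }
            Files.createFile(dir.resolve(MARKER));
        } else if (!indexExists(dir)) {
            throw new IOException("no index in " + dir);
        }
    }

    /** Creates the index only when none exists; returns true if it was created. */
    static boolean openOrCreate(Path dir) throws IOException {
        boolean create = !indexExists(dir);
        open(dir, create);
        return create;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tiny-index");
        System.out.println("created on first open:  " + openOrCreate(dir));
        Files.createFile(dir.resolve("doc1"));
        System.out.println("created on second open: " + openOrCreate(dir));
        System.out.println("doc1 still present:     " + Files.exists(dir.resolve("doc1")));
    }
}
```

The check and the open are two separate steps, so this sketch assumes that
a single process manages the directory.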




RE: Indexes auto creation

Kadlabalu, Hareesh
In reply to this post by Stephane Bailliez
I ran into a related problem: when I create an IndexWriter on an
FSDirectory that was created with create=true, an existing index somehow
gets corrupted (Luke comes back with a message saying that the index is
corrupt). At that stage IndexWriter reports 0 documents, even though the
index held several documents before this IndexWriter instance was created.
Interestingly, the sizes of the index files remain the same. It seems that
creating an IndexWriter on an FSDirectory with create=true over an existing
index somehow corrupts it.

I figured there must be a bug here (I am using 1.4.3). Has anyone run into
this? I had to fall back on an inelegant solution: first check whether the
directory exists and, if not, create a dummy index and close it. After
that, always create the IndexWriter with create=false.

Thanks
-Hareesh




-----Original Message-----
From: Stephane Bailliez [mailto:[hidden email]]
Sent: Monday, June 13, 2005 11:48 AM
To: [hidden email]
Subject: Indexes auto creation


Re: Indexes auto creation

Luke Francl
In reply to this post by Stephane Bailliez
You may want to try using IndexReader's indexExists family of methods.
They will tell you whether or not an index is there.

http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexReader.html#indexExists(org.apache.lucene.store.Directory)




Re: Indexes auto creation

Stephane Bailliez
In reply to this post by Stephane Bailliez
Stephane Bailliez wrote:
[...]
> try {
>     writer = new IndexWriter(directory, analyzer, false);
> } catch (IOException e) {
>     writer = new IndexWriter(directory, analyzer, true);
> }

On a related note, the code above does not work when the index does not
exist, because the lock created by the first IndexWriter is still lying
around.



Re: Indexes auto creation

Volodymyr Bychkoviak
In reply to this post by Stephane Bailliez
Hello,

I'm using the following code at the startup of my program:


    String indexDirectory = ...; // some init
    try {
      if (!IndexReader.indexExists(indexDirectory)) {
        // working index doesn't exist, so create an empty (dummy) index
        IndexWriter iw = new IndexWriter(indexDirectory, new StandardAnalyzer(), true);
        iw.close();
      } else {
        // index exists: forcibly release any lock left on it
        IndexReader.unlock(FSDirectory.getDirectory(indexDirectory, false));
      }
    } catch (IOException e) {
      // exception while creating the dummy index or unlocking the working index
    }

regards,
Volodymyr Bychkoviak



Re: Indexes auto creation

Daniel Naber
In reply to this post by Kadlabalu, Hareesh
On Monday 13 June 2005 18:37, Kadlabalu, Hareesh wrote:

> I ran into a related problem; when I create an IndexWriter with a
> FSDirectory created with create=true, an existing index would somehow
> get corrupted

Well, it doesn't get corrupted, it gets deleted. That's what create=true is
supposed to do, isn't it?

Regards
 Daniel

--
http://www.danielnaber.de


Re: Indexes auto creation

Stephane Bailliez
In reply to this post by Luke Francl
Luke Francl wrote:
> You may want to try using IndexReader's indexExists family of methods.
> They will tell you whether or not an index is there.
>
> http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexReader.html#indexExists(org.apache.lucene.store.Directory)

Good grief! I missed that one.

Thanks to all who have replied.

