Issue with re-opening IndexWriter


Issue with re-opening IndexWriter

Vitaly Stroutchkov
Hello,

We are using Lucene 6.4.2 with a file-based index, Oracle JDK 1.8.0_121, Windows 10.
We found that the following steps produce an unrecoverable error (we have to restart our JVM to resume normal work, which is not acceptable):

  1.  Create an index directory, open an IndexWriter, generate the index data and close the writer.
  2.  Perform a search on the index. This opens a DirectoryReader, which we leave open so it can be re-used.
  3.  We run a nightly job that regenerates the index because some of the data in it is time-sensitive. When the job tries to open an IndexWriter again, we get the following error:
java.lang.IllegalArgumentException: Directory MMapDirectory@C:\...\INDEXNAME lockFactory=org.apache.lucene.store.NativeFSLockFactory@10159324 still has pending deleted files; cannot initialize IndexWriter
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:795)

Our understanding of the problem is that the IndexWriter is waiting for some old files (like _*.cfs) to be deleted, but the files are still locked by the operating system because the directory reader is still open. By debugging the FSDirectory class code we found that all attempts to delete those files fail with java.nio.file.AccessDeniedException, which is not re-thrown by the code. This creates a deadlock that can only be resolved by restarting the Java VM.
We found a workaround (overriding FSDirectory.checkPendingDeletions() to always return 'false', and opening a new reader after re-indexing so that the old reader gets closed and the old files are eventually unlocked and deleted), but I think it's a serious problem and it has to be resolved by the Lucene developers.
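
To make the behaviour described above concrete, here is a simplified model of it. This is not Lucene's code: the class name PendingDeletesModel and its methods are hypothetical, and only java.nio is used. It shows how a swallowed delete failure turns into a "pending deletion" that keeps being reported until the delete finally succeeds.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of the behaviour described in this post.
// It is NOT the FSDirectory implementation.
public class PendingDeletesModel {
    // Files whose deletion failed and must be retried later.
    private final Set<Path> pending = new HashSet<>();

    // Try to delete; on failure remember the file instead of re-throwing.
    public void delete(Path file) {
        try {
            Files.deleteIfExists(file);
            pending.remove(file);
        } catch (IOException e) {
            // On Windows this would be AccessDeniedException while a reader
            // still has the file open.
            pending.add(file);
        }
    }

    // Retry every pending delete; true means some files still cannot go away.
    public boolean checkPendingDeletions() {
        for (Path p : new ArrayList<>(pending)) {
            delete(p);
        }
        return !pending.isEmpty();
    }
}
```

In this model, a caller that refuses to start while checkPendingDeletions() returns true stays blocked for as long as something keeps the file undeletable, which matches what we see with the open reader.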

Regards,
Vitaly Stroutchkov
Senior software developer at Kloudville Inc.

Re: Issue with re-opening IndexWriter

Michael McCandless-2
Hi,

This is a known Windows-specific issue when you close IndexWriter while
IndexReaders are still open, because Windows prevents deletion of
still-open-for-reading files.

Can you close all open IndexReaders before closing the first IndexWriter?
This way IndexWriter will be able to delete the files.
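
A minimal sketch of that ordering, assuming the Lucene 6.x core jar is on the classpath. The class name CloseReadersFirst, the "id" field and the rebuild method are made up for illustration; the point is only that the shared reader is closed before the writer opens, rewrites and closes.

```java
import java.io.IOException;
import java.nio.file.Path;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Hypothetical holder for one index directory and one shared search reader.
public class CloseReadersFirst {
    private final Directory dir;
    private DirectoryReader reader; // null until the first rebuild

    public CloseReadersFirst(Path indexPath) throws IOException {
        this.dir = FSDirectory.open(indexPath);
    }

    // Nightly job: release the reader first so no index file is held open.
    public synchronized void rebuild(String... ids) throws IOException {
        if (reader != null) {
            reader.close();
            reader = null;
        }
        IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer())
                .setOpenMode(IndexWriterConfig.OpenMode.CREATE);
        try (IndexWriter writer = new IndexWriter(dir, cfg)) {
            for (String id : ids) {
                Document doc = new Document();
                doc.add(new StringField("id", id, Field.Store.YES));
                writer.addDocument(doc);
            }
        } // close() commits and can now delete the old files
        reader = DirectoryReader.open(dir);
    }

    public synchronized int numDocs() {
        return reader == null ? 0 : reader.numDocs();
    }
}
```

Searches that are in flight when rebuild runs would see a closed reader, so a real application would need to coordinate searching threads with the job.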

Or, better, can you keep your single IndexWriter open, and then when you do
your nightly full reindex, use IndexWriter.deleteAll to delete all docs and
then index your docs again?
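
Roughly, that could look like the sketch below, assuming the Lucene 6.x core jar is on the classpath. SingleWriterReindex, the "id" field and the reindex method are hypothetical names; the Lucene calls used are IndexWriter.deleteAll, IndexWriter.commit, DirectoryReader.open(IndexWriter) and DirectoryReader.openIfChanged.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.file.Path;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Hypothetical wrapper: one writer for the life of the JVM, one shared reader.
public class SingleWriterReindex implements Closeable {
    private final Directory dir;
    private final IndexWriter writer;
    private DirectoryReader reader;

    public SingleWriterReindex(Path indexPath) throws IOException {
        dir = FSDirectory.open(indexPath);
        writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));
        reader = DirectoryReader.open(writer); // near-real-time reader
    }

    // Nightly job: wipe and refill through the same writer, then refresh.
    public synchronized void reindex(String... ids) throws IOException {
        writer.deleteAll();
        for (String id : ids) {
            Document doc = new Document();
            doc.add(new StringField("id", id, Field.Store.YES));
            writer.addDocument(doc);
        }
        writer.commit();
        DirectoryReader newer = DirectoryReader.openIfChanged(reader);
        if (newer != null) { // null means nothing changed
            reader.close();
            reader = newer;
        }
    }

    public synchronized int numDocs() {
        return reader.numDocs();
    }

    @Override
    public synchronized void close() throws IOException {
        reader.close();
        writer.close();
        dir.close();
    }
}
```

Because the writer is never re-opened, the check in the IndexWriter constructor that raised the exception is not reached again.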

Failing those two options, you can simply create a new FSDirectory instance
and open your 2nd IndexWriter using that.
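
A sketch of this third option, again assuming the Lucene 6.x core jar is on the classpath; FreshDirectoryRebuild and its rebuild method are hypothetical. Each run opens its own FSDirectory instance for the same path instead of re-using the instance the old reader was opened on.

```java
import java.io.IOException;
import java.nio.file.Path;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Hypothetical helper for the nightly job.
public final class FreshDirectoryRebuild {
    private FreshDirectoryRebuild() {}

    // Rebuilds the index at indexPath and returns the resulting doc count.
    public static int rebuild(Path indexPath, String... ids) throws IOException {
        IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer())
                .setOpenMode(IndexWriterConfig.OpenMode.CREATE);
        // A new FSDirectory instance for this run only.
        try (Directory fresh = FSDirectory.open(indexPath);
             IndexWriter writer = new IndexWriter(fresh, cfg)) {
            for (String id : ids) {
                Document doc = new Document();
                doc.add(new StringField("id", id, Field.Store.YES));
                writer.addDocument(doc);
            }
            writer.commit();
            try (DirectoryReader check = DirectoryReader.open(writer)) {
                return check.numDocs();
            }
        }
    }
}
```

As the original post notes, on Windows the old files still remain on disk until the old reader is closed, so the searching side should still switch to a new reader after the rebuild.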

Mike McCandless

http://blog.mikemccandless.com

On Mon, Mar 13, 2017 at 3:04 PM, Vitaly Stroutchkov <
[hidden email]> wrote:

> [...]