IndexSearcher hanging on to old index files in Windows

IndexSearcher hanging on to old index files in Windows

Monsur Hossain

Hi all.  I'm running Lucene.NET in a Windows/ASP.NET environment.  We are
searching a 300 MB index in a web environment, where the IndexSearcher is
cached.  Every 10-30 minutes, a separate process updates the index.  When
ASP.NET's cache detects a changed index, it drops the current IndexSearcher
(which the garbage collector takes care of at some later point [1]) and
creates a new one.

Now, while the index is being updated, the current IndexSearcher in cache
holds a reference to the old index files.  Therefore, the IndexWriter can't
delete them, and they sit around in the folder, which keeps growing.  Since
the IndexSearcher is left to the GC, there's no guarantee of when the files
will be released.

I was considering such previously mentioned systems as reference counting
[2] and swapping between two indexes [3].  But in both these cases, I don't
think I'm ever guaranteed that an old IndexSearcher will have released its
grasp on the old files in time to delete them.  

Anyway, I'd like to hear if others are dealing with this issue.

Also, I'm curious: is this a Windows-specific issue?  I haven't seen any
mention of it on UNIX.

Thanks,
Monsur

[1] http://tinyurl.com/8qzo4
[2] http://tinyurl.com/8enzh
[3] I can't find a link to it, but it was suggested by George Aroush in a
previous thread of mine.



---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: IndexSearcher hanging on to old index files in Windows

Chuck Williams
Monsur Hossain writes (4/28/2005 3:10 PM):

>Hi all.  I'm running Lucene.NET in a Windows/ASP.NET environment.  We are
>searching a 300meg index in a web environment, where the IndexSearcher is
>cached.  Every 10-30 minutes, a separate process updates the index.  When
>ASP.NET's cache detects a changed index, it drops the current IndexSearcher
>(which the Garbage collector takes care of in the future [1]) and creates a
>new one.
>
>Now, while the index is being updated, the current IndexSearcher in cache
>holds a reference to the old index files.  Therefore, the IndexWriter can't
>delete them, and they sit around in the folder, continuing to grow.  Since
>the IndexSearcher is left to the GC, there's no guarantee of when the files
>will be released.  
>
>I was considering such previously mentioned systems as reference counting
>[2] and swapping between two indexes [3].  But in both these cases, I don't
>think I'm ever guaranteed that an old IndexSearcher will have released its
>grasp on the old files in time to delete them.  
>
>Anyway, I'd like to hear if others are dealing with this issue.
>  
>
Perhaps I'm not fully understanding your issue, but I did a stress test
recently with a large Lucene index (growing to about 10 million large
documents on a single node) and didn't encounter this problem.  The
system did continual round-the-clock indexing at about 100k
documents/hour with nightly optimizations.  Searching was performed on
the same index on the same node in parallel (taking generally 20 to
200ms per search).  The test harness closed the underlying IndexReader
and reopened a new one every 2 minutes, thus guaranteeing that search
results were up-to-date within 2 minutes.  I wasn't doing deletes, but
old segment files caused by incremental merging and/or optimization were
not hanging around as far as I could tell. This was on the Java version
on Windows.

Chuck



RE: IndexSearcher hanging on to old index files in Windows

Nestel, Frank  IZ/HZA-IOL
In reply to this post by Monsur Hossain
Maybe it is .NET specific?! We use a very similar
scenario with Java under Windows, and the server has
now been running for 40 days since we put it into production.
No problems at all! We do have two index directories between
which we switch back and forth, though.

Frank


RE: IndexSearcher hanging on to old index files in Windows

Monsur Hossain

Hi there.  Thanks for the input.  I just pulled together a quick set of .NET
console apps to test this out.  I have an app that indexes and an app that
holds an open searcher.  Sure enough, after each incremental index/searcher
refresh, I can't delete the old index files.  I even tried calling
GC.Collect(), with no luck.

I'm currently porting these two apps over to Java to see if I get different
results; I'll post the code, and let you know what I find.

Monsur

 


Re: IndexSearcher hanging on to old index files in Windows

Chuck Williams
Monsur Hossain writes (4/28/2005 4:44 PM):

>Hi there.  Thanks for the input.  I just pulled together a quick set of .NET
>console apps to test this out.  I have an app that indexes and an app that
>holds an open searcher.  Sure enough, after each incremental index/searcher
>refresh, I can't delete the old index files.  I even tried doing a
>gc.collect(), with no luck.  
>  
>
I mentioned this earlier, but just to be explicit, you are closing the
IndexSearcher before abandoning it, right?  And if you opened the
IndexReader separately from the IndexSearcher, then are you also closing
it separately?

Chuck



RE: IndexSearcher hanging on to old index files in Windows

Monsur Hossain
In reply to this post by Monsur Hossain
Ok, I've written up a Java test with Lucene 1.4.3; the code is pasted below.
The code creates a new index, creates an IndexSearcher object, and then does
an incremental index/optimize.  The IndexSearcher line is commented out.
When I run this code, I end up with a single "segments", "deletable" and
".cfs" file in my Index directory.

Now, when I uncomment the IndexSearcher line and run the application again,
I end up with two .cfs files.  Notice how all I have to do is create an
IndexSearcher; I don't even have to run a query.  

Am I doing this correctly?

Thanks,
Monsur


import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;


public class SearchTest {

    static String indexDir = "C:\\Temp\\Index";
    static int numDocsAdd = 1000;
    static int mergeFactor = 2;
    static int docId = 0;   // next document ID; shared by both indexing passes

    public static void main(String[] args) throws Exception {
        System.out.println("Running full index");
        initialIndex();
        // Uncomment the next line to hold an open searcher (never closed)
        // while the incremental index runs.
//        IndexSearcher isearcher = new IndexSearcher(indexDir);
        System.out.println("Running incremental index");
        incrementalIndex();
    }

    static void initialIndex() throws Exception {
        // create a new index with 1000 documents (create = true)
        IndexWriter writerMain = new IndexWriter(indexDir, new SimpleAnalyzer(), true);
        writerMain.mergeFactor = mergeFactor;
        for (docId = 0; docId < numDocsAdd; docId++)
        {
            Document doc = new Document();
            doc.add(Field.Text("Content", "This is for document number " + docId));
            doc.add(Field.Keyword("DocID", Integer.toString(docId)));
            writerMain.addDocument(doc);
        }
        // merge everything into a single segment, then release the writer
        writerMain.optimize();
        writerMain.close();
    }

    static void incrementalIndex() throws Exception {
        // add 1000 new documents to the existing index (create = false)
        IndexWriter writerMain = new IndexWriter(indexDir, new SimpleAnalyzer(), false);
        writerMain.mergeFactor = mergeFactor;
        int docMax = docId + numDocsAdd;
        for (; docId < docMax; docId++)
        {
            Document doc = new Document();
            doc.add(Field.Text("Content", "This is for document number " + docId));
            doc.add(Field.Keyword("DocID", Integer.toString(docId)));
            writerMain.addDocument(doc);
        }
        // optimizing writes a new .cfs file and tries to delete the old one
        writerMain.optimize();
        writerMain.close();
    }
}




RE: IndexSearcher hanging on to old index files in Windows

Monsur Hossain
In reply to this post by Chuck Williams

Well, that's part of my question, sorry if it wasn't clear.  I'm not
explicitly closing the IndexSearcher, because there may be users still using
it.  Instead, I'm creating a new IndexSearcher, and leaving the old
IndexSearcher to be cleaned up by the GC, as suggested here:

http://tinyurl.com/8qzo4

In the example I just sent out, the IndexSearcher is open during the
incremental index, which could be the issue.  It's sort of like a catch-22
right now: I can't close the old IndexSearcher until the new index is ready,
and by the time the new index is ready, it's too late to delete the old
files.

Thanks,
Monsur




RE: IndexSearcher hanging on to old index files in Windows

Otis Gospodnetic-2
In reply to this post by Monsur Hossain
Just tried this on my Linux laptop - with IndexSearcher uncommented, I
still get a single .cfs file.  It's one of those problems where Windows
doesn't let you erase the file.  I'd start this SearchTest in the
debugger and step through it until you find a spot where you see that
some index file deletion fails.
Do you get 2 .cfs files even if you add isearcher.close() right after
you open the IndexSearcher?

Otis


RE: IndexSearcher hanging on to old index files in Windows

Monsur Hossain
> Do you get 2 .cfs files even if you add isearcher.close() right after
> you open the IndexSearcher?

Nope!  Adding the close() right after the open gives me one .cfs file.

Monsur




RE: IndexSearcher hanging on to old index files in Windows

Monsur Hossain
In reply to this post by Otis Gospodnetic-2
> Just tried this on my Linux laptop - with IndexSearcher uncommented, I
> still get a single .cfs file.  It's one of those problems where Windows
> doesn't let you erase the file.  I'd start this SearchTest in the
> debugger and step through it until you find a spot where you see that
> some index file deletion fails.


Using the debugger and Process Explorer, I've been spending the morning
learning about file management in Lucene, and it's fascinating stuff!

Sure enough, the IndexSearcher opens a handle to the first .cfs file.  After
the incremental index is updated and optimized, the IndexWriter tries to
delete the old .cfs file, but fails with the error:

"The process cannot access the file because it is being used by another
process."

It then sticks the filename in Lucene's "deletable" file to be deleted at
some later time.  As a sanity check I used Process Explorer to delete the
file handle before running the incremental index, and it worked fine.
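
The failed delete can be reproduced without Lucene.  The following is only a
minimal sketch, not code from this thread: the class name OpenHandleDelete and
the temp-file name are made up for illustration.  It opens a stream on a file
and tries to delete the file while the stream is still open.  I would expect
the first delete to fail on Windows and succeed on Linux, but that depends on
the platform and on how the JVM opens the file, so the program prints the
result rather than assuming it.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

// Hypothetical demo class; nothing here is part of Lucene.
class OpenHandleDelete {

    /** Tries to delete f while a stream on it is open; returns whether that delete succeeded. */
    static boolean deleteWhileOpen(File f) throws IOException {
        FileInputStream in = new FileInputStream(f);
        try {
            // The handle is still open at this point, like the searcher's handle on the old .cfs file.
            return f.delete();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("segment", ".cfs");
        boolean deletedWhileOpen = deleteWhileOpen(f);
        System.out.println("deleted while open:  " + deletedWhileOpen);

        // If the first attempt failed, retry now that the handle has been closed.
        boolean gone = deletedWhileOpen || f.delete();
        System.out.println("deleted after close: " + gone);
    }
}
```

Whatever the first attempt reports, the delete after close() should go through,
which matches what happens once the handle is closed in Process Explorer.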

Since this is happening in both .NET and Java, I'm assuming it's Windows
specific.  I don't know much about Windows low-level file management, but
I'm going to keep digging into this issue further.  Does anyone have any
expertise in this area, or know why this behavior is different in Windows
vs. Linux?

In the meantime, I think I can just ignore this issue, since the old .cfs
name is stored in the "deletable" file.  I'm guessing that at some later
point, once the IndexSearcher has been garbage collected, Lucene will load
that filename from "deletable", and then delete it.  I just worry that my
index will become too cluttered in the meantime (especially after an
optimization).

Thanks,
Monsur




RE: IndexSearcher hanging on to old index files in Windows

Monsur Hossain
In reply to this post by Otis Gospodnetic-2

> Just tried this on my linux laptop - with IndexSearcher uncommented, I
> still get a single .cfs file.

Hmmm, rereading this, I'm curious to know how/why this works in Linux.
Consider this scenario:

1) Create a new index

2) Create a new IndexSearcher pointing to that index.

3) Run an incremental index/optimize.  At this point, the new, optimized cfs
file is created, the old cfs file is deleted, and the segments file is
updated to point to the new cfs file.

4) Run a search using the IndexSearcher created in step 2.

At this point, how does the IndexSearcher know that the segments have been
updated and that it should try to read from the new cfs file?

Thanks,
Monsur




Re: IndexSearcher hanging on to old index files in Windows

Chuck Williams
In reply to this post by Monsur Hossain
Monsur Hossain wrote:

>"The process cannot access the file because it is being used by another
>process."
>
>It then sticks the filename in Lucene's "deletable" file to be deleted at
>some later time.  As a sanity check I used Process Explorer to delete the
>file handle before running the incremental index, and it worked fine.
>
>Since this is happening in both .NET and Java, I'm assuming its Windows
>specific.  I don't know much about Windows low-level file management, but
>I'm going to keep digging into this issue further.  Does anyone have any
>expertise in this area, or know why this behavior is different in Windows
>vs. Linux?
>  
>
Yes, Windows won't allow you to delete a file that is separately open.
Linux will because it can disconnect the inode. The actual data blocks
in the file won't be deleted until nobody is accessing them. Windows
doesn't have that kind of garbage collection in the OS itself.

I'm confident you don't have to worry about that. As mentioned earlier,
I ran a large scalability benchmark on Windows and everything did end up
properly. I ran this test a little differently than letting the
IndexSearcher get garbage collected. Instead, I explicitly closed the
searcher (reader) and reopened it periodically. I used explicit
synchronization involving counts of searches in process, such that when
it came time to refresh the searcher, new searches would wait until the
searcher was refreshed and the refresh operation would wait until the
searches already in progress had completed. This did not seem to add any
significant delays to search times even on a very large index.

The garbage collection approach should work as well.
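
The synchronization described above could look roughly like the following.
This is a sketch under my own assumptions, not the code used in the benchmark:
RefreshGate, begin(), end() and refresh() are hypothetical names, and the
searcher is represented by any java.io.Closeable so the example compiles
without Lucene.  New searches block while a refresh is pending, and the
refresh blocks until the searches already in progress have finished.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical helper; S stands in for an IndexSearcher (or IndexReader).
class RefreshGate<S extends Closeable> {
    private S current;
    private int inFlight = 0;          // searches currently using 'current'
    private boolean refreshing = false;

    RefreshGate(S initial) {
        current = initial;
    }

    /** Call before a search; blocks while a refresh is pending. */
    synchronized S begin() throws InterruptedException {
        while (refreshing) {
            wait();
        }
        inFlight++;
        return current;
    }

    /** Call in a finally block after the search completes. */
    synchronized void end() {
        inFlight--;
        notifyAll();
    }

    /** Waits for in-flight searches to finish, then closes the old searcher and installs the new one. */
    synchronized void refresh(S replacement) throws InterruptedException, IOException {
        while (refreshing) {
            wait();                    // only one refresh at a time
        }
        refreshing = true;             // from here on, begin() blocks
        try {
            while (inFlight > 0) {
                wait();                // end() wakes us up
            }
            S old = current;
            current = replacement;
            old.close();               // explicit close releases the file handles
        } finally {
            refreshing = false;
            notifyAll();
        }
    }
}
```

A search would call begin(), run the query inside try, and call end() in
finally; the indexing side would call refresh() with a newly opened searcher.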

Chuck



Re: IndexSearcher hanging on to old index files in Windows

Chuck Williams
In reply to this post by Monsur Hossain
Monsur Hossain wrote:

>>Just tried this on my linux laptop - with IndexSearcher uncommented, I
>>still get a single .cfs file.
>>    
>>
>
>Hmmm, rereading this, I'm curious to know how/why this works in Linux.
>Consider this scenario:
>
>1) Create a new index
>
>2) Create a new IndexSearcher pointing to that index.
>
>3) Run an incremental index/optimize.  At this point, the new, optimized cfs
>file is created, the old cfs file is deleted, and the segments file is
>updated to point to the new cfs file.
>
>4) Run a search using the IndexSearcher created in step 2.
>
>At this point, how does the IndexSearcher know that the segments have been
>updated and that it should try to read from the new cfs file?
>  
>
It doesn't.  It won't see the changes to the index until you close it
and open a new one.  On Linux, the IndexSearcher can point to the old
index although the index directory no longer does.

Chuck



RE: IndexSearcher hanging on to old index files in Windows

Monsur Hossain
In reply to this post by Chuck Williams
> I ran this test a little differently than letting the
> IndexSearcher get garbage collected. Instead, I explicitly closed the
> searcher (reader) and reopened it periodically.

Thanks Chuck, this is all really helpful.  That explicit close() is what
allows the files stored up in "deletable" to eventually be deleted.  I'm
wary of relying on the GC to clean up my work, so I think I'll use that
reference counting system you mentioned.  That way I can be guaranteed that
at some point, my IndexSearcher is in fact closed.  (In my tests, when I
left it up to the GC, these open file handles stuck around for hours).

Thanks,
Monsur




RE: IndexSearcher hanging on to old index files in Windows

Luke Francl
On Fri, 2005-04-29 at 14:29, Monsur Hossain wrote:

> Thanks Chuck, this is all really helpful.  That explicit close() is what
> allows the files stored up in "deletable" to eventually be deleted.  I'm
> wary of relying on the GC to clean up my work, so I think I'll use that
> reference counting system you mentioned.  That way I can be guaranteed that
> at some point, my IndexSearcher is in fact closed.  (In my tests, when I
> left it up to the GC, these open file handles stuck around for hours).

I really recommend against relying on the GC to clean up operating
system resources. It's just not reliable, especially in long-running VMs
with lots of memory (like an application server).

This can leave file handles open indefinitely, which can lead to
problems in Windows with too many open files, or deletion, as you've
seen.

I also implemented a reference counting scheme for IndexSearchers and it
works well.
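
For readers who have not seen such a scheme, here is one way it might be
structured.  This is only an illustrative sketch and not my production code:
RefCounted, acquire() and release() are hypothetical names, and the searcher
is modelled as a java.io.Closeable so the example does not depend on Lucene.
The cache holds one reference, each running search holds one more, and the
searcher is closed when the last reference is released.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper; S stands in for an IndexSearcher.
final class RefCounted<S extends Closeable> {
    private final S resource;
    // Starts at 1: the reference held by the owner (for example, the cache).
    private final AtomicInteger refs = new AtomicInteger(1);

    RefCounted(S resource) {
        this.resource = resource;
    }

    /** Takes a reference for one search; returns null if the resource was already closed. */
    S acquire() {
        for (;;) {
            int n = refs.get();
            if (n == 0) {
                return null;           // too late, caller should fetch the new searcher
            }
            if (refs.compareAndSet(n, n + 1)) {
                return resource;
            }
        }
    }

    /** Drops one reference; the last release closes the resource. */
    void release() throws IOException {
        if (refs.decrementAndGet() == 0) {
            resource.close();
        }
    }
}
```

When the index changes, the cache would install a new RefCounted wrapper and
call release() on the old one; searches still running against the old
searcher finish normally, and the last of them closes it.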

Regards,
Luke Francl



RE: IndexSearcher hanging on to old index files in Windows

Aaron Loucks
In reply to this post by Monsur Hossain
I had this exact same problem in a J2EE + Windows environment. Even when
the index searcher was closed, it still seemed to hold onto the file. As
I recall, I solved this by only caching the IndexSearcher long enough to
get the results I wanted to display. If the user looked at the next
page of results, the query was run again. This didn't affect
performance.

Aaron Loucks
Web Developer
Gardner, Inc.
3641 Interchange Road
Columbus, OH 43204
614.456.3492