memory leak during Lucene Search

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

memory leak during Lucene Search

chrislusf
Unless my understanding is wrong, there is a memory leak for Lucene search
from Lucene-1195.(svn r659602, May23,2008)

This patch brings in a ThreadLocal cache to TermInfosReader.

It's usually recommended to keep the reader open, and reuse it when
possible. In a common J2EE application, the http requests are usually
handled by different threads. But since the cache is ThreadLocal, the cache
are not really usable by other threads. What's worse, the cache can not be
cleared by another thread!

This leak is not so obvious usually. But my case is using RAMDirectory,
having several hundred megabytes. So one un-released resource is obvious to
me.

Here is the reference tree:
org.apache.lucene.store.RAMDirectory
  |- directory of org.apache.lucene.store.RAMFile
      |- file of org.apache.lucene.store.RAMInputStream
          |- base of org.apache.lucene.index.CompoundFileReader$CSIndexInput
              |- input of org.apache.lucene.index.SegmentTermEnum
                  |- value of java.lang.ThreadLocal$ThreadLocalMap$Entry


After I switched back to svn revision 659601, right before this patch is
checked in, the memory leak is gone.
Although my case is RAMDirectory, I believe this will affect disk based
index also.

--
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per request) got
2.6 Million Euro funding!
Reply | Threaded
Open this post in threaded view
|

Re: memory leak during Lucene Search

Grant Ingersoll-2
Just chipping in that I recall there being a number of discussions on  
java-dev about ThreadLocal and web containers and how they should be  
handled.  Not sure if it pertains here or not, but you might find http://lucene.markmail.org/message/keosgz2c2yjc7qre?q=ThreadLocal 
  helpful.

You might also consider asking on java-dev in this case, which usually  
isn't recommended, but this is pretty low-level stuff and seems more  
appropriate there.  It does seem like a problem though and I'm not  
sure of the alternative just yet.

On Sep 7, 2008, at 1:35 AM, Chris Lu wrote:

> Unless my understanding is wrong, there is a memory leak for Lucene  
> search
> from Lucene-1195.(svn r659602, May23,2008)
>
> This patch brings in a ThreadLocal cache to TermInfosReader.
>
> It's usually recommended to keep the reader open, and reuse it when
> possible. In a common J2EE application, the http requests are usually
> handled by different threads. But since the cache is ThreadLocal,  
> the cache
> are not really usable by other threads. What's worse, the cache can  
> not be
> cleared by another thread!
>
> This leak is not so obvious usually. But my case is using  
> RAMDirectory,
> having several hundred megabytes. So one un-released resource is  
> obvious to
> me.
>
> Here is the reference tree:
> org.apache.lucene.store.RAMDirectory
>  |- directory of org.apache.lucene.store.RAMFile
>      |- file of org.apache.lucene.store.RAMInputStream
>          |- base of org.apache.lucene.index.CompoundFileReader
> $CSIndexInput
>              |- input of org.apache.lucene.index.SegmentTermEnum
>                  |- value of java.lang.ThreadLocal$ThreadLocalMap
> $Entry
>
>
> After I switched back to svn revision 659601, right before this  
> patch is
> checked in, the memory leak is gone.
> Although my case is RAMDirectory, I believe this will affect disk  
> based
> index also.
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: memory leak during Lucene Search

chrislusf
Thanks for the link! I will post the problem there.
In the mean time, any J2EE application developers should know this problem
and try to avoid Lucene checked out on or after May 23,2008, svn version
659602.
I tried svn 659601, which worked fine.

I will follow up on this email list when the problem is fixed.

--
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per request) got
2.6 Million Euro funding!

On Tue, Sep 9, 2008 at 11:23 AM, Grant Ingersoll <[hidden email]>wrote:

> Just chipping in that I recall there being a number of discussions on
> java-dev about ThreadLocal and web containers and how they should be
> handled.  Not sure if it pertains here or not, but you might find
> http://lucene.markmail.org/message/keosgz2c2yjc7qre?q=ThreadLocal helpful.
>
> You might also consider asking on java-dev in this case, which usually
> isn't recommended, but this is pretty low-level stuff and seems more
> appropriate there.  It does seem like a problem though and I'm not sure of
> the alternative just yet.
>
> On Sep 7, 2008, at 1:35 AM, Chris Lu wrote:
>
>  Unless my understanding is wrong, there is a memory leak for Lucene search
>> from Lucene-1195.(svn r659602, May23,2008)
>>
>> This patch brings in a ThreadLocal cache to TermInfosReader.
>>
>> It's usually recommended to keep the reader open, and reuse it when
>> possible. In a common J2EE application, the http requests are usually
>> handled by different threads. But since the cache is ThreadLocal, the
>> cache
>> are not really usable by other threads. What's worse, the cache can not be
>> cleared by another thread!
>>
>> This leak is not so obvious usually. But my case is using RAMDirectory,
>> having several hundred megabytes. So one un-released resource is obvious
>> to
>> me.
>>
>> Here is the reference tree:
>> org.apache.lucene.store.RAMDirectory
>>  |- directory of org.apache.lucene.store.RAMFile
>>     |- file of org.apache.lucene.store.RAMInputStream
>>         |- base of org.apache.lucene.index.CompoundFileReader$CSIndexInput
>>             |- input of org.apache.lucene.index.SegmentTermEnum
>>                 |- value of java.lang.ThreadLocal$ThreadLocalMap$Entry
>>
>>
>> After I switched back to svn revision 659601, right before this patch is
>> checked in, the memory leak is gone.
>> Although my case is RAMDirectory, I believe this will affect disk based
>> index also.
>>
>>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>