DefaultIndexAccessor

DefaultIndexAccessor

Cam Bazz
Hello,

Regarding https://issues.apache.org/jira/browse/LUCENE-1026 , this seems
very interesting. I have read the discussion on the page, but I could not
figure out which set of files is the latest.
Is it the IndexAccessor-1.26.2008.zip file?

I will read through the code, make my own tests, and send some feedback.

Best.
-C.B.

Re: DefaultIndexAccessor

Mark Miller-3
IndexAccessor-1.26.2008.zip is the latest one. I will be dating a zip from now on.

I hope to post new code with the warming either tonight or tomorrow night. I would be ecstatic to have some help vetting that.

Also, I am thinking of making a change so that when you release the Writer, the releasing thread does not block until the reopen finishes. I think the original author did this so that if you add a doc from a thread and then immediately search from the same thread, you are guaranteed to find the doc. However, this guarantee did not hold: if another thread had a reference to the Writer, and a new thread grabbed the Writer and then quickly released it before the first thread did, the new thread will have added a doc that is not visible until the first thread releases its reference to the Writer. Since the guarantee is not enforced anyway, you might as well not block the final thread that releases the Writer either. Instead, I will grab a thread from a thread pool to do the reopening, and return right after closing the Writer. The result is that you cannot add a doc, search, and expect to find it without waiting a second or two. But this way things will be consistent, and an app that adds docs will be a bit more responsive, e.g. it won't hang while Readers are being reopened.
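A rough sketch of the non-blocking release described above may help. It is not the LUCENE-1026 code: the class and method names (AsyncReopenSketch, acquireWriter, releaseWriter) are made up for illustration, and the "reopen" is reduced to a counter so the example stays self-contained.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the accessor; all names are assumptions. */
class AsyncReopenSketch {
    private final ExecutorService reopenPool = Executors.newSingleThreadExecutor();
    private final AtomicInteger reopenCount = new AtomicInteger();
    private int writerRefs; // guarded by "this"

    synchronized void acquireWriter() {
        writerRefs++;
    }

    /** Returns right away; only the last releaser schedules a reopen. */
    synchronized void releaseWriter() {
        if (writerRefs == 0) {
            throw new IllegalStateException("release without acquire");
        }
        if (--writerRefs == 0) {
            // Real code would close the Writer here, then hand the
            // (slow) Reader reopen to the pool instead of doing it inline.
            reopenPool.execute(() -> {
                reopenCount.incrementAndGet();
            });
        }
    }

    int reopenCount() {
        return reopenCount.get();
    }

    /** Waits for pending reopens to finish, then stops the pool. */
    void shutdown() {
        reopenPool.shutdown();
        try {
            reopenPool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The trade-off is the one stated above: releaseWriter() no longer waits, so a search issued immediately afterwards may still run against the old view.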

I also have to bring the AccessProvider classes back. No easy way to use your own custom Readers without it...I shouldn't have stripped it out.

- Mark




---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: DefaultIndexAccessor

Cam Bazz
Hello Mark,

I have been reading the code, and honestly I have not understood how it
works. I was hoping that this was a solution for the case where you are adding
documents from multiple threads: that it would let other, non-writer threads
see the added documents without refreshing the IndexSearcher, by using
some caching mechanism.

Could you elaborate a little more on what IndexAccessor does and how it
does it?

Best Regards,
-C.B.

On Feb 4, 2008 3:06 PM, Mark Miller <[hidden email]> wrote:

> IndexAccessor-1.26.2008.zip is the latest one. I will be dating a zip from
> now on.

Re: DefaultIndexAccessor

Mark Miller-3
The purpose of IndexAccessor is to coordinate Readers/Writers for a
Lucene index. Readers and Writers in Lucene are multi-threaded in that
multiple threads may use them at the same time, but they must/should be
shared and there are special rules (You cannot delete with a Reader
while a Writer is working on the index). Also, you need to refresh
Reader views every so often; this is expensive (though usually much less
so with the new reopen method).

IndexAccessor enforces the rules and controls Reader refreshing. Instead
of worrying about caching or index interaction rules, you just ask for
your Reader/Writer, use it to search or add a doc, and then return it.
The rest is taken care of for you.

This is done by keeping a cached Writer and Searcher(s) that all threads
share. References to the Searchers are counted so that after a Writer is
returned (and no other thread has a reference to the Writer),
IndexAccessor waits for all of the current Searchers to come back and
then reopens their Readers.

In this regard, you get a similar setup to what Solr might give: from
any thread you just add docs and run searches -- you don't have to worry
about refreshing Readers or sharing Writers/Readers or one thread
deleting with a Reader while another thread tries to write with a Writer.

This setup allows you to do other cool things, like warm Searchers
before putting them into action. That's what the code I am posting soon
will be capable of: when the Readers are reopened, search requests will
still be handled by the old Readers while the new Searchers run a sample
query with optional sort fields. This will make sure the Reader is open
and its sort caches are loaded before the first thread tries to use it,
which gives applications much faster responses.
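The warm-then-swap idea can be sketched roughly as follows. This is an illustration under assumptions, not the posted code: WarmingSwapSketch and its methods are hypothetical names, the searcher is a generic type S, and the "sample query" is whatever the caller passes in as warmUp.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical holder that keeps the old searcher live while a new one warms. */
class WarmingSwapSketch<S> {
    private final AtomicReference<S> live;

    WarmingSwapSketch(S initial) {
        live = new AtomicReference<>(initial);
    }

    /** Search threads call this; it never waits for a reopen. */
    S current() {
        return live.get();
    }

    /**
     * Reopens and warms a new searcher, then publishes it.
     * Until set() runs, current() keeps returning the old searcher.
     */
    void reopenWithWarming(Supplier<S> reopen, Consumer<S> warmUp) {
        S fresh = reopen.get();   // e.g. reopen the underlying Reader
        warmUp.accept(fresh);     // e.g. run a sample query with sort fields
        live.set(fresh);          // only now do new requests see it
    }
}
```

A real implementation would also have to close the old searcher once every outstanding reference to it has been returned; that bookkeeping is left out here.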

You must open a new Reader or reopen a Reader to see recently added
docs...IndexAccessor provides no real way around that. But it does make
the reopening much easier -- and an application that just wants to add
docs and search at will from multiple threads won't have to worry about it.

You can bail out here, or if you want further clarification I will
include an alternate attempt at explaining what IndexAccessor is below.

- Mark

----------------------------------------------------------------------------------------------------
When accessing a Lucene index from multiple threads, there are a variety
of issues that you must address.

1. The Readers/Writer should be shared across threads.
2. Readers must periodically be refreshed, either by creating new
instances or by using the new reopen method.
3. A Reader that writes needs to be properly coordinated with a Writer,
e.g. they cannot be used at the same time.

IndexAccessor addresses each of these issues.

How it works:

A single Writer is shared among threads that try to concurrently
retrieve and use a Writer. Once all of these threads release their
reference to the Writer, it is closed, and upon the next request a new
one is created.
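The shared, reference-counted Writer described above could look something like this. It is only a sketch: SharedWriterSketch, acquire and release are invented names, and the Writer is a generic type W opened and closed through caller-supplied functions rather than through any real Lucene call.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical reference-counted holder for a single shared writer. */
class SharedWriterSketch<W> {
    private final Supplier<W> opener;
    private final Consumer<W> closer;
    private W writer;   // guarded by "this"
    private int refs;   // guarded by "this"

    SharedWriterSketch(Supplier<W> opener, Consumer<W> closer) {
        this.opener = opener;
        this.closer = closer;
    }

    /** Every concurrent caller gets the same instance. */
    synchronized W acquire() {
        if (writer == null) {
            writer = opener.get();   // created lazily on first request
        }
        refs++;
        return writer;
    }

    /** The last release closes the writer; the next acquire opens a new one. */
    synchronized void release() {
        if (refs == 0) {
            throw new IllegalStateException("release without acquire");
        }
        if (--refs == 0) {
            closer.accept(writer);
            writer = null;
        }
    }
}
```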

A single Searcher for each Similarity is also shared across threads.
Upon first request, a new Searcher is created. This Searcher is then
returned upon every request. A count of every Searcher reference
retrieved is maintained.

When all references to a Writer are released, the Writer is closed and
after waiting for all of the Searchers to be returned, the Searchers are
reopened. Without warming enabled, new requests for Searchers/Readers
must wait for this reopen to complete. If warming is enabled, the old
Searchers/Readers continue handling Searcher requests until the Readers
have been reopened and any requested sort caches have been loaded.

If you ask for a writing Reader, you will not get it until a Writer is
released and vice versa.

The result is that you can freely use Writers/Readers/Searchers from any
thread without considering thread interactions. ***

If you want to add docs, just ask for a Writer, add the docs, and
release the Writer. If you want to search, get a Searcher, search,
and release the Searcher. You don't have to worry about reopening
Readers or coordinating access.


***
You still do have to consider things like hogging the Writer/Readers:
if you don't occasionally release them, things will not stay very
interactive.
The best method is to just get the object, use it, and then return it in
a finally block. Batch load multiple docs when you can, but if you're just
adding the occasional doc, get the Writer, add it, and then release the
Writer in a finally block. If you are batch loading a million docs and you
want to be able to see them as they are added: get the Writer and add
10,000 docs (or something), release the Writer, get the Writer and add
10,000 docs, etc.
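As a sketch of that get / use / release-in-finally pattern with periodic releases: the Accessor and Sink interfaces below are hypothetical stand-ins (the real IndexAccessor method names may differ), and the batch size is a parameter rather than a fixed 10,000.

```java
import java.util.List;

class BatchLoadSketch {
    /** Hypothetical accessor surface; method names are assumptions. */
    interface Accessor<D> {
        Sink<D> getWriter();
        void release(Sink<D> writer);
    }

    /** Hypothetical stand-in for the Writer's "add a doc" operation. */
    interface Sink<D> {
        void add(D doc);
    }

    /** Adds all docs, releasing the writer after every batch. Returns the batch count. */
    static <D> int load(Accessor<D> accessor, List<D> docs, int batchSize) {
        int batches = 0;
        for (int start = 0; start < docs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, docs.size());
            Sink<D> writer = accessor.getWriter();
            try {
                for (D doc : docs.subList(start, end)) {
                    writer.add(doc);
                }
            } finally {
                // Always give the writer back, even if add() throws,
                // so Readers can be reopened and searches stay fresh.
                accessor.release(writer);
            }
            batches++;
        }
        return batches;
    }
}
```

A smaller batch size makes new docs visible sooner at the cost of more reopens; a larger one does the opposite.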


Re: DefaultIndexAccessor

Cam Bazz
Hello Mark,

Thank you for your lengthy and valuable clarification. Here is my case:
before adding to the index, I must check whether a document with the
same key (actually, a double key) already exists, and before deleting a
document, I must ensure it exists in the index.

Currently I am doing it with my custom caching routine. It works quite well
up to 32M documents, but after that something happens and it really slows
down.

I will experiment with your implementation, as soon as I can. It is very
cool by the way. Will it be included in the next release?

Best,
-C.B.

On Feb 4, 2008 7:15 PM, Mark Miller <[hidden email]> wrote:

> The purpose of IndexAccessor is to coordinate Readers/Writers for a
> Lucene index. Readers and Writers in Lucene are multi-threaded in that
> multiple threads may use them at the same time, but they must/should be
> shared and there are special rules (You cannot delete with a Reader
> while a Writer is working on the index). Also, you need to refresh
> Reader views every so often; this is expensive (though usually much less
> so with the new reopen method).

Re: DefaultIndexAccessor

Mark Miller-3
For anyone following this thread who would like to check this out, I put
up the new code with the warming capability:

https://issues.apache.org/jira/browse/LUCENE-1026
IndexAccessor-02.04.2008.zip (32 kb)
<https://issues.apache.org/jira/secure/attachment/12374729/IndexAccessor-02.04.2008.zip>

See the comment at the bottom.


Re: DefaultIndexAccessor

Mark Miller-3
I replied to the wrong thread -- sorry about that:

You still have to be careful if you want to alternate searches and
writes. If you are loading a lot of docs this way, you would want to hold
the Writer to batch the docs, but while you are holding it, you will not
have a fresh view of the index, so you could add the same doc twice if
it came twice in a batch. The only way to be sure you avoid this is to
reopen the Readers after you add every doc. This is just not going to be a
fast way of doing things...but if you have a high merge factor, the new
reopen method will probably make it *much* faster. Or, if you are sure that
the batch won't contain duplicates, you can batch load.
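
If you cannot be sure up front, one way to get there is to filter the batch yourself before you touch the Writer. A minimal sketch, assuming every doc can be reduced to a single string key (the BatchDeduper class and the key format are made up for illustration and are not part of IndexAccessor):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of IndexAccessor: filters a batch so that
// each key appears only once, keeping first-seen order.
class BatchDeduper {
    static List<String> uniqueKeys(List<String> batchKeys) {
        Set<String> seen = new HashSet<String>();
        List<String> unique = new ArrayList<String>();
        for (String key : batchKeys) {
            // Set.add returns false when the key was already seen in this batch.
            if (seen.add(key)) {
                unique.add(key);
            }
        }
        return unique;
    }
}
```

This only catches repeats inside the batch. Docs that were already in the index still have to be checked against a Searcher obtained before you grab the Writer.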

Cam Bazz wrote:

> Hello Mark,
>
> Thank you for your lengthy and valuable clarification. Here is my case:
> before adding to the index, I must check whether a document exists with the
> same key (actually, a double key), and before deleting a document I must
> ensure it exists in the index.
>
> Currently I am doing this with my custom caching routine. It works quite well
> up to 32M documents, but after that something happens and it really slows
> down.
>
> I will experiment with your implementation, as soon as I can. It is very
> cool by the way. Will it be included in the next release?
>
> Best,
> -C.B.

Re: DefaultIndexAccessor

Jay-98
In reply to this post by Mark Miller-3
Great effort on the much-improved IndexAccessor, Mark!
A couple of questions and observations:

1. In release(Searcher), you removed a check from an earlier version that
verified the given searcher is the cached one. This could potentially cause
problems for some people.
2. The createdSearchers variable is not really used: you just populate
it and print it out. What is its purpose?
3. The variable numSearchersForRetirment is used in
WarmingIndexAccessor, not DefaultIndexAccessor.
4. I hope that the next release of Lucene adds a searcher
reopen API so that we do not have to work around it.
5. Although IndexSearcher.close() currently does almost nothing except
close the internal index reader, it might be safer to close the
searcher itself as well in closeCachedSearcher(), just in case the
searcher has other resources to release in a future version of
Lucene.

Thanks!

Jay
Mark Miller wrote:

> For anyone following this thread who would like to check this out, I put
> up the new code with the warming capability:
>
> https://issues.apache.org/jira/browse/LUCENE-1026
> IndexAccessor-02.04.2008.zip (32 kb):
> https://issues.apache.org/jira/secure/attachment/12374729/IndexAccessor-02.04.2008.zip
>
> See the comment at the bottom.

Re: DefaultIndexAccessor

Mark Miller-3
Thanks for the feedback, Jay. One at a time:

Jay wrote:
> Great effort for much improved indexaccessor, Mark!
> A couple questions and observations:
>
> 1. In release(Searcher), you removed a check if the given searcher is
> the cached one from an earlier version. This could potentially cause
> problems for some people.
This is something that I meant to come back to. The problem is that the
Searcher you are returning may have already been replaced in the
Searcher cache, so retired Searchers must be checked too. I will consider
it again.
> 2. The createdSearchers variable is not really used: you just populate
> it and print it out. What's the purpose for it?
This was a debug check that I had been using. I will clean it up.
> 3. The variable numSearchersForRetirment is used in
> WarmingIndexAccessor not DefaultIndexAccessor.
Thanks. Will move.
> 4. I wish that in the next release of Lucene, they will add searcher
> reopen api so that we do not have to work around it.
Not sure how this would play out, but an interesting thought...
> 5. Although currently IndexSearcher.close() does almost nothing except
> to close the internal index reader, it might be safer to close
> searcher itself as well in closeCachedSearcher(), just in case, the
> searcher may have other resources to release in the future version of
> Lucene.
The problem is that because I am supplying the Reader, calling
Searcher.close() won't close it. The Searcher has to be created without
a supplied Reader for it to be able to close the Reader itself. I got the
same itch though...
>
> Thanks!
>
> Jay
I appreciate the feedback! I'll be working more on this. I think more
can be done.
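
On point 1, here is a rough sketch of the kind of check I have in mind, so that a released Searcher is accepted if it is either the cached one or a retired one that is still checked out. This is a stand-alone model, not the code in the zip: SearcherTracker and its methods are names invented for the example, and S stands in for a Searcher.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Stand-alone model with hypothetical names; S stands in for a Searcher.
class SearcherTracker<S> {
    private S cached;
    // Use counts for the cached searcher and for retired ones still checked out.
    private final Map<S, Integer> useCounts = new IdentityHashMap<S, Integer>();

    // Replaces the cached searcher; the old one stays tracked while in use.
    synchronized void install(S fresh) {
        if (cached != null && useCounts.get(cached) == 0) {
            useCounts.remove(cached); // nobody holds it, nothing to retire
        }
        cached = fresh;
        useCounts.put(fresh, 0);
    }

    synchronized S acquire() {
        if (cached == null) {
            throw new IllegalStateException("no searcher installed");
        }
        useCounts.put(cached, useCounts.get(cached) + 1);
        return cached;
    }

    // Returns false for a foreign or over-released searcher instead of
    // silently corrupting the counts.
    synchronized boolean release(S searcher) {
        Integer count = useCounts.get(searcher);
        if (count == null || count == 0) {
            return false;
        }
        if (count == 1 && searcher != cached) {
            useCounts.remove(searcher); // retired and fully returned; close it here
        } else {
            useCounts.put(searcher, count - 1);
        }
        return true;
    }

    // Number of retired searchers that are still checked out.
    synchronized int retiredInUse() {
        return useCounts.size() - (cached == null ? 0 : 1);
    }
}
```

The IdentityHashMap is deliberate in this sketch: what matters is whether the caller hands back the very same instance, not an equal one.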

- Mark


Re: DefaultIndexAccessor

Mark Miller-3
In reply to this post by Jay-98

> 5. Although currently IndexSearcher.close() does almost nothing except
> to close the internal index reader, it might be safer to close
> searcher itself as well in closeCachedSearcher(), just in case, the
> searcher may have other resources to release in the future version of
> Lucene.
Didn't catch that "as well". You are right, great idea Jay, thanks.
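
Roughly what that would look like, as a sketch only. ModelReader and ModelSearcher are stand-ins I made up to mimic the behavior described in this thread (a Searcher built around a supplied Reader does not close that Reader); they are not Lucene's classes, and closeCachedSearcher here is not the method from the zip.

```java
// Stand-in types that model the behavior discussed above; not Lucene's classes.
class ModelReader {
    boolean closed;

    void close() {
        closed = true;
    }
}

class ModelSearcher {
    private final ModelReader reader;
    private final boolean ownsReader;
    boolean closed;

    // Reader supplied by the caller: the searcher does not own it.
    ModelSearcher(ModelReader supplied) {
        this.reader = supplied;
        this.ownsReader = false;
    }

    ModelReader getReader() {
        return reader;
    }

    void close() {
        closed = true;
        if (ownsReader) {
            reader.close();
        }
    }
}

class CachedSearcherCloser {
    // Close the searcher first, then the reader it was built around, so both
    // are released even if the searcher's close() throws.
    static void closeCachedSearcher(ModelSearcher searcher) {
        ModelReader reader = searcher.getReader();
        try {
            searcher.close();
        } finally {
            reader.close();
        }
    }
}
```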


Re: DefaultIndexAccessor

Jay-98
Thanks for your clarifications, Mark!


Jay


Re: DefaultIndexAccessor

vivek sar
Mark,

   There seems to be an issue with DefaultMultiIndexAccessor.java. I
got the following NullPointerException:

     2008-02-13 07:10:28,021 ERROR [http-7501-Processor6] ReportServiceImpl -
java.lang.NullPointerException
        at org.apache.lucene.indexaccessor.DefaultMultiIndexAccessor.release(DefaultMultiIndexAccessor.java:89)

It looks like the IndexAccessor lookup for one of the Searchers in the
MultiSearcher returned null. I am not sure how that is possible. Any ideas?

In my case it caused a critical error: the writer thread was stuck
forever because of this (we found out after a couple of days):

"PS thread 9" prio=1 tid=0x00002aac70eb95d0 nid=0x6ba in Object.wait() [0x0000000047533000..0x0000000047533b80]
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00002aab3e5c7700> (a org.apache.lucene.indexaccessor.DefaultIndexAccessor)
        at java.lang.Object.wait(Unknown Source)
        at org.apache.lucene.indexaccessor.DefaultIndexAccessor.waitForReadersAndCloseCached(DefaultIndexAccessor.java:593)
        at org.apache.lucene.indexaccessor.DefaultIndexAccessor.release(DefaultIndexAccessor.java:510)
        - locked <0x00002aab3e5c7700> (a org.apache.lucene.indexaccessor.DefaultIndexAccessor)

The only way to recover was to restart the application.

I use both MultiSearcher and IndexSearcher in my application. I've
looked at your code, but I am not able to pinpoint how it can go wrong. Of
course, you do have to check for null in
MultiIndexAccessor.release, but how could you get a null index accessor
in the first place?

I do call IndexAccessor.close during partitioning of indexes, but the
close should wait for all Searchers to close before doing anything.

Do you have any updates to your code since 02/04/2008?

Thanks,
-vivek


Re: DefaultIndexAccessor

Mark Miller-3
Hey vivek, sorry to hear you are having problems.

I am trying to figure out how you may be seeing this problem. The
IndexAccessor cannot return null, because you would get an
IllegalStateException, not a NullPointerException. Also, the released
MultiSearcher cannot be null, because the Exception would have been
thrown sooner. Releasing a null Searcher throws no Exception. So one
possibility is that you are returning a foreign MultiSearcher?

Unlikely, but I don't see anything else at the moment.

The MultiSearcher code is really pretty simple and actually recreates a
MultiSearcher on every request...it did not appear to be worth it to
coordinate closed sub Accessors with a cache for the MultiSearcher (I
wrote the code at one point, and later got rid of it). So really the
MultiSearcher is just a simple class that gets cached sub Searchers for
each index and creates a one time use MultiSearcher. A simple cache is
kept around that identifies which Accessor needs to release which sub
Searcher. It's all rather simple, and I am struggling to see another
possibility beyond returning a foreign MultiSearcher somehow.

I will keep looking and keep you posted. In the meantime, do you have
any other data or code snippets to share?


Re: DefaultIndexAccessor

Mark Miller-3
In reply to this post by vivek sar
Okay, sorry about this one, vivek. I added to the unit tests to expose
this. When I took out the MultiSearcher caching, I kept the concept of
sharing a single MultiIndexAccessor. Unfortunately, this meant that
multiple threads were sharing the same Searcher-to-Accessor Map that is
used to track which Accessor needs to release which Searcher. Because of
this, a thread might come along and pull a Searcher out of that Map
right before another thread tries to release that same cached Searcher
instance. The result is that the entry is no longer there, hence the
NullPointerException.

Nice to have this added to the unit tests. The fix is to create a new
MultiIndexAccessor on every request; I recommend you get one for each
thread, as the class is no longer thread safe. Constructing a
MultiIndexAccessor costs pretty much nothing in terms of time. This way
each thread has its own Map of Searchers to Accessors.
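
To illustrate the failure mode, here is a minimal sketch. It is not the
actual DefaultMultiIndexAccessor code: `Accessor`, `SharedMapRelease`
and the method names are hypothetical stand-ins, and the two releases
run one after another instead of on real threads so that the outcome is
deterministic.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedMapRelease {

    /** Hypothetical stand-in for an IndexAccessor; it only counts releases. */
    static class Accessor {
        int released = 0;
        void release(Object searcher) { released++; }
    }

    /** Two requests record the same cached searcher in ONE shared map. */
    public static boolean secondLookupIsNullWithSharedMap() {
        Map<Object, Accessor> shared = new HashMap<Object, Accessor>();
        Object cachedSearcher = new Object();
        Accessor accessor = new Accessor();

        shared.put(cachedSearcher, accessor); // request 1 records its searcher
        shared.put(cachedSearcher, accessor); // request 2 overwrites the same key

        // Request 1 releases and removes the only entry for that searcher.
        shared.remove(cachedSearcher).release(cachedSearcher);
        // Request 2 now finds nothing; calling release() on this would be an NPE.
        return shared.remove(cachedSearcher) == null;
    }

    /** Each request keeps its own map, so both lookups succeed. */
    public static int releasesWithPerRequestMaps() {
        Object cachedSearcher = new Object();
        Accessor accessor = new Accessor();

        Map<Object, Accessor> first = new HashMap<Object, Accessor>();
        Map<Object, Accessor> second = new HashMap<Object, Accessor>();
        first.put(cachedSearcher, accessor);
        second.put(cachedSearcher, accessor);

        first.remove(cachedSearcher).release(cachedSearcher);
        second.remove(cachedSearcher).release(cachedSearcher);
        return accessor.released;
    }

    public static void main(String[] args) {
        System.out.println("shared map, second lookup is null: "
                + secondLookupIsNullWithSharedMap());
        System.out.println("per-request maps, releases: "
                + releasesWithPerRequestMaps());
    }
}
```

With one map per request, a release by one request cannot remove the
entry another request still needs, which is the point of creating a new
MultiIndexAccessor each time.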

A good tip is to make a simple page that prints out the Searcher/Writer
use counts. You can then check this page occasionally; if it appears
that there are Writers or Searchers stuck out, you know you have a
problem. Under normal circumstances that should not be possible, and it
indicates a bug somewhere.

I will post the fix shortly.

- Mark

vivek sar wrote:

> Mark,
>
>    There seems to be some issue with DefaultMultiIndexAccessor.java. I
> got following NPE exception,
>
>      2008-02-13 07:10:28,021 ERROR [http-7501-Processor6] ReportServiceImpl -
> java.lang.NullPointerException
>         at org.apache.lucene.indexaccessor.DefaultMultiIndexAccessor.release(DefaultMultiIndexAccessor.java:89)
>
> Looks like the IndexAccessor for one of the Searcher in the
> MultiSearcher returned null. Not sure how is that possible, any ideas
> how is that possible?
>
> In my case it caused a critical error as the writer thread was stuck
> forever (we found out after couple of days) because of this,
>
> "PS thread 9" prio=1 tid=0x00002aac70eb95d0 nid=0x6ba in Object.wait()
> [0x0000000047533000..0x0000000047533b80]
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00002aab3e5c7700> (a
> org.apache.lucene.indexaccessor.DefaultIndexAccessor)
>         at java.lang.Object.wait(Unknown Source)
>         at org.apache.lucene.indexaccessor.DefaultIndexAccessor.waitForReadersAndCloseCached(DefaultIndexAccessor.java:593)
>         at org.apache.lucene.indexaccessor.DefaultIndexAccessor.release(DefaultIndexAccessor.java:510)
>         - locked <0x00002aab3e5c7700> (a
> org.apache.lucene.indexaccessor.DefaultIndexAccessor)
>
> The only way to recover was to re-start the application.
>
> I use both MultiSearcher and IndexSearcher in my application, I've
> looked at your code but not able to pinpoint how can it go wrong? Of
> course, you do have to check for null in the
> MultiIndexAccessor.release, but how could you get null index accessor
> at first place?
>
> I do call IndexAccessor.close during partitioning of indexes, but the
> close should wait for all Searchers to close before doing anything.
>
> Do you have any updates to your code since 02/04/2008?
>
> Thanks,
> -vivek
>

Re: DefaultIndexAccessor

Mark Miller-3
In reply to this post by vivek sar
Here is the fix: https://issues.apache.org/jira/browse/LUCENE-1026

Re: DefaultIndexAccessor

vivek sar
In reply to this post by Mark Miller-3
Mark,

  Here is the scenario in which I saw this exception:

1) A search was run that uses MultiSearcher. This search took more
than 3 minutes to complete (due to index size and multiple indices).
2) Just a minute after the search started, we started writing (in
a separate thread) to one of the indexes being searched.
3) The Writer finished in a few seconds and called writer.release.
4) Since the MultiSearcher was still running, writer.release
waited for all MultiSearchers to complete.
5) The MultiSearcher finally completed and called
MultiIndexAccessor.release for the MultiSearcher.
6) At this point, for some reason, the MultiSearcher release threw an
NPE. I'm not sure whether the NPE was on the index being updated by the
writer or on some other index.
7) Because of the NPE, the index the writer was writing to never got
released, and the Writer got stuck.

I haven't been able to reproduce this again.

Is IndexAccessor-02.07.2008.zip
(https://issues.apache.org/jira/browse/LUCENE-1026) the most up-to-date
code you have? You mentioned this about the new jar:

"Releasing a Writer never blocks for a reopen now - so after adding a
doc it may be a second or two before its visible to new Searchers"

 Do you think this would help the case I ran into where the writer was
stuck because of the searcher release?

 I think you may still want to check for null in the
MultiIndexAccessor.release(), i.e.,

    public synchronized void release(Searcher multiSearcher) {
      Searchable[] searchers = ((MultiSearcher) multiSearcher).getSearchables();
      IndexAccessor accessor = null;
      for (Searchable searchable : searchers) {
        if (searchable != null) {
          accessor = multiSearcherAccessors.remove(searchable);
          if (accessor != null) {
            accessor.release((Searcher) searchable);
          }
        }
      }
    }

Thanks,
-vivek

On Feb 15, 2008 5:02 PM, Mark Miller <[hidden email]> wrote:

> Hey vivek, sorry to hear you are having problems.
>
> I am trying to figure out how you may be seeing this problem. The
> IndexAccessor cannot return null because you would get an
> IllegalStateException not a NullPointerException. Also, the released
> MultiSearcher cannot be null because the Exception would have been
> thrown sooner. Releasing a null Searcher throws no Exception. So a
> possibility is that you are returning a foreign MultiSearcher?
>
> Unlikely, but I don't see anything else at the moment.
>
> The MultiSearcher code is really pretty simple and actually recreates a
> MultiSearcher on every request...it did not appear to be worth it to
> coordinate closed sub Accessors with a cache for the MultiSearcher (I
> wrote the code at one point, and later got rid of it). So really the
> MultiSearcher is just a simple class that gets cached sub Searchers for
> each index and creates a one time use MultiSearcher. A simple cache is
> kept around that identifies which Accessor needs to release which sub
> Searcher. It's all rather simple, and I am struggling to see another
> possibility beyond returning a foreign MultiSearcher somehow.
>
> I will keep looking and keep you posted. In the mean time, do you have
> any other data or code snippets to share?
>
>

Re: DefaultIndexAccessor

vivek sar
In reply to this post by Mark Miller-3
Mark,

Thanks for the quick fix. Actually, it is possible that there might
have been simultaneous queries using the MultiSearcher. I assumed it
was thread-safe and so was re-using the same instance. I'll update my
application code as well.

Thanks,
-vivek

On Feb 15, 2008 5:56 PM, Mark Miller <[hidden email]> wrote:

> Here is the fix: https://issues.apache.org/jira/browse/LUCENE-1026
>
>

Re: DefaultIndexAccessor

vivek sar
Mark,

  We deployed our indexer (using DefaultIndexAccessor) on one of the
production sites and are getting this error:

Caused by: java.util.concurrent.RejectedExecutionException
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
        at org.apache.lucene.indexaccessor.DefaultIndexAccessor.release(DefaultIndexAccessor.java:514)


This is happening repeatedly every time the indexer runs.

This is running your latest IndexAccessor-021508 code.  Any ideas
(it's kind of urgent for us)?

Thanks,
-vivek


On Fri, Feb 15, 2008 at 6:50 PM, vivek sar <[hidden email]> wrote:

> Mark,
>
>  Thanks for the quick fix. Actually, it is possible that there might
>  had been simultaneous queries using the MultiSearcher. I assumed it
>  was thread-safe, thus was re-using the same instance. I'll update my
>  application code as well.
>
>  Thanks,
>  -vivek
>
>
>

Re: DefaultIndexAccessor

vivek sar
Mark,

 Some more information:

      1) I run the IndexWriter every 5 minutes.
      2) After every cycle I check whether I need to partition (based
         on the index size).
      3) In the partition interface,
            a) I first call close on the index accessor (so all the
               searchers can close before I move that index):
                  accessor = IndexAccessorFactory.getInstance().getAccessor(dir.getFile());
                  accessor.close();
            b) Then I re-open the index accessor:
                  accessor = indexFactory.getAccessor(dir.getFile());
                  accessor.open();
            c) I optimize my indexes using the IndexWriter (that I get
               from the accessor):
                  masterWriter = this.indexAccessor.getWriter(false);
                  masterWriter.optimize(optimizeSegment);
            d) Once the optimization is done I release the masterWriter:
                  this.indexAccessor.release(masterWriter);

         Now here is where I get the "RejectedExecutionException".
Reading up a little more on this exception,
http://pveentjer.wordpress.com/2008/02/06/are-you-dealing-with-the-rejectedexecutionexception/,
I see this might be happening because something got stuck during the
close cycle, so the ExecutorService is not accepting any new tasks.
I'm not sure how this would happen.

The critical problem is that once I get this exception, every release
call throws the same exception (it looks like the shutdown never
completes). Because of this my readers are never refreshed and I cannot
read any new indexes.

Maybe I have to check whether the accessor is completely closed before
re-opening? Could you check in your release whether the pool
(ExecutorService) is in the shutdown state? Anything else I can check?
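
For reference, the behaviour in question can be reproduced with the JDK
alone. The sketch below is independent of the IndexAccessor code, and
`ShutdownRejects` and `submitIfRunning` are made-up names: it shows that
a thread pool rejects every task submitted after shutdown(), and what an
isShutdown() guard before submitting might look like.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

public class ShutdownRejects {

    /** Returns true if a pool that has been shut down rejects a new task. */
    public static boolean rejectsAfterShutdown() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.shutdown(); // no new tasks are accepted from here on
        try {
            pool.execute(new Runnable() {
                public void run() { }
            });
            return false;
        } catch (RejectedExecutionException e) {
            return true; // the default AbortPolicy throws on rejection
        }
    }

    /** Guarded variant: returns false instead of submitting to a shut-down pool. */
    public static boolean submitIfRunning(ExecutorService pool, Runnable task) {
        if (pool.isShutdown()) {
            return false;
        }
        pool.execute(task);
        return true;
    }

    public static void main(String[] args) {
        System.out.println("rejected after shutdown: " + rejectsAfterShutdown());

        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.shutdown();
        boolean submitted = submitIfRunning(pool, new Runnable() {
            public void run() { }
        });
        System.out.println("submitted to shut-down pool: " + submitted);
    }
}
```

Note that the guard is only a check-then-act: another thread could shut
the pool down between the isShutdown() call and execute(), so a caller
would still need to handle RejectedExecutionException or hold a lock.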

Thanks,
-vivek

On Thu, Feb 28, 2008 at 1:26 PM, vivek sar <[hidden email]> wrote:

> Mark,
>
>   We deployed our indexer (using defaultIndexAccessor) on one of the
>  production site and getting this error,
>
>  Caused by: java.util.concurrent.RejectedExecutionException
>         at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown
>  Source)
>         at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
>         at org.apache.lucene.indexaccessor.DefaultIndexAccessor.release(DefaultIndexAccessor.java:514)
>
>
>  This is happening repeatedly every time the indexer runs.
>
>  This is running your latest IndexAccessor-021508 code.  Any ideas
>  (it's kind of urgent for us)?
>
>  Thanks,
>  -vivek
>
>
>
>

Re: DefaultIndexAccessor

Mark Miller-3
Hey vivek,

Sorry you ran into this. I believe the problem is that I had just not
foreseen the use case of closing and then reopening the Accessor. The
only time I ever close the Accessors is when I am shutting down the JVM.

What do you do about all of the IndexAccessor requests while it is in a
closed state? Could there be a better way of accomplishing this without
closing the Accessor? Would a new method that just stalled everything be
better? Then you possibly wouldn't have to recreate any resources.

In any case, the problem is that after the Executor gets shut down it is
not reopened in the open method. I can certainly change this, but I need
to look for any other issues as well. I will add an open-after-shutdown
test to investigate. I am going to think about the issue further and I
will get back to you soon.
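
As a rough sketch of the kind of change this implies (not the code that
will go into the patch; `ReopenablePool` and its methods are
hypothetical names), open could create a fresh executor whenever the
previous one has been shut down:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ReopenablePool {

    private ExecutorService pool; // null until open() is first called

    /** Creates a fresh executor if there is none or the old one was shut down. */
    public synchronized void open() {
        if (pool == null || pool.isShutdown()) {
            pool = Executors.newCachedThreadPool();
        }
    }

    /** Stops accepting tasks; tasks already submitted still run to completion. */
    public synchronized void close() {
        if (pool != null) {
            pool.shutdown();
        }
    }

    /** Submits a task; throws IllegalStateException if the pool is not open. */
    public synchronized Future<?> submit(Runnable task) {
        if (pool == null || pool.isShutdown()) {
            throw new IllegalStateException("not open");
        }
        return pool.submit(task);
    }

    public static void main(String[] args) throws Exception {
        ReopenablePool p = new ReopenablePool();
        p.open();
        p.close();
        p.open(); // a new executor replaces the one that was shut down
        p.submit(new Runnable() {
            public void run() {
                System.out.println("task ran after reopen");
            }
        }).get(); // wait for the task to finish
        p.close();
    }
}
```

An ExecutorService cannot be restarted once shutdown() has been called,
so replacing it is the only option if close followed by open is to be
supported.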

Thanks for all of the details.

- Mark

vivek sar wrote:

> Mark,
>
>  Some more information,
>
>       1) I run indexwriter every 5 mins
>       2) After every cycle I check if I need to partition (based on
> the index size)
>       3) In the partition interface,
>             a)  I first call close on the index accessor (so all the
> searchers can close before I move that index)
>                           accessor =
> IndexAccessorFactory.getInstance().getAccessor(dir.getFile());
>                           accessor.close();
>             b) Then I re-open the index accessor,
>                            accessor = indexFactory.getAccessor(dir.getFile());
>                            accessor.open();
>             c) I optimized the my indexes using the Index Writer (that
> I get from the accessor).
>                            masterWriter = this.indexAccessor.getWriter(false);
>                            masterWriter.optimize(optimizeSegment);
>             d) Once the optimization is done I release the masterWriter,
>                             this.indexAccessor.release(masterWriter);
>
>          Now here is where I get the "RejectedExecutionException".
> Reading up a little more on this exception,
> http://pveentjer.wordpress.com/2008/02/06/are-you-dealing-with-the-rejectedexecutionexception/,
> I see this might be happening because something got stuck during the
> close cycle, so the ExecutorService is not accepting any new tasks.
> I'm not sure how this would happen.
>
> The critical problem is that once I get this exception, every release call
> throws the same exception (it looks like the shutdown never completes).
> Because of this, my readers are never refreshed and I cannot read any
> new indexes.
>
> Maybe I have to check whether the accessor is completely closed before
> re-opening? Could you check in your release whether the pool
> (ExecutorService) is in a shutdown state? Anything else I can check?
>
> Thanks,
> -vivek
>
> On Thu, Feb 28, 2008 at 1:26 PM, vivek sar <[hidden email]> wrote:
>  
>> Mark,
>>
>>   We deployed our indexer (using DefaultIndexAccessor) on one of our
>>  production sites and are getting this error:
>>
>>  Caused by: java.util.concurrent.RejectedExecutionException
>>         at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source)
>>         at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
>>         at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
>>         at org.apache.lucene.indexaccessor.DefaultIndexAccessor.release(DefaultIndexAccessor.java:514)
>>
>>
>>  This is happening repeatedly every time the indexer runs.
>>
>>  This is running your latest IndexAccessor-021508 code.  Any ideas
>>  (it's kind of urgent for us)?
>>
>>  Thanks,
>>  -vivek
>>
>>
>>
>>
>>  On Fri, Feb 15, 2008 at 6:50 PM, vivek sar <[hidden email]> wrote:
>>  > Mark,
>>  >
>>  >  Thanks for the quick fix. Actually, it is possible that there might
>>  >  have been simultaneous queries using the MultiSearcher. I assumed it
>>  >  was thread-safe, and thus was re-using the same instance. I'll update my
>>  >  application code as well.
>>  >
>>  >  Thanks,
>>  >  -vivek
>>  >
>>  >
>>  >
>>  >  On Feb 15, 2008 5:56 PM, Mark Miller <[hidden email]> wrote:
>>  >  > Here is the fix: https://issues.apache.org/jira/browse/LUCENE-1026
>>  >  >
>>  >  >
>>  >  > vivek sar wrote:
>>  >  > > Mark,
>>  >  > >
>>  >  > >    There seems to be some issue with DefaultMultiIndexAccessor.java. I
>>  >  > > got the following NPE:
>>  >  > >
>>  >  > >      2008-02-13 07:10:28,021 ERROR [http-7501-Processor6] ReportServiceImpl -
>>  >  > > java.lang.NullPointerException
>>  >  > >         at org.apache.lucene.indexaccessor.DefaultMultiIndexAccessor.release(DefaultMultiIndexAccessor.java:89)
>>  >  > >
>>  >  > > It looks like the IndexAccessor for one of the Searchers in the
>>  >  > > MultiSearcher returned null. Any ideas how that is possible?
>>  >  > >
>>  >  > > In my case it caused a critical error, as the writer thread was stuck
>>  >  > > forever (we found out after a couple of days) because of this:
>>  >  > >
>>  >  > > "PS thread 9" prio=1 tid=0x00002aac70eb95d0 nid=0x6ba in Object.wait() [0x0000000047533000..0x0000000047533b80]
>>  >  > >         at java.lang.Object.wait(Native Method)
>>  >  > >         - waiting on <0x00002aab3e5c7700> (a org.apache.lucene.indexaccessor.DefaultIndexAccessor)
>>  >  > >         at java.lang.Object.wait(Unknown Source)
>>  >  > >         at org.apache.lucene.indexaccessor.DefaultIndexAccessor.waitForReadersAndCloseCached(DefaultIndexAccessor.java:593)
>>  >  > >         at org.apache.lucene.indexaccessor.DefaultIndexAccessor.release(DefaultIndexAccessor.java:510)
>>  >  > >         - locked <0x00002aab3e5c7700> (a org.apache.lucene.indexaccessor.DefaultIndexAccessor)
>>  >  > >
>>  >  > > The only way to recover was to re-start the application.
>>  >  > >
>>  >  > > I use both MultiSearcher and IndexSearcher in my application. I've
>>  >  > > looked at your code but am not able to pinpoint how it can go wrong. Of
>>  >  > > course, you do have to check for null in
>>  >  > > MultiIndexAccessor.release, but how could you get a null index accessor
>>  >  > > in the first place?
>>  >  > >
>>  >  > > I do call IndexAccessor.close during partitioning of indexes, but the
>>  >  > > close should wait for all Searchers to close before doing anything.
>>  >  > >
>>  >  > > Do you have any updates to your code since 02/04/2008?
>>  >  > >
>>  >  > > Thanks,
>>  >  > > -vivek
>>  >  > >
>>  >  > > On Feb 6, 2008 8:37 AM, Jay <[hidden email]> wrote:
>>  >  > >
>>  >  > >> Thanks for your clarifications, Mark!
>>  >  > >>
>>  >  > >>
>>  >  > >> Jay
>>  >  > >>
>>  >  > >>
>>  >  > >> Mark Miller wrote:
>>  >  > >>
>>  >  > >>>> 5. Although IndexSearcher.close() currently does almost nothing except
>>  >  > >>>> close the internal index reader, it might be safer to close the
>>  >  > >>>> searcher itself as well in closeCachedSearcher(), just in case the
>>  >  > >>>> searcher has other resources to release in a future version of
>>  >  > >>>> Lucene.
>>  >  > >>>>
>>  >  > >>> Didn't catch that "as well". You are right, great idea Jay, thanks.
>>  >  > >>>
>>  >  > >>> ---------------------------------------------------------------------
>>  >  > >>> To unsubscribe, e-mail: [hidden email]
>>  >  > >>> For additional commands, e-mail: [hidden email]
>>  >  > >>>
>>  >  > >
>>  >  >
>>  >
>>
>>    
>

