Re: DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

Re: DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

Otis Gospodnetic-2
I didn't follow this closely, but are you saying that
LuceneIndexAccessor then replaces the IOExceptions caused by locking with
blocking calls?  It sounds like the client of LuceneIndexAccessor still
needs to keep track of open IndexReaders, IndexWriters, etc., or else
one can end up with a hard-to-track blocked call somewhere in the code,
no?  It would be nice to see how this works via a unit test.

Thanks for the contribution!

Otis


> http://issues.apache.org/bugzilla/show_bug.cgi?id=34995
>
> ------- Additional Comments From
> [hidden email]  2005-05-22 14:52 -------
> So having said that, the component's purpose can be summarized as
> enforcing the IndexReader/IndexWriter/Searcher/Directory usage matrix
> (you cannot delete while adding documents etc.)


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

Maik Schreiber
> I didn't follow this closely, but are you saying that
> LuceneIndexAccessor then replaces the IOExceptions caused by locking with
> blocking calls?  It sounds like the client of LuceneIndexAccessor still
> needs to keep track of open IndexReaders, IndexWriters, etc., or else
> one can end up with a hard-to-track blocked call somewhere in the code,
> no?  It would be nice to see how this works via a unit test.

Please see http://issues.apache.org/bugzilla/show_bug.cgi?id=34995#c3

The client just does getWriter(), getReader() or whatever it wishes, then
uses release() to give the instances back. The index accessor is responsible
for synchronizing access such that no two writers can be open at the same
time etc.
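
A minimal sketch of the acquire/release discipline described above. This is
not the LuceneIndexAccessor code: AccessGate, AccessMode, tryAcquire() and
holders() are hypothetical names, and no Lucene classes are involved. Any
number of callers may hold the gate in the same mode, while a caller asking
for the conflicting mode has to wait until every holder has released.

```java
// Hypothetical stand-in for the accessor's bookkeeping; not the LuceneIndexAccessor code.
enum AccessMode { ADD, DELETE }

final class AccessGate {
    private AccessMode active; // null while nobody holds the gate
    private int holders;       // acquisitions that have not been released yet

    // Non-blocking attempt: succeeds if the gate is free or already in the same mode.
    synchronized boolean tryAcquire(AccessMode mode) {
        if (holders > 0 && active != mode) {
            return false;
        }
        active = mode;
        holders++;
        return true;
    }

    // Blocking variant: waits until the conflicting mode has been fully released.
    synchronized void acquire(AccessMode mode) throws InterruptedException {
        while (!tryAcquire(mode)) {
            wait();
        }
    }

    synchronized void release() {
        if (holders == 0) {
            throw new IllegalStateException("release() without matching acquire()");
        }
        if (--holders == 0) {
            active = null;
            notifyAll(); // wake threads waiting for the other mode
        }
    }

    synchronized int holders() {
        return holders;
    }
}

class AccessGateDemo {
    public static void main(String[] args) throws InterruptedException {
        AccessGate gate = new AccessGate();
        gate.acquire(AccessMode.ADD);
        gate.acquire(AccessMode.ADD); // a second adder is admitted immediately
        System.out.println(gate.tryAcquire(AccessMode.DELETE)); // false
        gate.release();
        gate.release();
        System.out.println(gate.tryAcquire(AccessMode.DELETE)); // true
        gate.release();
    }
}
```

In client code the release() call would normally sit in a finally block, so
that an exception while indexing cannot leave other threads blocked. Note
that this simple gate gives no fairness guarantee: a steady stream of adders
can keep a deleter waiting.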

--
Maik Schreiber   *   http://www.blizzy.de

GPG public key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x1F11D713
Key fingerprint: CF19 AFCE 6E3D 5443 9599 18B5 5640 1F11 D713


Re: DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

Otis Gospodnetic-2
In reply to this post by Otis Gospodnetic-2
Hi Maik,

So what happens in this case:

IndexAccessProvider accessProvider = new IndexAccessProvider(directory,
analyzer);
LuceneIndexAccessor accessor = new LuceneIndexAccessor(accessProvider);

accessor.open();
               
IndexWriter writer = accessor.getWriter();
// reference to the same instance?
IndexWriter writer2 = accessor.getWriter();
writer.addDocument(....);
writer2.addDocument(....);

// I didn't release the writer yet
// will this block?
IndexReader reader = accessor.getReader();
reader.delete(....);


It looks like you answered that in Bugzilla already.  But doesn't that
make management of IndexWriter/IndexReader as "involved" as doing it
directly?  Is the idea that this shields the user from needing to do
his own synchronization?  Actually, I can see how working with this
code can be simpler than dealing with details yourself.

Is there a way to time out the blocking getWriter/Reader()?

To deal with managing index-modifying access to the index I often use
code that acts as a facade to IndexReader/Writer and provides methods
such as index(....), optimize(), and delete(....).  All of these
methods have index-modifying code inside a "synchronized
(_myDirInstanceHere)" block.  And that seems to work without the need
to open, close, and release explicitly.
I guess such code doesn't give you direct access to IndexReader/Writer,
so that's the drawback.... and I guess the synchronization is really
blocking as well. :)

Otis





Re: DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

Maik Schreiber
> IndexWriter writer = accessor.getWriter();
> // reference to the same instance?
> IndexWriter writer2 = accessor.getWriter();
> writer.addDocument(....);
> writer2.addDocument(....);

Yes, regardless of which thread invokes getWriter(). This means multiple
threads are concurrently able to add new documents.

> // I didn't release the writer yet
> // will this block?
> IndexReader reader = accessor.getReader();
> reader.delete(....);

If you've invoked getReader(true), then it will block. With getReader(false)
it won't block, but the Reader should not be used for write access in that case.

> But doesn't that
> make management of IndexWriter/IndexReader as "involved" as doing it
> directly?  Is the idea that this shields the user from needing to do
> his own synchronization?  Actually, I can see how working with this
> code can be simpler than dealing with details yourself.

You've just answered that yourself. LuceneIndexAccessor relieves you of the
burden of synchronizing index access between multiple threads. Granted, in a
single-threaded environment there might not be much use for it.

> Is there a way to time out the blocking getWriter/Reader()?

Not yet, but the code can be easily modified.
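
One way such a timeout could look, as a sketch only: TimedAccess and its
acquire(timeout, unit) method are hypothetical and not part of the
contribution. The sketch guards a single exclusive permit with
java.util.concurrent.Semaphore and gives up once the timeout has elapsed
instead of blocking indefinitely.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: exclusive access with a bounded wait.
final class TimedAccess {
    // One permit; "true" requests fair ordering, so waiters are served first-come, first-served.
    private final Semaphore permit = new Semaphore(1, true);

    // Returns true if access was granted within the timeout, false otherwise.
    boolean acquire(long timeout, TimeUnit unit) {
        try {
            return permit.tryAcquire(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    // Must be called exactly once per successful acquire(); the semaphore does not check this.
    void release() {
        permit.release();
    }
}

class TimedAccessDemo {
    public static void main(String[] args) {
        TimedAccess access = new TimedAccess();
        System.out.println(access.acquire(100, TimeUnit.MILLISECONDS)); // true: nobody holds the permit
        System.out.println(access.acquire(100, TimeUnit.MILLISECONDS)); // false: gives up after the timeout
        access.release();
        System.out.println(access.acquire(100, TimeUnit.MILLISECONDS)); // true: the permit is free again
    }
}
```

A caller that gets false back can log the failure or retry, which makes a
forgotten release() easier to track down than a call that never returns.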

> I guess such code doesn't give you direct access to IndexReader/Writer,
> so that's the drawback.... and I guess the synchronization is really
> blocking as well. :)

Yes, I think your approach is more prone to contention. LuceneIndexAccessor
tries to avoid contention where possible.

--
Maik Schreiber   *   http://www.blizzy.de

GPG public key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x1F11D713
Key fingerprint: CF19 AFCE 6E3D 5443 9599 18B5 5640 1F11 D713


Re: DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

Daniel Naber
On Sunday 22 May 2005 21:01, Maik Schreiber wrote:

> Yes, regardless of which thread invokes getWriter(). This means multiple
> threads are concurrently able to add new documents.

Isn't that already possible without any accessor class (you need to use
the same IndexWriter for all your threads)?

Regards
 Daniel

--
http://www.danielnaber.de


Re: DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

Maik Schreiber
> Isn't that already possible without any accessor class (you need to use
> the same IndexWriter for all your threads)?

Yes, but you also need to keep track of who's using the writer before you
can close it. Additionally, closing a writer yourself doesn't make sure that
cached readers and searchers also get closed. This is a key feature of
LuceneIndexAccessor: it caches readers, searchers and writers as long as
possible, but closes and reopens them when necessary.
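
The caching behaviour can be illustrated with a small reference-counting
sketch. It is an assumption about the general technique, not the
contributed code: IndexResource, CachedResource and invalidate() are
hypothetical names. The cached instance is handed out to every caller,
closed only after it has been marked stale and the last user has released
it, and reopened on the next request.

```java
import java.util.function.Supplier;

// Hypothetical resource type standing in for a reader, searcher or writer.
interface IndexResource {
    void close();
}

final class CachedResource<T extends IndexResource> {
    private final Supplier<T> opener; // how to open a fresh instance
    private T current;                // cached instance, or null when closed
    private int users;                // get() calls that have not been released yet
    private boolean stale;            // true once the cached instance is outdated

    CachedResource(Supplier<T> opener) {
        this.opener = opener;
    }

    // Hands out the cached instance, opening one first if none is cached.
    synchronized T get() {
        if (current == null) {
            current = opener.get();
            stale = false;
        }
        users++;
        return current;
    }

    synchronized void release() {
        if (users == 0) {
            throw new IllegalStateException("release() without matching get()");
        }
        users--;
        closeIfStaleAndUnused();
    }

    // Called after the index has changed; the close is deferred while users remain.
    synchronized void invalidate() {
        stale = true;
        closeIfStaleAndUnused();
    }

    private void closeIfStaleAndUnused() {
        if (stale && users == 0 && current != null) {
            current.close();
            current = null;
        }
    }
}

class CachedResourceDemo {
    public static void main(String[] args) {
        CachedResource<IndexResource> cache = new CachedResource<>(() -> {
            System.out.println("open");
            return () -> System.out.println("close");
        });
        cache.get();        // prints "open"
        cache.get();        // reuses the cached instance
        cache.invalidate(); // nothing is closed yet: two users remain
        cache.release();
        cache.release();    // last user gone, prints "close"
        cache.get();        // prints "open" again
        cache.release();
    }
}
```

As a simplification, callers arriving between invalidate() and the final
release() still receive the stale instance; a fuller version would hand
them a fresh one and retire the old instance separately.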

--
Maik Schreiber   *   http://www.blizzy.de

GPG public key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x1F11D713
Key fingerprint: CF19 AFCE 6E3D 5443 9599 18B5 5640 1F11 D713


Re: DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

Daniel Naber
In reply to this post by Otis Gospodnetic-2
On Sunday 22 May 2005 20:17, Otis Gospodnetic wrote:

> To deal with managing index-modifying access to the index I often use
> code that acts as a facade to IndexReader/Writer and provides methods
> such as index(....), optimize(), and delete(....).  All of these
> methods have index-modifying code inside a "synchronized
> (_myDirInstanceHere)" block.

That sounds quite useful. Any chance you can contribute this to Lucene?

Regards
 Daniel

--
http://www.danielnaber.de


Re: DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

Otis Gospodnetic-2
In reply to this post by Otis Gospodnetic-2
What I described is really nothing more than a few methods like this:

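    // Adds one item to the index. Every index-modifying method locks on the
    // same Directory instance, so only one of them runs at a time.
    public void index(Indexable data)
        throws IOException
    {
        synchronized(_directory)
        {
            // getFSWriter() and createDocument() are helpers not shown here
            IndexWriter writer = getFSWriter();
            try
            {
                Document doc = createDocument(
                    data.getUnStoredFields(),
                    data.getTextFields(),
                    data.getKeywordFields());
                writer.addDocument(doc);
            }
            finally
            {
                // close even if adding fails, so the writer is never left open
                writer.close();
            }
        }
    }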

Maik's contribution sounds better, and I'll likely adopt it in my
projects.

Otis


