[jira] Commented: (LUCENE-2312) Search on IndexWriter's RAM Buffer

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891760#action_12891760 ]

Jason Rutherglen commented on LUCENE-2312:
------------------------------------------

This Wikipedia article shows how a linked list can be built from parallel arrays, an approach we could use for the terms dictionary: http://en.wikipedia.org/wiki/Linked_list#Linked_lists_using_arrays_of_nodes

The next and previous arrays would be AtomicIntegerArrays for concurrency (we may not need the previous array, because the TermsEnum only iterates forward). If we can implement a concurrent skip list on top of the concurrent linked list, we're probably good to go.
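
A minimal Java sketch of the parallel-array linked list idea, to make the layout concrete. It is not Lucene code. The class name ParallelArrayTermList, its methods, the fixed capacity, the use of String for terms, and the single-writer, many-readers model are all assumptions made for illustration. Only the next array is kept, since iteration is forward-only, and no skip list is layered on top.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicIntegerArray;

/**
 * Hypothetical sketch (not Lucene code): a sorted, singly linked list of terms
 * stored in parallel arrays. Assumes one writer thread and any number of readers.
 */
class ParallelArrayTermList {
    private static final int NIL = -1;  // marks the end of the list
    private static final int HEAD = 0;  // slot 0 is a sentinel that holds no term

    private final String[] terms;           // terms[i] is the term held by node i
    private final AtomicIntegerArray next;  // next.get(i) is the slot of node i's successor
    private int used = 1;                   // next free slot; only the writer touches this

    ParallelArrayTermList(int capacity) {
        terms = new String[capacity + 1];
        next = new AtomicIntegerArray(capacity + 1);
        next.set(HEAD, NIL);
    }

    /** Inserts a term in sorted order; returns false if it is already present. */
    boolean add(String term) {
        int pred = HEAD;
        int curr = next.get(HEAD);
        while (curr != NIL && terms[curr].compareTo(term) < 0) {
            pred = curr;
            curr = next.get(curr);
        }
        if (curr != NIL && terms[curr].equals(term)) {
            return false;
        }
        if (used == terms.length) {
            throw new IllegalStateException("term list is full");
        }
        int node = used++;
        terms[node] = term;    // fill in the node before it becomes reachable
        next.set(node, curr);  // link the new node to its successor
        next.set(pred, node);  // volatile write: publishes the node to readers
        return true;
    }

    /** Forward-only traversal, the access pattern a TermsEnum-style reader needs. */
    List<String> termsInOrder() {
        List<String> out = new ArrayList<>();
        for (int i = next.get(HEAD); i != NIL; i = next.get(i)) {
            out.add(terms[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        ParallelArrayTermList list = new ParallelArrayTermList(8);
        list.add("lucene");
        list.add("apache");
        list.add("solr");
        System.out.println(list.termsInOrder());  // prints [apache, lucene, solr]
    }
}
```

In this sketch the writer fills in a node completely and only then links it in through a volatile write to the predecessor's next slot, so a reader that follows next either sees the whole node or does not see it at all. Supporting several concurrent writers would need compareAndSet on the next array plus a retry loop, which is left out here.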

> Search on IndexWriter's RAM Buffer
> ----------------------------------
>
>                 Key: LUCENE-2312
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2312
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Search
>    Affects Versions: 3.0.1
>            Reporter: Jason Rutherglen
>            Assignee: Michael Busch
>             Fix For: Realtime Branch
>
>
> In order to offer users near-realtime search without incurring
> an indexing performance penalty, we can implement search on
> IndexWriter's RAM buffer. This is the buffer that is filled in
> RAM as documents are indexed. Currently the RAM buffer is
> flushed to the underlying directory (usually disk) before being
> made searchable.
> Today's Lucene-based NRT systems must incur the cost of merging
> segments, which can slow indexing.
> Michael Busch has good suggestions regarding how to handle deletes using max doc ids.  
> https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841923&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841923
> The area that isn't fully fleshed out is the terms dictionary,
> which needs to be sorted prior to queries executing. Currently
> IW implements a specialized hash table. Michael B has a
> suggestion here:
> https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841915&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841915
