[jira] Created: (SOLR-52) Lazy Field loading


Re: Re: [jira] Updated: (SOLR-52) Lazy Field loading

Yonik Seeley-2
On 11/15/06, Mike Klaas <[hidden email]> wrote:
> Any objections to sync'ing solr with lucene trunk?  It might be nice
> from an impact perspective to do so before lockless commits are
> committed.

+1, my thoughts as well.

-Yonik

Re: [jira] Updated: (SOLR-52) Lazy Field loading

Michael McCandless-2
Yonik Seeley wrote:
> On 11/15/06, Mike Klaas <[hidden email]> wrote:
>> Any objections to sync'ing solr with lucene trunk?  It might be nice
>> from an impact perspective to do so before lockless commits are
>> committed.
>
> +1, my thoughts as well.

Hi!  Should I hold off on committing lockless until you've done
a sync for Solr to the Lucene trunk?  Is this done now?

Mike

Re: [jira] Updated: (SOLR-52) Lazy Field loading

Yonik Seeley-2
On 11/17/06, Michael McCandless <[hidden email]> wrote:
> Hi!  Should I hold off on committing lockless until you've done
> a sync for Solr to the Lucene trunk?  Is this done now?

We already grabbed a nightly build of Lucene.  Go for it!

-Yonik

[jira] Commented: (SOLR-52) Lazy Field loading

Mihir Sharma (Jira)
    [ http://issues.apache.org/jira/browse/SOLR-52?page=comments#action_12456164 ]
           
Hoss Man commented on SOLR-52:
------------------------------

Mike, how do you feel about these changes so far?

FWIW: if you still aren't comfortable resolving, you should probably mark this issue as "In Progress" so it stays on the radar.

> Lazy Field loading
> ------------------
>
>                 Key: SOLR-52
>                 URL: http://issues.apache.org/jira/browse/SOLR-52
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Mike Klaas
>         Assigned To: Mike Klaas
>            Priority: Minor
>         Attachments: lazyfields_patch.diff, lazyfields_patch.diff, lazyfields_patch.diff, lazyfields_patch.diff
>
>
> Add lazy field loading to Solr.
> Currently Solr reads all stored fields and then filters out the undesired fields based on the field list.  This is usually not a performance concern, but when using Solr to store large numbers of fields, or just one large field (doc contents, e.g. for highlighting), the cost is perceptible.
> Now, there is a concern with the doc cache of SolrIndexSearcher, which assumes it has the whole document in the cache.  To maintain this invariant, all the fields in a document are still loaded in a searcher.doc(i) call.  However, if a field set is given to the method, only the given fields are loaded directly, while the rest are loaded lazily.
> Some concerns about lazy field loading:
>   1. Lazy fields are only valid while the IndexReader is open.  I believe this is fine since the IndexReader is kept alive by the SolrIndexSearcher, so all docs in the cache have the reader available.
>   2. It is slower to read a field lazily and retrieve its value later than to retrieve it directly to begin with (though I don't know how much; it depends on I/O factors).  We certainly don't want this to be the common case.  I added an optional call which accumulates all the fields likely to be used in the request (highlighting, response writing), and populates the IndexSearcher cache a priori.  This has the added advantage of concentrating doc retrieval in a single place, which is nice from a performance testing perspective.
>   3. LazyFields are incompatible with the sundry Field declarations scattered about Solr.  I believe I've changed all the necessary locations to Fieldable.
> Comments appreciated
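
For readers of the archive, the caching behaviour described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the patch: FakeStore, LazyValue and CachingSearcher are hypothetical names invented here, and no Lucene or Solr classes are used. It only shows the idea that a cached document always contains every field, with the requested fields read up front and the rest fetched on first access.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

public class LazyFieldSketch {

    /** Hypothetical stand-in for stored-field access through an open reader. */
    static class FakeStore {
        private final Map<Integer, Map<String, String>> docs = new HashMap<>();
        int reads = 0; // number of field values actually fetched from the store

        void add(int docId, Map<String, String> fields) {
            docs.put(docId, fields);
        }

        Set<String> fieldNames(int docId) {
            return docs.get(docId).keySet();
        }

        String read(int docId, String name) {
            reads++;
            return docs.get(docId).get(name);
        }
    }

    /** A field value that is either already loaded or fetched on first get(). */
    static class LazyValue {
        private Supplier<String> loader; // null once the value has been loaded
        private String value;

        static LazyValue loaded(String value) {
            LazyValue v = new LazyValue();
            v.value = value;
            return v;
        }

        static LazyValue deferred(Supplier<String> loader) {
            LazyValue v = new LazyValue();
            v.loader = loader;
            return v;
        }

        String get() {
            if (loader != null) {
                value = loader.get(); // first access: hit the store once
                loader = null;        // later accesses reuse the value
            }
            return value;
        }
    }

    /** Hypothetical searcher whose doc cache always holds whole documents. */
    static class CachingSearcher {
        private final FakeStore store;
        private final Map<Integer, Map<String, LazyValue>> docCache = new HashMap<>();

        CachingSearcher(FakeStore store) {
            this.store = store;
        }

        /** Fields in 'wanted' are read directly; all others are deferred. */
        Map<String, LazyValue> doc(int docId, Set<String> wanted) {
            return docCache.computeIfAbsent(docId, id -> {
                Map<String, LazyValue> doc = new LinkedHashMap<>();
                for (String name : store.fieldNames(id)) {
                    doc.put(name, wanted.contains(name)
                            ? LazyValue.loaded(store.read(id, name))
                            : LazyValue.deferred(() -> store.read(id, name)));
                }
                return doc;
            });
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        store.add(1, Map.of("id", "1", "title", "Lazy fields", "body", "large text"));
        CachingSearcher searcher = new CachingSearcher(store);

        Map<String, LazyValue> doc = searcher.doc(1, Set.of("id", "title"));
        System.out.println(store.reads); // 2: only id and title were read

        doc.get("body").get();
        System.out.println(store.reads); // 3: body was read on first access

        doc.get("body").get();
        System.out.println(store.reads); // 3: the loaded value is reused
    }
}
```

In this sketch a deferred field is only usable while the store behind it is still available, which mirrors concern 1 above about lazy fields being valid only while the IndexReader is open.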

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] Commented: (SOLR-52) Lazy Field loading

Mihir Sharma (Jira)
    [ http://issues.apache.org/jira/browse/SOLR-52?page=comments#action_12456173 ]
           
Mike Klaas commented on SOLR-52:
--------------------------------

Thanks for reminding me, Hoss.

Changes committed in r479793.  I will reopen if I discover problems.



[jira] Resolved: (SOLR-52) Lazy Field loading

Mihir Sharma (Jira)
     [ http://issues.apache.org/jira/browse/SOLR-52?page=all ]

Mike Klaas resolved SOLR-52.
----------------------------

    Resolution: Fixed

committed in r479793

