custom update handler questions

Erik Hatcher
What are the pros/cons to having a custom update handler that can  
accept just a unique id and data for specific fields of a document,  
such that all the fields from the existing document are picked up on  
the Solr side and updated with the ones sent by the client?

I'd like to rearrange the document design I'm using such that  
documents with large full-text fields can have certain small fields  
updated without having to send all the full-text across the wire.  Of  
course all fields would be stored to facilitate this.

Is this a big difficult deal, or reasonably implementable?

Thanks,
        Erik

Re: custom update handler questions

Yonik Seeley-2
On 9/12/06, Erik Hatcher <[hidden email]> wrote:
> What are the pros/cons to having a custom update handler that can
> accept just a unique id and data for specific fields of a document,
> such that all the fields from the existing document are picked up on
> the Solr side and updated with the ones sent by the client?

First, update handlers aren't at the same level as query handlers.
Things were designed for custom query handlers, but aren't really
designed for custom update handlers... that's tricky stuff.

> I'd like to rearrange the document design I'm using such that
> documents with large full-text fields can have certain small fields
> updated without having to send all the full-text across the wire.  Of
> course all fields would be stored to facilitate this.
>
> Is this a big difficult deal, or reasonably implementable?

You could implement your own update handler and delegate to
DirectUpdateHandler2 for the work, but that has some difficulties if
you want high update rates.  The main problem is which reader you get
the original stored fields from.  If it's not a newly opened reader,
you might be going back in time and grabbing old fields, because the
doc was already updated after that reader was opened.

So for good performance you would need to either:
 - Check the pending set to see if that doc has been updated, and if
   so, open a new reader
 OR
 - Keep a reader open and buffer all your adds (this is the reverse of
   the logic that DirectUpdateHandler2 employs by buffering deletes).
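
To make the stale-reader problem and the first option concrete, here is a
rough sketch in plain Java.  It does not use any Solr or Lucene classes:
PartialUpdateSketch, readerSnapshot, pending, reopenReader and the
checkPending flag are all hypothetical names made up for illustration, and
in-memory maps stand in for the index and the open reader.  It only shows
the idea of checking a pending set before merging stored fields, not how
DirectUpdateHandler2 actually works internally.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical simulation: maps stand in for the index and the reader.
class PartialUpdateSketch {
    // Latest written state of every document (stand-in for the index).
    private final Map<String, Map<String, String>> index = new HashMap<>();
    // Point-in-time view, refreshed only when the "reader" is reopened.
    private Map<String, Map<String, String>> readerSnapshot = new HashMap<>();
    // Ids added or updated since the reader was last opened.
    private final Set<String> pending = new HashSet<>();
    int readerReopens = 0;

    // Full-document add: replaces the doc and marks its id as pending.
    void add(String id, Map<String, String> fields) {
        index.put(id, new HashMap<>(fields));
        pending.add(id);
    }

    // Simulates opening a new reader (the expensive step).
    void reopenReader() {
        readerSnapshot = new HashMap<>(index);
        pending.clear();
        readerReopens++;
    }

    // Merge the changed fields into the stored fields and re-add the doc.
    // With checkPending=false the stored fields may come from a stale reader.
    Map<String, String> partialUpdate(String id, Map<String, String> changes,
                                      boolean checkPending) {
        if (checkPending && pending.contains(id)) {
            reopenReader(); // doc changed since the reader was opened
        }
        Map<String, String> stored = readerSnapshot.get(id);
        Map<String, String> merged =
            stored == null ? new HashMap<>() : new HashMap<>(stored);
        merged.putAll(changes);
        add(id, merged);
        return merged;
    }

    // One indexed document, visible to a freshly opened reader.
    static PartialUpdateSketch seeded() {
        PartialUpdateSketch s = new PartialUpdateSketch();
        s.add("doc1", Map.of("title", "A", "body", "large full text"));
        s.reopenReader();
        return s;
    }

    public static void main(String[] args) {
        for (boolean check : new boolean[] {false, true}) {
            PartialUpdateSketch s = seeded();
            s.partialUpdate("doc1", Map.of("title", "B"), check);
            Map<String, String> result =
                s.partialUpdate("doc1", Map.of("price", "5"), check);
            System.out.println("checkPending=" + check
                + " title=" + result.get("title")
                + " reopens=" + s.readerReopens);
        }
    }
}
```

Without the pending check, the second partial update reads the title from
the old snapshot and silently reverts it; with the check, the reader is
reopened only when the doc being updated is actually in the pending set.
The second option (keeping one reader open and buffering adds) would
instead look up stored fields in the buffer first and fall back to the
reader, which this sketch does not attempt.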

-Yonik