Solr's use of Lucene's Compression field


Grant Ingersoll-2
Thinking about
http://lucene.markmail.org/message/mef4cdo7m3s6i3fc?q=background+merge+exception
it occurred to me that we should probably refactor Solr's offering
of compression.  Currently, we rely on Field.Store.COMPRESS from Lucene,
but this really isn't considered best practice, because it only offers
the highest level of compression, which is also the slowest.  See
http://www.nabble.com/Need-Lucene-Compression-help----can-pay-nominal-fee-to11001907.html#a11013878

Obviously, Solr needs to handle the compression on the server side.  I  
think we should have Solr do the compression, allowing users to set  
the level of compression (maybe even make it pluggable to put in your  
own compression techniques) and then just use Lucene's binary field  
capability.  Granted, this is lower priority since I doubt many people
use compression to begin with, but it would still be useful.
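
To make the idea concrete, here is a minimal sketch of what server-side, level-configurable compression could look like. It uses only java.util.zip from the JDK. The names FieldCompressor and DeflateCompressor are hypothetical and exist only for this illustration; they are not existing Solr or Lucene classes. Storing the resulting byte[] in a Lucene binary stored field is left as a comment, since the exact Field constructor depends on the Lucene version.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class FieldCompressionSketch {

    /** Hypothetical plug-in point; not an existing Solr or Lucene interface. */
    interface FieldCompressor {
        byte[] compress(byte[] raw);
        byte[] decompress(byte[] packed);
    }

    /** Deflate-based implementation with a caller-chosen level (0 to 9). */
    static final class DeflateCompressor implements FieldCompressor {
        private final int level;

        DeflateCompressor(int level) {
            this.level = level;
        }

        @Override
        public byte[] compress(byte[] raw) {
            Deflater deflater = new Deflater(level);
            try {
                deflater.setInput(raw);
                deflater.finish();
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                while (!deflater.finished()) {
                    int n = deflater.deflate(buf);
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } finally {
                deflater.end(); // release native zlib memory
            }
        }

        @Override
        public byte[] decompress(byte[] packed) {
            Inflater inflater = new Inflater();
            try {
                inflater.setInput(packed);
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                while (!inflater.finished()) {
                    int n = inflater.inflate(buf);
                    // No progress and not finished: input is truncated or unusable.
                    if (n == 0 && !inflater.finished()
                            && (inflater.needsInput() || inflater.needsDictionary())) {
                        throw new IllegalArgumentException("truncated compressed data");
                    }
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (DataFormatException e) {
                throw new IllegalArgumentException("corrupt compressed data", e);
            } finally {
                inflater.end();
            }
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("Solr stored field value. ");
        }
        byte[] raw = sb.toString().getBytes(StandardCharsets.UTF_8);

        // The level would come from configuration; these two are the extremes.
        FieldCompressor fast = new DeflateCompressor(Deflater.BEST_SPEED);
        FieldCompressor small = new DeflateCompressor(Deflater.BEST_COMPRESSION);

        byte[] packed = fast.compress(raw);
        // packed is what would be handed to Lucene as a binary stored field.
        System.out.println("raw=" + raw.length
                + " fast=" + packed.length
                + " best=" + small.compress(raw).length
                + " roundtrip=" + Arrays.equals(raw, fast.decompress(packed)));
    }
}
```

The interface is where the "pluggable" part would live: a different codec only needs to implement the same two methods, and the chosen level stays a plain constructor argument.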

-Grant
Re: Solr's use of Lucene's Compression field

Mike Klaas
Agreed.  It was the simplest thing to do at the time, but it would
definitely be preferable to offer the much faster, lower levels of
compression.

-Mike

On 3-Sep-08, at 8:57 AM, Grant Ingersoll wrote:

> [...]

Re: Solr's use of Lucene's Compression field

Mike Klaas
Also, I see that another Lucene bug (LUCENE-1374) was found relating to
compressed fields in Lucene.  (When we first added compressed field
support to Solr, a Lucene bug involving lazy-loaded fields and
compression was uncovered, too.)

It would be good to change the implementation simply to avoid relying
on a deprecated Lucene feature that isn't well exercised in development.

-Mike

On 3-Sep-08, at 11:36 AM, Mike Klaas wrote:

> [...]