Dedup results on the fly?

Dedup results on the fly?

Head
I would like to be able to tell SOLR to dedup the results based on a certain set of fields. For example, I would like to return only one instance from each set of documents that have the same 'name' and 'address'. But I would still like to keep all instances in the index in case someone wants to retrieve one of the duplicate instances by ID.

Is there some way to do something like this, maybe with a custom Comparator? Has anyone attempted to do this?
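
To make the desired behavior concrete, here is a minimal client-side sketch in Python. It is not a Solr API: the documents, their `id`, `name`, and `address` fields, and the helper `dedup_by_fields` are all hypothetical, and it simply post-processes an already-fetched result list while leaving every document retrievable by ID.

```python
from collections import OrderedDict


def dedup_by_fields(docs, fields=("name", "address")):
    """Keep the first document seen for each combination of `fields`.

    Hypothetical helper: `docs` is assumed to be a list of dicts in
    ranked order, as a client might receive them from a search response.
    """
    seen = OrderedDict()
    for doc in docs:
        # Build a composite key from the chosen fields.
        key = tuple(doc.get(f) for f in fields)
        # setdefault only stores the first document for each key.
        seen.setdefault(key, doc)
    return list(seen.values())


# Illustrative data only (not from a real index).
docs = [
    {"id": "1", "name": "Acme", "address": "1 Main St"},
    {"id": "2", "name": "Acme", "address": "1 Main St"},
    {"id": "3", "name": "Acme", "address": "9 Oak Ave"},
]

# All instances stay available for lookup by ID.
by_id = {doc["id"]: doc for doc in docs}

# Only one instance per (name, address) pair is shown in the results.
unique = dedup_by_fields(docs)
```

Deduplicating on the client only works within the rows already fetched, so paging and hit counts would be off; that is the reason for asking whether Solr can do it at query time.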

Re: Dedup results on the fly?

Sean Timm
Take a look at Field Collapsing: https://issues.apache.org/jira/browse/SOLR-236
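
As a rough illustration of what a collapsed query could look like, the Python sketch below builds a request URL using the `group`, `group.field`, and `group.limit` parameters of the result grouping feature found in later Solr releases. Those may differ from the parameters used by the SOLR-236 patch itself. The host, handler path, query, and the combined field `name_address_key` (a single indexed field holding name plus address) are assumptions made for the example.

```python
from urllib.parse import urlencode


def grouped_query_url(base_url, query, group_field, rows=10):
    """Build a Solr select URL that returns one document per group.

    Assumes result grouping is available on the server and that
    `group_field` is a single-valued indexed field.
    """
    params = [
        ("q", query),
        ("rows", rows),
        ("group", "true"),             # turn on result grouping
        ("group.field", group_field),  # collapse on this field
        ("group.limit", 1),            # one document per group
        ("wt", "json"),
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical host and field name, for illustration only.
url = grouped_query_url(
    "http://localhost:8983/solr/select", "pizza", "name_address_key"
)
```

Because collapsing happens at query time, every duplicate stays in the index and can still be fetched directly by its ID.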

-Sean

Head wrote:
> I would like to be able to tell SOLR to dedup the results based on a certain
> set of fields.   For example, I like to return only one instance of the set
> of documents that have the same 'name' and 'address'.   But I would still
> like to keep all instances around in case someone wants to retrieve one of
> the duplicate instances by ID.
>
> Is there some way to do something like this... maybe with a custom
> Comparator???   Has anyone attempted to do this?
>  

Re: Dedup results on the fly?

Alok Dhir
Is this going to go into the 1.3 tree at some point?

On Feb 27, 2008, at 3:25 PM, Sean Timm wrote:

> Take a look at https://issues.apache.org/jira/browse/SOLR-236 Field  
> Collapsing.
>
> -Sean
>
> Head wrote:
>> I would like to be able to tell SOLR to dedup the results based on a certain
>> set of fields.   For example, I like to return only one instance of the set
>> of documents that have the same 'name' and 'address'.   But I would still
>> like to keep all instances around in case someone wants to retrieve one of
>> the duplicate instances by ID.
>>
>> Is there some way to do something like this... maybe with a custom
>> Comparator???   Has anyone attempted to do this?
>>
>


Re: Dedup results on the fly?

Matthew Runo
I was going to ask the same thing. I'd support this in 1.3.

Thanks!

Matthew Runo
Software Developer
Zappos.com
702.943.7833

On Feb 27, 2008, at 12:29 PM, Alok Dhir wrote:

> is this going to go into the 1.3 tree at some point?
>
> On Feb 27, 2008, at 3:25 PM, Sean Timm wrote:
>
>> Take a look at https://issues.apache.org/jira/browse/SOLR-236 Field  
>> Collapsing.
>>
>> -Sean
>>
>> Head wrote:
>>> I would like to be able to tell SOLR to dedup the results based on a certain
>>> set of fields.   For example, I like to return only one instance of the set
>>> of documents that have the same 'name' and 'address'.   But I would still
>>> like to keep all instances around in case someone wants to retrieve one of
>>> the duplicate instances by ID.
>>>
>>> Is there some way to do something like this... maybe with a custom
>>> Comparator???   Has anyone attempted to do this?
>>>
>>
>


Re: Dedup results on the fly?

Head
Thanks Sean!

Sean Timm wrote:
> Take a look at https://issues.apache.org/jira/browse/SOLR-236 Field
> Collapsing.