search for a number within a range, where range values are mentioned in documents

Arunkumar Ayyavu
Hi!

I have a case in which an attribute (in a DB record) can contain
several ranges of numeric values. Let us say the range values in this
attribute for "record1" are
(20000-40000,5000-8000,45000-50000,454,231,1000). As you can see, this
attribute can also contain isolated numeric values such as 454, 231
and 1000. Now, I want to return "record1" if the user searches for
20001 or 5003 or 231 or 50000. Right now, I'm exploding the range
values (within a transformer) and indexing "record1" for each of the
values within a range. But this could result in an out-of-memory error
if a range is too large. Could you help me figure out a better way of
handling this type of query in Solr?

Thanks a ton.

--
Arun

Re: search for a number within a range, where range values are mentioned in documents

Jonathan Rochkind
I'm not sure you're right that it will result in an out-of-memory error
if the range is too large. I don't think it will; I think it'll be fine
as far as memory goes, because of how Lucene works. Or do you actually
have reason to believe it was causing you memory issues? Or do you just
mean memory issues in your "transformer", not actually in Solr?

Using Trie fields should also make it fine as far as CPU time goes.
Using a trie int field with a non-zero "precisionStep" should likely be
helpful in this case.

It _will_ increase the on-disk size of your indexes.
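
To get a rough feel for that growth, here is a small Python sketch that counts how many values the example attribute explodes into. The `count_exploded` helper is hypothetical, and the comma/hyphen format (with non-negative integers only) is an assumption taken from the example in the original post; none of this is part of Solr.

```python
def count_exploded(attribute: str) -> int:
    """Count the values produced by expanding every range in the attribute.

    Assumes comma-separated parts, each either "low-high" (inclusive)
    or a single non-negative integer.
    """
    total = 0
    for part in attribute.split(","):
        part = part.strip()
        if "-" in part:
            low, high = (int(x) for x in part.split("-"))
            total += high - low + 1  # inclusive range
        else:
            total += 1  # isolated value
    return total


record1 = "20000-40000,5000-8000,45000-50000,454,231,1000"
print(count_exploded(record1))  # prints 28006
```

So the six entries in the example turn into 28006 indexed values for a single record.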

I'm not sure if there's a better approach; I can't think of one, but
maybe someone else knows one.

On 12/15/2010 12:56 PM, Arunkumar Ayyavu wrote:

> Hi!
>
> I have a typical case where in an attribute (in a DB record) can
> contain different ranges of numeric values. Let us say the range
> values in this attribute for "record1" are
> (20000-40000,5000-8000,45000-50000,454,231,1000). As you can see this
> attribute can also contain isolated numeric values such as 454, 231
> and 1000. Now, I want to return "record1" if the user searches for
> 20001 or 5003 or 231 or 50000. Right now, I'm exploding the range
> values (within a transformer) and indexing "record1" for each of the
> values within a range. But this could result in out-of-memory error if
> the range is too large. Could you help me figure out a better way of
> addressing this type of queries using Solr.
>
> Thanks a ton.
>

Re: search for a number within a range, where range values are mentioned in documents

lee carroll
During data import, can you update a record with min and max fields? These
would be equal in the case of a single non-range value.

I know this is not a Solr solution but a data pre-processing one, but would
it work?

Failing the above, I've seen in the docs a reference to a compound value field
(in the context of points, i.e. point = lat, lon), which would be a nice way to
store your range fields, although I still think you will need to pre-process
your data.
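
The min/max pre-processing described above might look roughly like the following Python sketch. The function names `parse_ranges` and `matches` are hypothetical, and the input format (comma-separated parts, hyphenated ranges, non-negative integers) is assumed from the example in the original post.

```python
from typing import List, Tuple


def parse_ranges(attribute: str) -> List[Tuple[int, int]]:
    """Turn "20000-40000,454" into [(20000, 40000), (454, 454)]."""
    pairs = []
    for part in attribute.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            low, high = (int(x) for x in part.split("-"))
        else:
            low = high = int(part)  # single value: min equals max
        pairs.append((low, high))
    return pairs


def matches(pairs: List[Tuple[int, int]], value: int) -> bool:
    """True if value falls inside any (min, max) pair, bounds included."""
    return any(low <= value <= high for low, high in pairs)


record1 = parse_ranges("20000-40000,5000-8000,45000-50000,454,231,1000")
for query in (20001, 5003, 231, 50000, 9000):
    print(query, matches(record1, query))
```

The first four queries match and 9000 does not. Since "record1" carries several ranges, each min would have to stay associated with its own max in the index; indexing one entry per pair is one possibility, though that goes beyond what this thread states.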

cheers lee

On 15 December 2010 18:22, Jonathan Rochkind <[hidden email]> wrote:

> I'm not sure you're right that it will result in an out-of-memory error if
> the range is too large. I don't think it will, I think it'll be fine as far
> as memory goes, because of how Lucene works. Or do you actually have reason
> to believe it was causing you memory issues?  Or do you just mean memory
> issues in your "transformer", not actually in Solr?
>
> Using Trie fields should also make it fine as far as CPU time goes.  Using
> a trie int field with a non-zero "precision" should likely be helpful in
> this case.
>
> It _will_ increase the on-disk size of your indexes.
>
> I'm not sure if there's a better approach, i can't think of one, but maybe
> someone else knows one.