Indexing Numeric value in Lucene 4.10.4

aravinth thangasami
Hi all,

I'm searching on a numeric value and will not run range queries on that field.
I thought of indexing it as a String field instead of a NumericField,
so that indexing is faster because no numeric trie terms are generated.

What are your opinions on this?


Kind regards,
Aravinth

Re: Indexing Numeric value in Lucene 4.10.4

Erick Erickson
bq: What are your opinions on this?

That this is not a sound approach. Why do you think Trie is expensive?
What evidence do you have for that? Strings are significantly more
expensive than numeric fields. Plus, you can adjust the
precision step to reduce the "overhead" of a trie field.
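
As a rough illustration of the precision-step point: a trie-encoded numeric field indexes one term per precision-step-sized slice of the value's bits, so a larger step means fewer terms per value. The sketch below only models that count in plain Java. The class and method names are hypothetical, and it does not call Lucene itself.

```java
/** Models how many terms a trie-encoded numeric value produces (hypothetical helper, not Lucene API). */
class TrieTermCount {

    /**
     * @param valueBits     width of the value in bits (32 for int/float, 64 for long/double)
     * @param precisionStep number of bits dropped for each additional, less precise term
     */
    static int termsPerValue(int valueBits, int precisionStep) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        int count = 0;
        // One term per shift value 0, step, 2*step, ... below the bit width.
        // A long counter avoids overflow when precisionStep is Integer.MAX_VALUE.
        for (long shift = 0; shift < valueBits; shift += precisionStep) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(termsPerValue(64, 4));                 // 16 terms
        System.out.println(termsPerValue(64, 16));                // 4 terms
        System.out.println(termsPerValue(64, Integer.MAX_VALUE)); // 1 term
    }
}
```

With a step at least as large as the value width, each value contributes a single term, which is about the same indexing work as a single string term.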

I very strongly doubt that the index would be smaller with strings.
I'm certain comparisons would be slower. I really can't come up with
much of any reason why strings would be better.

Not to mention that sorting won't work unless you left-pad with zeros.
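
To make the sorting remark concrete, here is a minimal plain-Java sketch (the class name, method names, and sample values are made up for illustration). It shows that numbers compared as strings sort lexicographically, and that left-padding with zeros to a fixed width restores numeric order for non-negative values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class PaddedSort {

    /** Left-pads a non-negative number with zeros to a fixed width. */
    static String pad(long value, int width) {
        return String.format("%0" + width + "d", value);
    }

    /** Sorts the values by their string form; a width of 0 means no padding. */
    static List<String> sortAsStrings(long[] values, int width) {
        List<String> terms = new ArrayList<>();
        for (long v : values) {
            terms.add(width > 0 ? pad(v, width) : Long.toString(v));
        }
        Collections.sort(terms); // lexicographic (string) order
        return terms;
    }

    public static void main(String[] args) {
        long[] values = {9, 10, 100};
        System.out.println(sortAsStrings(values, 0)); // [10, 100, 9]
        System.out.println(sortAsStrings(values, 5)); // [00009, 00010, 00100]
    }
}
```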

Best,
Erick

On Thu, Apr 6, 2017 at 6:32 AM, aravinth thangasami
<[hidden email]> wrote:

> Hi all,
>
> I'm searching on a numeric value and will not run range queries on that field.
> I thought of indexing it as a String field instead of a NumericField,
> so that indexing is faster because no numeric trie terms are generated.
>
> What are your opinions on this?
>
>
> Kind regards,
> Aravinth

Re: Indexing Numeric value in Lucene 4.10.4

aravinth thangasami
We don't have to sort on that field,
which is why we thought of that approach.

Thanks for your opinion.
We will consider adjusting the precision step.

Kind regards,
Aravinth


On Thu, Apr 6, 2017 at 8:51 PM, Erick Erickson <[hidden email]>
wrote:

> bq: What are your opinions on this?
>
> That this is not a sound approach. Why do you think Trie is expensive?
> What evidence do you have for that? Strings are significantly more
> expensive than numeric fields. Plus, you can adjust the
> precision step to reduce the "overhead" of a trie field.
>
> I very strongly doubt that the index would be smaller with strings.
> I'm certain comparisons would be slower. I really can't come up with
> much of any reason why strings would be better.
>
> Not to mention that sorting won't work unless you left-pad with zeros.
>
> Best,
> Erick

RE: Indexing Numeric value in Lucene 4.10.4

Uwe Schindler
Hi,

That's much easier. If you don't want to do ranges, use precisionStep=Integer.MAX_VALUE. If you want to sort, an additional docvalues field should be used, but that's not what you intend.
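
A minimal sketch of what this could look like against the Lucene 4.10 API, assuming lucene-core 4.10.4 is on the classpath. The class name and the field name "id" are hypothetical, and the doc values field is only there to illustrate the sorting case mentioned above.

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.LongField;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;

class ExactMatchNumericField {

    // Field type that indexes a single term per value: no extra trie terms.
    static final FieldType EXACT_LONG = new FieldType(LongField.TYPE_NOT_STORED);
    static {
        EXACT_LONG.setNumericPrecisionStep(Integer.MAX_VALUE);
        EXACT_LONG.freeze();
    }

    /** Builds a document; "id" is a hypothetical field name. */
    static Document buildDocument(long id, boolean sortable) {
        Document doc = new Document();
        doc.add(new LongField("id", id, EXACT_LONG));
        if (sortable) {
            // Only needed when sorting on the field is required.
            doc.add(new NumericDocValuesField("id", id));
        }
        return doc;
    }

    /** Exact-match query; the precision step should match the one used at index time. */
    static Query exactMatch(long id) {
        return NumericRangeQuery.newLongRange("id", Integer.MAX_VALUE, id, id, true, true);
    }
}
```

An exact match is expressed here as a range whose lower and upper bounds are the same value, both inclusive.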

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: [hidden email]

> -----Original Message-----
> From: aravinth thangasami [mailto:[hidden email]]
> Sent: Friday, April 7, 2017 8:54 AM
> To: [hidden email]
> Subject: Re: Indexing Numeric value in Lucene 4.10.4
>
> We don't have to sort on that field,
> which is why we thought of that approach.
>
> Thanks for your opinion.
> We will consider adjusting the precision step.
>
> Kind regards,
> Aravinth

Re: Indexing Numeric value in Lucene 4.10.4

aravinth thangasami
On Fri, 7 Apr 2017 at 1:44 PM, Uwe Schindler <[hidden email]> wrote:

> Hi,
>
> That's much easier. If you don't want to do ranges, use
> precisionStep=Integer.MAX_VALUE. If you want to sort, an additional
> docvalues field should be used, but that's not what you intend.
>
> Uwe

Thanks Uwe