Regarding behavior of docValues.

Modassar Ather-2
Hi,

Kindly help me understand the behavior of the following field.

<field name="manu_exact" type="string" indexed="true" stored="false"
docValues="true" />

For a field like above where indexed="true" and docValues="true", is it
that:
 1) For sorting/faceting on *manu_exact* the docValues will be used.
 2) For querying on *manu_exact* the inverted index will be used.

Thanks,
Modassar

Re: Regarding behavior of docValues.

Mikhail Khludnev
Both statements seem true to me.

On Tue, Feb 24, 2015 at 2:49 PM, Modassar Ather <[hidden email]>
wrote:

> Hi,
>
> Kindly help me understand the behavior of following field.
>
> <field name="manu_exact" type="string" indexed="true" stored="false"
> docValues="true" />
>
> For a field like above where indexed="true" and docValues="true", is it
> that:
>  1) For sorting/faceting on *manu_exact* the docValues will be used.
>  2) For querying on *manu_exact* the inverted index will be used.
>
> Thanks,
> Modassar
>



--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<[hidden email]>

Re: Regarding behavior of docValues.

Modassar Ather-2
Thanks for your response Mikhail.

On Tue, Feb 24, 2015 at 5:35 PM, Mikhail Khludnev <
[hidden email]> wrote:

> Both statements seem true to me.

Re: Regarding behavior of docValues.

Erick Erickson
Hmmm, that's not my understanding. docValues are simply a different layout
for storing the _indexed_ values that facilitates rapid loading of the field
from disk, essentially putting the uninverted field value in a
conveniently loadable form.

So AFAIK, the field is stored only once and used for all three: sorting,
faceting and searching.
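
To make "inverted" versus "uninverted" concrete, here is a toy Python sketch of the two shapes of data being discussed. It is only an illustration: the sample values, the helper names build_inverted_index and build_doc_values, and the dense 0..n-1 doc ids are all assumptions, and it does not model how Lucene or Solr actually store or share either structure.

```python
from collections import Counter, defaultdict

# Toy corpus: doc id -> value of a single-valued string field (made-up data).
docs = {0: "Belkin", 1: "Apple", 2: "Belkin", 3: "Samsung"}


def build_inverted_index(docs):
    """Inverted layout: term -> list of doc ids that contain the term."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        index[docs[doc_id]].append(doc_id)
    return dict(index)


def build_doc_values(docs):
    """Uninverted (column-style) layout: position i holds the value of doc i.

    Assumes doc ids are dense and start at 0, as in this toy corpus.
    """
    return [docs[doc_id] for doc_id in sorted(docs)]


inverted = build_inverted_index(docs)
doc_values = build_doc_values(docs)

# Searching asks "which docs contain this term?" (term -> docs).
matches = inverted.get("Belkin", [])

# Sorting asks "what is the value of each doc?" (doc -> value).
sorted_ids = sorted(docs, key=lambda doc_id: doc_values[doc_id])

# Faceting counts values across a set of docs (again doc -> value).
facets = Counter(doc_values[doc_id] for doc_id in docs)

print(matches, sorted_ids, dict(facets))
```

The sketch only shows why a term-to-docs mapping suits lookups by term, while a doc-to-value mapping suits per-document access such as sorting and counting.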

Best,
Erick

On Tue, Feb 24, 2015 at 4:13 AM, Modassar Ather <[hidden email]> wrote:

> Thanks for your response Mikhail.

Re: Regarding behavior of docValues.

Modassar Ather-2
So, for a field that is used for sorting, faceting and searching, which
field definition is better?

Can it be a single field:

<field name="manu_exact" type="string" indexed="true" stored="false" docValues="true" />

or two fields, one for searching and one for sorting and faceting, like the
following?

<field name="manu_exact" type="string" indexed="true" stored="false" />
<field name="manu_exact_sort" type="string" indexed="false" stored="false" docValues="true" />

Note that it would be better if I could keep using the existing field for
sorting and faceting and add searching on it, as in the first example above.
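
For reference, the three operations in question (searching, sorting and faceting on manu_exact) correspond to the standard Solr request parameters q, sort, facet and facet.field. The Python sketch below only builds such a query string and sends nothing; the base URL, the core name "mycore", the helper name build_select_url and the sample value "Belkin" are placeholders, not taken from any real setup.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical endpoint: host, port and core name are placeholders.
BASE_URL = "http://localhost:8983/solr/mycore/select"


def build_select_url(field, value):
    """Build a select URL that searches, sorts and facets on one field."""
    params = [
        ("q", f'{field}:"{value}"'),   # search on the field
        ("sort", f"{field} asc"),      # sort on the same field
        ("facet", "true"),             # enable faceting
        ("facet.field", field),        # facet on the same field
    ]
    return BASE_URL + "?" + urlencode(params)


url = build_select_url("manu_exact", "Belkin")

# Decode the query string again to check what would be sent.
decoded = parse_qs(url.split("?", 1)[1])
print(url)
print(decoded)
```

Whether one field definition can serve all three parameters is exactly the question above; the sketch only shows the request side.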

Regards,
Modassar

On Tue, Feb 24, 2015 at 11:15 PM, Erick Erickson <[hidden email]>
wrote:

> Hmmm, that's not my understanding. docValues are simply a different layout
> for storing the _indexed_ values that facilitates rapid loading of the field
> from disk, essentially putting the uninverted field value in a
> conveniently loadable form.
>
> So AFAIK, the field is stored only once and used for all three: sorting,
> faceting and searching.
>
> Best,
> Erick

Re: Regarding behavior of docValues.

Erick Erickson
You're making it too complicated. Both a docValues field and
an indexed (not docValues) field will give you the same
functionality. For rapidly changing indexes, docValues will
load more quickly when a new searcher is opened.

Your question below is not really relevant.
****
Can it be
<field name="manu_exact" type="string" indexed="true" stored="false" docValues="true" />
or
Two fields each for sorting+faceting and for searching like following.

<field name="manu_exact" type="string" indexed="true" stored="false" />
<field name="manu_exact_sort" type="string" indexed="false" stored="false" docValues="true" />
****
You simply cannot sort, search, or facet on any field for which
indexed="false". You can do all three on any field where
indexed="true" (assuming it's not multiValued and only has one token
since sorting only really makes sense for single-valued fields).

It doesn't matter whether the field is docValues="true" or not.
So if you want a "rule of thumb", make it a docValues field
if you're updating your index rapidly. Otherwise whether a field is
docValues or not is largely irrelevant.

Best,
Erick

On Tue, Feb 24, 2015 at 9:09 PM, Modassar Ather <[hidden email]> wrote:

> So for a requirement where I have a field which is used for sorting,
> faceting and searching what should be the better field definition.
>
> Can it be
> <field name="manu_exact" type="string" indexed="true" stored="false" docValues="true" />
> or
> Two fields each for sorting+faceting and for searching like following.
>
> <field name="manu_exact" type="string" indexed="true" stored="false" />
> <field name="manu_exact_sort" type="string" indexed="false" stored="false" docValues="true" />
>
> Kindly note that it will be better if can use existing field for sorting,
> faceting and add searching on it like in example one above.
>
> Regards,
> Modassar

Re: Regarding behavior of docValues.

Modassar Ather-2
Thanks Erick for your detailed response.

Sorry, I forgot to mention that I was trying to understand this in the
context of Solr 5.0.0, where the FieldCache is no longer available.

Regards,
Modassar

On Wed, Feb 25, 2015 at 11:26 AM, Erick Erickson <[hidden email]>
wrote:

> You're making it too complicated. Both a docValues field and
> an indexed (not docValues) field will give you the same
> functionality. For rapidly changing indexes, docValues will
> load more quickly when a new searcher is opened.
>
> Your question below is not really relevant.
> ****
> Can it be
> <field name="manu_exact" type="string" indexed="true" stored="false" docValues="true" />
> or
> Two fields each for sorting+faceting and for searching like following.
>
> <field name="manu_exact" type="string" indexed="true" stored="false" />
> <field name="manu_exact_sort" type="string" indexed="false" stored="false" docValues="true" />
> ****
> You simply cannot sort, search, or facet on any field for which
> indexed="false". You can do all three on any field where
> indexed="true" (assuming it's not multiValued and only has one token
> since sorting only really makes sense for single-valued fields).
>
> It doesn't matter whether the field is docValues="true" or not.
> So if you want a "rule of thumb", make it a docValues field
> if you're updating your index rapidly. Otherwise whether a field is
> docValues or not is largely irrelevant.
>
> Best,
> Erick