Analysing Multivalued Fields

sidharth228
Hi,

Is there a way to analyze how multiple values in a multivalued field are
being tokenized and processed during indexing?

The "Analysis" page in the UI treats my multiple comma-separated
values as a single value. It filters out the commas and acts as if I had
specified just one value.

Thanks in advance!

Re: Analysing Multivalued Fields

Erick Erickson
First, if you’re using primitive types, there is no analysis, so in that case the question is irrelevant.

If you’re using a text-based (analyzed) field, the only difference between single-valued and multi-valued fields is the position gap recorded between entries. For instance:

Single value
<field ...>this is some text</field>

position  token
0         this
1         is
2         some
3         text

Multi-valued with positionIncrementGap=100
<field ...>this is</field>
<field ...>some text</field>

position  token
0         this
1         is
101       some
102       text
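
To make the numbering above concrete, here is a minimal Python sketch that reproduces it. It is not Lucene's or Solr's code: the function name `assign_positions` is made up for illustration, tokens are split on whitespace only, and the gap is applied exactly as in the tables above, so the positions a real index assigns may differ slightly.

```python
def assign_positions(values, position_increment_gap):
    """Assign token positions to the values of a multi-valued field.

    Hypothetical helper that follows the simplified numbering in the
    example above; it is not taken from Lucene or Solr.
    """
    positions = []
    last = None  # position of the most recent token; None before the first
    for index, value in enumerate(values):
        for n, token in enumerate(value.split()):
            if last is None:
                pos = 0  # very first token of the field
            elif n == 0 and index > 0:
                # first token of a later value: jump by the gap
                pos = last + position_increment_gap
            else:
                pos = last + 1  # next token within the same value
            positions.append((pos, token))
            last = pos
    return positions


single = assign_positions(["this is some text"], 100)
multi = assign_positions(["this is", "some text"], 100)
for pos, token in multi:
    print(pos, token)
```

The gap keeps "is" and "some" far apart, so a phrase query for "is some" does not match across two separate values.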

With a positionIncrementGap of 1, there’d be no difference. So if you’re using text-based fields, just run the values through the Analysis page one at a time.
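
If doing this by hand gets tedious, one option is to script it. The sketch below only builds one request URL per value. It assumes your Solr version exposes the field analysis handler at `/analysis/field` with the `analysis.fieldname` and `analysis.fieldvalue` parameters, which you should verify against your own installation. The host, collection name (`mycollection`) and field name (`tags`) are placeholders, and `field_analysis_urls` is a made-up helper name.

```python
from urllib.parse import urlencode


def field_analysis_urls(base_url, collection, field_name, values):
    """Build one field-analysis URL per value (hypothetical helper).

    Assumes the /analysis/field handler is available on the collection.
    """
    urls = []
    for value in values:
        query = urlencode({
            "analysis.fieldname": field_name,
            "analysis.fieldvalue": value,  # a single value per request
            "wt": "json",
        })
        urls.append(f"{base_url}/{collection}/analysis/field?{query}")
    return urls


# Split the comma-separated input into the individual field values first.
values = [v.strip() for v in "this is,some text".split(",")]
urls = field_analysis_urls(
    "http://localhost:8983/solr", "mycollection", "tags", values
)
for url in urls:
    print(url)
```

Each URL can then be fetched with any HTTP client, and the response inspected per value, just as the Analysis page would show it.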

Or this may be an XY problem. If the above is irrelevant, what is the underlying problem you’re trying to solve?

Best,
Erick

> On Dec 31, 2019, at 1:32 AM, Sidharth Negi <[hidden email]> wrote: