solr filter query on text field

solr filter query on text field

weiwang19
Hi,

I am running a filter query on a field of type text_general and see
completely different results for the following queries:

   fq=my_text_field:"Jurassic park the movie"     returns 0 results

   fq=my_text_field:(Jurassic park the movie)     returns 20 results

   fq=my_text_field:Jurassic park the movie       returns thousands of results


Which one is the correct syntax? I am confused about why the first query
doesn't match anything at all. I also thought 2 and 3 were the same, but
they turn out to be quite different.


Thanks,
Wei

Re: solr filter query on text field

Erick Erickson
1> is looking for the _phrase_, so the four tokens "jurassic" "park"
"the" "movie" have to appear next to each other in that order.

2> is looking for the four tokens anywhere in the field. Whether they
_all_ must appear depends on the default operator (OR or AND).

3> is parsed as my_text_field:Jurassic default_text_field:park
default_text_field:the default_text_field:movie.

Adding &debug=query to your query will show you what the parsed query
looks like and help answer these kinds of questions.
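
To make the debug=query suggestion concrete, here is a minimal sketch that builds the three request URLs with the debug parameter added. The host, the collection name (my_collection) and the q=*:* / rows=0 choices are assumptions for illustration only; the parsed form of each filter query should then show up in the debug section of the response.

```python
from urllib.parse import urlencode

# Hypothetical host and collection name; substitute your own.
BASE_URL = "http://localhost:8983/solr/my_collection/select"

def debug_url(fq, q="*:*"):
    """Build a select URL that asks Solr to include the parsed query."""
    params = {"q": q, "fq": fq, "debug": "query", "rows": 0}
    return BASE_URL + "?" + urlencode(params)

for fq in ('my_text_field:"Jurassic park the movie"',
           'my_text_field:(Jurassic park the movie)',
           'my_text_field:Jurassic park the movie'):
    print(debug_url(fq))
```

urlencode takes care of escaping the quotes, colons, parentheses and spaces, which is easy to get wrong when pasting these queries into a browser by hand.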

Best,
Erick

On Wed, Jul 11, 2018 at 8:54 AM, Wei <[hidden email]> wrote:

Re: solr filter query on text field

Andrea Gazzarini-6
In reply to this post by weiwang19
The syntax is valid in all three examples; the right one depends on
what you need.

The first query executes a proximity search (for simplicity, you can think
of it as a phrase search), so it probably returns no results because you
don't have any documents that match that whole literal.
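
As a rough illustration of what "phrase" means here, the sketch below checks whether the query tokens occur next to each other and in order in a document's token list. It is a toy model, not Lucene's implementation: it ignores analysis, stop words and phrase slop, and the sample documents are made up.

```python
def contains_phrase(doc_tokens, phrase_tokens):
    """True if phrase_tokens occur consecutively, in order, in doc_tokens."""
    n = len(phrase_tokens)
    return any(doc_tokens[i:i + n] == phrase_tokens
               for i in range(len(doc_tokens) - n + 1))

phrase = "jurassic park the movie".split()
doc_a = "jurassic park the movie soundtrack".split()  # made-up sample
doc_b = "the movie jurassic park".split()             # same words, other order

print(contains_phrase(doc_a, phrase))
print(contains_phrase(doc_b, phrase))
```

Only doc_a passes: doc_b contains all four words but not as a contiguous run, which is the kind of document the second query can match and the first cannot.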

The second queries my_text_field for all the terms that make up the
value between the parentheses. You can think of it as a query where each
term is an optional clause, something like my_text_field:jurassic OR
my_text_field:park... (it's not exactly an OR, but this should give you
the idea).

The third example is not doing what you think. my_text_field is used only
with the first term (Jurassic), while the others use the default
field. Something like my_text_field:jurassic OR defaultfield:park OR
defaultfield:the.... That's why you get so many results (I guess
the default field is a catch-all field).
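
The field-scoping rule behind the three explanations above can be sketched with a small toy splitter. This is not Solr's query parser, only a regex that mimics the one rule discussed here: a "field:" prefix applies to the single term, quoted phrase or parenthesised group that immediately follows it. The name defaultfield is a placeholder for whatever default field the request handler uses.

```python
import re

# One clause: optional "field:" prefix, then a quoted phrase, a
# parenthesised group, or a single bare term.
CLAUSE = re.compile(r'(?:(\w+):)?("[^"]*"|\([^)]*\)|\S+)')

def split_clauses(query, default_field="defaultfield"):
    """Return (field, text) pairs showing which field each clause targets."""
    return [(field or default_field, text)
            for field, text in CLAUSE.findall(query)]

for query in ('my_text_field:"Jurassic park the movie"',
              'my_text_field:(Jurassic park the movie)',
              'my_text_field:Jurassic park the movie'):
    print(split_clauses(query))
```

The first two queries come back as a single clause on my_text_field, while the third splits into four clauses, three of which fall through to the default field.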

Sorry for typos I'm using my mobile

Andrea

Re: solr filter query on text field

weiwang19
Thanks Erick and Andrea! If my default operator is OR, is
fq=my_text_field:(Jurassic park the movie) equivalent to
my_text_field:(Jurassic OR park OR the OR movie)? That makes sense.
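
For what it's worth, the effect of the default operator on a parenthesised group can be pictured with a toy matcher over token sets. This is only a simplified model (no analysis, no stop-word handling, and real optional clauses are "not exactly an OR" as noted earlier in the thread); the sample document is made up.

```python
def group_matches(doc_tokens, query_terms, default_operator="OR"):
    """Toy model: OR needs any term present, AND needs every term present."""
    doc = set(doc_tokens)
    hits = [term in doc for term in query_terms]
    return all(hits) if default_operator == "AND" else any(hits)

terms = "jurassic park the movie".split()
doc = "jurassic world the sequel".split()  # made-up sample document

print(group_matches(doc, terms, "OR"))
print(group_matches(doc, terms, "AND"))
```

With OR the sample document matches because it contains "jurassic" and "the"; with AND it does not, since "park" and "movie" are missing.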

On Wed, Jul 11, 2018 at 9:06 AM, Andrea Gazzarini <[hidden email]>
wrote:

Re: solr filter query on text field

weiwang19
btw, is there any difference if the fq field is a string field vs. a text
field?

On Wed, Jul 11, 2018 at 11:59 AM, Wei <[hidden email]> wrote:

Re: solr filter query on text field

Erick Erickson
> is there any difference if the fq field is a string field vs text field?

Absolutely. String fields are not analyzed in any way. They're not
tokenized, they are case sensitive, etc. For example, take
   My dog.
as input. A string field will have a _single_ token, "My dog.". It will
not match a search on "my". It will not match a search on "dog". It
won't even match "my dog." as a phrase, since the case is different. It
won't even match "My dog", because there's no period at the end. It
will only match "My dog.".

As a text field, there would be two tokens, "My" and "dog", and they'd
be massaged however your filters arrange things. With the usual
filters in place (LowerCaseFilter in particular), the tokens in the
index would be "my" and "dog", so searches on "my" would match, "My"
would match, and "dog" OR "my" would match.
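
The difference can be mimicked in a few lines. The sketch below is a rough stand-in, not Solr's analysis chain: the "text" side simply splits on non-word characters and lowercases, which approximates a standard tokenizer plus a lowercase filter, and the function names are made up for illustration.

```python
import re

def index_as_string(value):
    """String field: the whole value is kept verbatim as a single token."""
    return [value]

def index_as_text(value):
    """Rough stand-in for a text field: split on non-word chars, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", value)]

def string_field_matches(value, query):
    # No analysis on either side: the query must equal the stored token.
    return query in index_as_string(value)

def text_field_matches(value, term):
    # The query term goes through the same lowercasing as the indexed tokens.
    return term.lower() in index_as_text(value)

print(index_as_string("My dog."), index_as_text("My dog."))
for query in ("my", "dog", "My dog", "My dog."):
    print(query, string_field_matches("My dog.", query))
for term in ("my", "My", "dog"):
    print(term, text_field_matches("My dog.", term))
```

Only the exact value "My dog." matches on the string side, while all three single-term searches match on the text side.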

Best,
Erick

On Wed, Jul 11, 2018 at 12:01 PM, Wei <[hidden email]> wrote:
