Query parser problem, using fuzzy search

David Frese
Hello everybody,

how can I formulate a fuzzy query that works for an arbitrary string?
Or is there a formal syntax definition somewhere?

I already found out by hand that

field:"val"~2

is accepted by the parser, but the fuzziness seems to get lost. So I write

field:val~2

Now if val contains spaces or other special characters, I can escape them:

field:my\ val~2

But now I'm stuck with the term AND:

field:AND~2

Note that I do not want a boolean expression here, but I want to match
the string AND! But the parser complains:

"org.apache.solr.search.SyntaxError: Cannot parse 'field:AND~2':
Encountered \" <AND> \"AND \"\" at line 1, column 4.\nWas expecting one
of:\n    <BAREOPER> ...\n    \"(\" ...\n    \"*\" ...\n    <QUOTED>
...\n    <TERM> ...\n    <PREFIXTERM> ...\n    <WILDTERM> ...\n
<REGEXPTERM> ...\n    \"[\" ...\n    \"{\" ...\n    <LPARAMS> ...\n
\"filter(\" ...\n    <NUMBER> ...\n    ",


Thanks for any hints and help.

--
David Frese
+49 7071 70896 75

Active Group GmbH
Hechinger Str. 12/1, 72072 Tübingen
Registergericht: Amtsgericht Stuttgart, HRB 224404
Geschäftsführer: Dr. Michael Sperber

Re: Query parser problem, using fuzzy search

Erick Erickson
Try searching with the word "and" in lowercase. Somehow you have to allow
the parser to distinguish the two.

You _might_ be able to try "AND~2" (with quotes) to see if you can get
that through the parser. Kind of a hack, but....

There's also a parameter (depending on the parser) about lowercasing
operators, so if and~2 doesn't work, check that.

On Mon, Jan 29, 2018 at 8:32 AM, David Frese
<[hidden email]> wrote:

> But now I'm stuck with the term AND:
>
> field:AND~2

Re: Query parser problem, using fuzzy search

David Frese
On 29.01.18 at 18:05, Erick Erickson wrote:
> Try searching with lowercase the word and. Somehow you have to allow
> the parser to distinguish the two.

Oh yeah, the biggest unsolved problem in the ~80-year history of
programming languages... NOT ;-)

> You _might_ be able to try "AND~2" (with quotes) to see if you can get
> that through the parser. Kind of a hack, but....

Well, the parser accepts that, but then it's no longer a fuzzy search.

> There's also a parameter (depending on the parser) about lowercasing
> operators, so if and~2 doesn't work, check that.

And if both appear?

Well, thanks for your ideas - of course you are not the one to blame.





Re: Query parser problem, using fuzzy search

David Frese
On 31.01.18 at 16:30, David Frese wrote:

> On 29.01.18 at 18:05, Erick Erickson wrote:
>> Try searching with lowercase the word and. Somehow you have to allow
>> the parser to distinguish the two.
>
> Oh yeah, the biggest unsolved problem in the ~80 years history of
> programming languages... NOT ;-)
>
>> You _might_ be able to try "AND~2" (with quotes) to see if you can get
>> that through the parser. Kind of a hack, but....
>
> Well, the parser swallows that, but it's not a fuzzy search then anymore.
>
>> There's also a parameter (depending on the parser) about lowercasing
>> operators, so if and~2 doesn't work, check that.
>
> And if both appear?
>
> Well, thanks for your ideas - of course you are not the one to blame.
>

In case anybody runs into the same problem, I found a workaround:

field:\AND~1

will find documents with field values similar to "AND".
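
For building such clauses from arbitrary strings, the escaping steps in this thread can be combined into one small helper. The following is only a sketch in Python under stated assumptions: the function name fuzzy_clause is hypothetical, the set of escaped characters is my reading of the special characters of the standard query parser rather than an authoritative list, and treating OR and NOT the same way as AND is an assumption by analogy (only \AND is confirmed above).

```python
import re

# Characters assumed to be special to the Lucene/Solr standard query parser.
# Whitespace is included so that a multi-word value stays a single term.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|;\s])')

# Bare words the parser reads as boolean operators (OR and NOT are assumed
# to behave like AND here).
_OPERATORS = {"AND", "OR", "NOT"}


def fuzzy_clause(field: str, value: str, max_edits: int = 2) -> str:
    """Build a `field:term~N` clause with the term escaped for the parser."""
    if not value:
        raise ValueError("value must not be empty")
    # Lucene fuzzy queries support an edit distance of at most 2.
    if not 0 <= max_edits <= 2:
        raise ValueError("max_edits must be 0, 1 or 2")
    # Put a backslash in front of every special character.
    term = _SPECIAL.sub(r"\\\1", value)
    if value in _OPERATORS:
        # Same trick as field:\AND~1: escape the first character so the
        # word is no longer tokenized as an operator.
        term = "\\" + term
    return f"{field}:{term}~{max_edits}"


if __name__ == "__main__":
    print(fuzzy_clause("field", "my val"))   # field:my\ val~2
    print(fuzzy_clause("field", "AND", 1))   # field:\AND~1
```

The term is deliberately left unquoted: as noted earlier in the thread, the quoted form is accepted by the parser but is no longer a fuzzy search.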


