SOLR fuzzy search not behaving as expected when analysers are used

Razvan Serban
Hello everyone,

I am using the fuzzy search capability of Solr 8.7, and I have dug into a specific case in which the search misbehaves.

I am using this analyzer (shown here as JSON) on the field that I search against:

        "analyzer" : {
            "filters":[
                {
                    "class":"solr.ASCIIFoldingFilterFactory",
                    "preserveOriginal":"false"
                },
                {
                    "class":"solr.LowerCaseFilterFactory"
                },
                {
                    "class":"solr.PatternReplaceCharFilterFactory",
                    "replacement":"",
                    "pattern":"[^A-Za-z0-9]"
                }
            ],
            "tokenizer": {
                "class":"solr.KeywordTokenizerFactory"
            }
        }

If the field has, say, the value

abcdefghi

then it matches a search for

a.b.c.d.e.f.g.h.i

This works because the PatternReplaceCharFilterFactory discards the dots.
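As a rough illustration of what the analyzer is expected to do to a value, here is a minimal Python sketch. It is not Solr code: `normalize` and `fold_ascii` are hypothetical helpers, the ASCII folding is only an approximation of ASCIIFoldingFilter, and the order in which Solr really runs these stages may differ from the order used here. The sample inputs are plain ASCII, so the order does not change the result.

```python
import re
import unicodedata

# Same pattern as in the analyzer config above.
NON_ALNUM = re.compile(r"[^A-Za-z0-9]")


def fold_ascii(text: str) -> str:
    """Rough stand-in for ASCII folding: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize(text: str) -> str:
    """Hypothetical approximation of the chain applied to one keyword token."""
    return NON_ALNUM.sub("", fold_ascii(text).lower())


print(normalize("A.B-c_d 1"))  # abcd1
print(normalize("a.b.c"))      # abc
```

If both the indexed value and the query term pass through the same normalization, punctuation such as dots cannot affect the comparison.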

The problem appears when I use fuzzy search instead of a normal search. The search term then looks like this (the ~2 at the end sets a maximum edit distance of 2):

a.b.c.d.e.f.g.i~2

This query never matches the original value without dots.
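To put numbers on this, the Python sketch below computes the plain Levenshtein distance between the indexed value and the query term in two cases: with the dots still present, and with the dots already stripped. It is only arithmetic, not a statement about what Solr does internally. `levenshtein` is a hypothetical helper, and Solr's own fuzzy matching may count edits slightly differently (for example, by treating a transposition as a single edit).

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


indexed = "abcdefghi"

# Query term with its dots left in place.
print(levenshtein("a.b.c.d.e.f.g.i", indexed))  # 7

# Query term after the dots have been stripped.
print(levenshtein("abcdefgi", indexed))         # 1
```

A distance of 7 is far beyond the limit of 2, while a distance of 1 is within it, so the observed behaviour is what one would see if the dots were still in the term when the fuzzy comparison happens.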

Why is that? I suspected that the filters are not applied when a fuzzy query runs, but the lowercase and ASCII folding filters work as intended.