Filters and data cleansing


Ken Wiltshire
Hello experts,

I have what is probably a simple question. Feels like it should be. I
have some filters set up at index time, let's say LowerCaseFilterFactory
for instance. I understand the data will be indexed as lowercase, but when
I query this same data it is still in its original form. This works for
most cases, but I'd also like the strings in the query response to be
filtered, so that the data returned is scrubbed. Example: I index "á" and
want a query for this document to return "a", using
ASCIIFoldingFilterFactory. I can see in the analysis that the filter folds
the characters accordingly, but when I retrieve the data via a query it is
still in its original form.

Any help is appreciated.

Best,
K

____________________
Ken Wiltshire
*VP of Technology*
Shoppable®  <http://www.shoppable.com/>
139 Fulton St.
New York, NY 10038
347 675 5213
Re: Filters and data cleansing

Emir Arnautović
Hi Ken,
What Solr returns is the stored value, which is the original value. Analysis is applied to produce the indexed terms, and those are used only for searching. To get what you want, you have to move the analysis at least one step earlier. You can move it into an update request processor chain, where you apply the analysis to a document field and alter the input document, or you can move it entirely to the client side and apply the analysis before constructing the document that is sent to Solr.
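
As a rough illustration of the client-side option, here is a minimal Python sketch that scrubs a value before the document is built. It is only an approximation: it lowercases and strips combining accents via Unicode decomposition, which covers "á" -> "a" but is not identical to what LowerCaseFilterFactory plus ASCIIFoldingFilterFactory produce (for example, it leaves characters such as "ß" or "ø" unchanged). The function names and the "id" and "title" fields are hypothetical, chosen only for the example.

```python
import unicodedata


def scrub(text: str) -> str:
    """Lowercase text and drop combining accent marks.

    Approximates, but does not reproduce exactly, Solr's lowercase
    and ASCII-folding filters.
    """
    # NFKD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep only characters that are not combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def build_doc(doc_id: str, title: str) -> dict:
    """Build the document to send to Solr with the title already scrubbed.

    The field names "id" and "title" are assumptions for illustration.
    """
    return {"id": doc_id, "title": scrub(title)}
```

Because the scrubbed string is what gets sent, it is also what Solr stores and returns, so no response-time filtering is needed. If the original text has to be kept as well, it would need to go into a separate field.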

HTH,
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 15 Apr 2019, at 15:50, Ken Wiltshire <[hidden email]> wrote:
> [...]