How to implement NOTIN operator with Solr

How to implement NOTIN operator with Solr

Raboah, Avi
I am trying to find the documents which hit this example:

q=text:"credit" NOTIN "credit card"

For that query, I want to get all the documents which contain the term "credit" but not as part of the phrase "credit card".

So:

1. I don't want the documents which contain only "credit card".

2. I want the documents which contain just "credit".

3. I want the documents which contain "credit" somewhere outside the phrase "credit card".



for example:

doc1 text: "I want to buy with my credit in my card"

doc2 text: "I want to buy with my credit in my credit card"

doc3 text: "I want to buy with my credit card"

The documents that should be returned:

doc1, doc2

I can't find anything about a NOTIN operator implementation in the Solr docs.



This electronic message may contain proprietary and confidential information of Verint Systems Inc., its affiliates and/or subsidiaries. The information is intended to be for the use of the individual(s) or entity(ies) named above. If you are not the intended recipient (or authorized to receive this e-mail for the intended recipient), you may not use, copy, disclose or distribute to anyone this message or any information contained in this message. If you have received this electronic message in error, please notify us by replying to this e-mail.

Re: How to implement NOTIN operator with Solr

Vincenzo D'Amore
This is a tricky problem: you're trying to handle meaning using words. A simple solution could be to apply a synonym filter that converts "credit card" into two terms, "creditcard" and "card". That way, searching for "credit" will not match the terms produced from "credit card".
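
As a rough illustration of that idea, here is a minimal Python sketch that mimics the index-time rewrite outside of Solr. The whitespace tokenizer and the analyze() helper are assumptions made for the example; in Solr the rewrite would be done by a synonym filter in the field's analysis chain, not by code like this.

```python
def analyze(text, phrase=("credit", "card"), replacement=("creditcard", "card")):
    """Lowercase, split on whitespace, and rewrite each occurrence of `phrase`.

    Hypothetical stand-in for an index-time synonym filter.
    """
    tokens = text.lower().split()
    out, i = [], 0
    while i < len(tokens):
        if tuple(tokens[i:i + len(phrase)]) == phrase:
            out.extend(replacement)  # "credit card" -> "creditcard", "card"
            i += len(phrase)
        else:
            out.append(tokens[i])
            i += 1
    return out


docs = {
    "doc1": "I want to buy with my credit in my card",
    "doc2": "I want to buy with my credit in my credit card",
    "doc3": "I want to buy with my credit card",
}

# "Index" the rewritten tokens, then run a plain term search for "credit".
index = {doc_id: analyze(text) for doc_id, text in docs.items()}
hits = sorted(doc_id for doc_id, tokens in index.items() if "credit" in tokens)
print(hits)  # ['doc1', 'doc2']
```

Only a standalone "credit" survives as the token "credit", so doc3 no longer matches a search for that term.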

Ciao,
Vincenzo


> On 19 Nov 2019, at 10:30, Raboah, Avi <[hidden email]> wrote:
>
> I am trying to find the documents which hit this example:
>
> q=text:"credit" NOTIN "credit card"

Re: How to implement NOTIN operator with Solr

Emir Arnautović
In reply to this post by Raboah, Avi
Hi Avi,
There are span queries, but in this case you don't need them. It is enough to simply filter out documents that contain "credit card". Your query can be something like
+text:credit -text:"credit card"
If you prefer using boolean operators, you can write it as:
text:credit AND NOT text:"credit card"

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/




RE: How to implement NOTIN operator with Solr

Raboah, Avi
In that case I got only doc1
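
That result is what document-level exclusion would be expected to give: the negative clause drops every document that contains the phrase anywhere, including doc2, where "credit" also appears on its own. The following Python sketch models this with a plain whitespace tokenizer; the helper names are hypothetical and only imitate the boolean logic, not Solr's actual scoring or analysis.

```python
def tokens(text):
    """Lowercase and split on whitespace (a simplified analyzer)."""
    return text.lower().split()


def has_phrase(toks, phrase):
    """True if the words of `phrase` occur consecutively in `toks`."""
    n = len(phrase)
    return any(toks[i:i + n] == phrase for i in range(len(toks) - n + 1))


docs = {
    "doc1": "I want to buy with my credit in my card",
    "doc2": "I want to buy with my credit in my credit card",
    "doc3": "I want to buy with my credit card",
}

# Model of: +text:credit -text:"credit card"
boolean_hits = sorted(
    doc_id
    for doc_id, text in docs.items()
    if "credit" in tokens(text)
    and not has_phrase(tokens(text), ["credit", "card"])
)
print(boolean_hits)  # ['doc1']
```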


Re: How to implement NOTIN operator with Solr

Alexandre Rafalovitch
I think the main question here is whether the compound word "credit card"
is always the same. If yes, you can preprocess it during indexing into
something unique and discard it (see Vincenzo's reply). You could even
use a copyField and process the copy to leave only the standalone word
"credit" in it, so it basically serves as a boolean presence marker.

But if it can change for every search, you have to do it at query time
only. I suspect span queries can detect something like this, but I don't
have a reference example. I suspect it would be with either:
*) Surround Query Parser:
https://lucene.apache.org/solr/guide/8_3/other-parsers.html#surround-query-parser
or directly with
*) XML Query Parser:
https://lucene.apache.org/solr/guide/8_3/other-parsers.html#xml-query-parser

Once you have figured out the syntax, you should be able to substitute
values with variables and perhaps even push the long syntax into a
separate Query Handler, so you just pass "yes word" and "no phrase" to
Solr and have it construct the longer query.
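
One way to picture the substitution step is a small client-side helper that takes the "yes word" and the "no phrase" and builds the query string. The Python sketch below assumes the complexphrase form that appears later in this thread and only handles a two-word phrase that starts with the wanted word; the function name notin_params and the default field name are hypothetical.

```python
from urllib.parse import urlencode


def notin_params(word, phrase, field="text"):
    """Build request parameters for 'word, but not as part of phrase'.

    Hypothetical helper: only supports a two-word phrase whose first
    word is `word`, matching the example discussed in this thread.
    """
    words = phrase.split()
    if len(words) != 2 or words[0] != word:
        raise ValueError("expected a two-word phrase starting with the word")
    query = '{!complexphrase df=%s}"%s -%s"' % (field, word, words[1])
    return {"q": query}


params = notin_params("credit", "credit card")
print(params["q"])        # {!complexphrase df=text}"credit -card"
print(urlencode(params))  # URL-encoded form to append to a select request
```

Whether the same template generalizes to longer phrases is untested here, so other inputs are rejected rather than guessed at.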

Please do let us know when you figure it out. I think other people
were interested in a similar problem before.

Regards,
   Alex.


Re: How to implement NOTIN operator with Solr

Emir Arnautović
In reply to this post by Raboah, Avi
Right, I didn't read all your examples. In that case you can use span queries. Here the complexphrase query parser should do the trick:
{!complexphrase df=text}"credit -card"
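
To see why a position-level exclusion behaves differently from the earlier boolean query, here is a small Python sketch. It assumes the negated term inside the phrase acts as a span-style "not": an occurrence of "credit" counts unless it is immediately followed by "card". That assumption matches the result reported in this thread, but the code only imitates the idea with a whitespace tokenizer and hypothetical helper names; it is not the parser's implementation.

```python
def surviving_positions(text, word="credit", following="card"):
    """Positions of `word` that are not immediately followed by `following`."""
    toks = text.lower().split()
    return [
        i
        for i, tok in enumerate(toks)
        if tok == word and toks[i + 1:i + 2] != [following]
    ]


docs = {
    "doc1": "I want to buy with my credit in my card",
    "doc2": "I want to buy with my credit in my credit card",
    "doc3": "I want to buy with my credit card",
}

# A document matches if at least one occurrence of "credit" survives.
span_hits = sorted(doc_id for doc_id, text in docs.items()
                   if surviving_positions(text))
print(span_hits)  # ['doc1', 'doc2']
```

In doc2 the first "credit" survives even though the second one is part of "credit card", so the document is kept; in doc3 the only occurrence is excluded.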

Regards,
Emir

RE: How to implement NOTIN operator with Solr

Raboah, Avi
It's working!!! thanks a lot :)
