Is it possible to have different Stop words depending on the value of a field?

yeikel
Hi,


I have an index that stores addresses from different countries.


As every country has different stop words, I was wondering if it is possible to apply a different set of stop words depending on the value of a field. 


Or do I need different indexes, or do I need to do it at the ETL step, to accomplish this?


Re: Is it possible to have different Stop words depending on the value of a field?

Jörn Franke
You can have different fields by country. I am not sure about your stop words, but if they do not occur in the other languages, then you do not have a problem.
On the other hand, if you need more than stop words (e.g. lemmatization, specialized tokenization, etc.), then you need a different field per language. You don’t describe your full use case, but if you have different fields for different languages, then your client application needs to handle this (not difficult, but you have to be aware of it).
I am not sure whether you need to search a given address in all languages or whether you use the language of the user, etc.
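
As a rough sketch of what handling per-country fields in the client could look like: the naming scheme `address_<country>` and the helper names below are assumptions for illustration, not something defined in this thread. `defType`, `q`, and `qf` are ordinary dismax request parameters.

```python
def country_field(country: str) -> str:
    """Name of the per-country address field (hypothetical naming scheme)."""
    return f"address_{country.strip().lower()}"


def route_document(doc: dict) -> dict:
    """Copy a document, moving `address` into its per-country field."""
    routed = {k: v for k, v in doc.items() if k != "address"}
    routed[country_field(doc["country"])] = doc["address"]
    return routed


def query_params(address: str, country: str) -> dict:
    """dismax parameters that search only the matching per-country field."""
    return {"defType": "dismax", "q": address, "qf": country_field(country)}


doc = {"address": "123 main Street", "country": "US"}
print(route_document(doc))
print(query_params("123 main street", "US"))
```

The same `country_field` helper is used at index time and at query time, so the two sides cannot drift apart.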

> On 02.12.2019 at 20:13, yeikel valdes <[hidden email]> wrote:

Re: Is it possible to have different Stop words depending on the value of a field?

Walter Underwood
The best approach is to not use stop words at all. That gives better relevance with less configuration, so it is a total win.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

> On Dec 2, 2019, at 12:24 PM, Jörn Franke <[hidden email]> wrote:

RE: Is it possible to have different Stop words depending on the value of a field?

yeikel
In reply to this post by Jörn Franke
To clarify, a document would look like this:

{
  "address": "123 main Street",
  "country": "US"
}

When I configure my index, I'd like to apply a different set of stop words to the address field depending on the value of country. For example, something like this:

if (country == "US") -> File1
else if (country == "UK") -> File2

etc.

Hopefully, that clarifies.
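
If this were done at the ETL step instead, the country-to-stop-word mapping could be a plain lookup. The following is only a sketch: the stop word sets are made-up placeholders standing in for File1 and File2, and the whitespace tokenization is far simpler than a real analyzer.

```python
# Hypothetical stop word sets, standing in for File1 / File2.
STOP_WORDS = {
    "US": {"street", "st", "avenue", "ave"},
    "UK": {"road", "rd", "close"},
}


def strip_stop_words(address: str, country: str) -> str:
    """Drop the stop words configured for `country` from `address`."""
    stops = STOP_WORDS.get(country.upper(), set())
    return " ".join(t for t in address.split() if t.lower() not in stops)


print(strip_stop_words("123 main Street", "US"))  # 123 main
print(strip_stop_words("123 main Street", "UK"))  # 123 main Street
```

With this approach the same filtering would have to be applied to the query text as well, otherwise the removed words would still count as query terms.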

Re: Is it possible to have different Stop words depending on the value of a field?

David Hastings
It clarifies, yes. You need new fields. In this case, something like
address_us
address_uk
and index and search them accordingly, with different stopword files used in different field types; hence a copyField from “address” into as many new fields as needed.
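
A minimal sketch of that setup, expressed as a payload for Solr's Schema API and built in Python. The field type names, field names, and stop word file names are assumptions for illustration; the script only prints the JSON and has not been run against a Solr instance.

```python
import json


def field_type(code: str) -> dict:
    """Text field type whose analyzer loads a per-country stop word file."""
    return {
        "name": f"text_address_{code}",
        "class": "solr.TextField",
        "analyzer": {
            "tokenizer": {"class": "solr.StandardTokenizerFactory"},
            "filters": [
                {"class": "solr.LowerCaseFilterFactory"},
                {
                    "class": "solr.StopFilterFactory",
                    "words": f"stopwords_{code}.txt",  # hypothetical file name
                    "ignoreCase": "true",
                },
            ],
        },
    }


def schema_commands(countries: list) -> dict:
    """Schema API commands: one field type and one field per country."""
    codes = [c.lower() for c in countries]
    return {
        "add-field-type": [field_type(c) for c in codes],
        "add-field": [
            {"name": f"address_{c}", "type": f"text_address_{c}",
             "indexed": True, "stored": False}
            for c in codes
        ],
        # copyField sends every address to every per-country field;
        # the country is applied at query time by choosing the field.
        "add-copy-field": {
            "source": "address",
            "dest": [f"address_{c}" for c in codes],
        },
    }


print(json.dumps(schema_commands(["US", "UK"]), indent=2))
```

A US query would then search only `address_us` (for example via `qf`), so only the US stop word list is applied to it.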

> On Dec 2, 2019, at 7:33 PM, <[hidden email]> <[hidden email]> wrote:

RE: Is it possible to have different Stop words depending on the value of a field?

yeikel
That makes sense, thank you for the clarification!

@[hidden email] If you can, please build on your explanation as it sounds relevant.
-----Original Message-----
From: Dave <[hidden email]>
Sent: Monday, December 2, 2019 7:38 PM
To: [hidden email]
Cc: [hidden email]
Subject: Re: Is it possible to have different Stop words depending on the value of a field?


Re: Is it possible to have different Stop words depending on the value of a field?

David Hastings
I’ll add to that since I’m up. Stopwords are, in a practical sense, useless and serve no purpose. They are an old way to save index size that’s not needed any more. You’d need a very specific use case to want to use them. Maybe you do, but generally you never do unless it’s for training a machine or something a bit more on the experimental side. If you can explain *why* you think you need stop words, that would be helpful in perhaps guiding you to an alternative.

> On Dec 2, 2019, at 7:45 PM, <[hidden email]> <[hidden email]> wrote:

RE: Is it possible to have different Stop words depending on the value of a field?

yeikel
Thank you for jumping in @[hidden email]

I have an index with raw addresses in a nonstandardized format such as "123 main street" or "main street 123", and I am looking to search this index and pull the closest addresses from another raw input with a similar unpredictable format. Ideally, I am trying to reduce the number of results as much as possible because of time constraints.

At the moment, I am launching a dismax query with the mm (minimum should match) parameter set to a value I am comfortable with (say, 50%).

For an address such as "123 main street CA 90201 US", if I execute a query such as "return addresses that match 50% of the tokens" (dismax, with mm set to 50%), I will potentially get records such as "US Street 123" or "main street CA", which is not what I am looking for. I understand that I could increase the mm parameter to, say, 100%, but then I am not sure whether the token "street" should be considered when calculating mm, as I could miss a record such as "123 main CA 90201 US".
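
To make the arithmetic concrete, here is a small sketch that approximates the mm check with lowercased, whitespace-split tokens. This is a simplification: real Solr analysis and dismax clause counting may differ, and `matched_fraction` is a hypothetical helper.

```python
def matched_fraction(query: str, candidate: str) -> float:
    """Share of distinct query tokens that also occur in the candidate."""
    q = set(query.lower().split())
    c = set(candidate.lower().split())
    return len(q & c) / len(q)


query = "123 main street CA 90201 US"  # 6 distinct tokens
for record in ["US Street 123", "main street CA", "123 main CA 90201 US"]:
    print(f"{record!r}: {matched_fraction(query, record):.2f}")
```

The first two records each share 3 of the 6 query tokens, so they pass a 50% threshold, while the third shares 5 of 6 and would be rejected at 100%.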

For longer addresses, the relevance of "main" or "street" is much lower than that of keywords such as the apartment number or the city.

I am not sure if this is the right way to search for unstructured addresses, so we are open to suggestions.

Thank you

-----Original Message-----
From: Dave <[hidden email]>
Sent: Monday, December 2, 2019 7:50 PM
To: [hidden email]
Cc: [hidden email]; [hidden email]
Subject: Re: Is it possible to have different Stop words depending on the value of a field?


Re: Is it possible to have different Stop words depending on the value of a field?

Paras Lehana
Hi Yeikel,

I want to stress three things:

   1. If you know the probable words which can be written in different ways
   (like street), you can use Synonyms.

   2. Longer queries can have different mm values. The mm parameter supports
   different values for different query lengths. We generally require a 100%
   match for 2 words, words-1 for more than 2 words, and 70% for more than
   7 words.

   3. The returned numDocs should not heavily impact your response time.
   You can always use the rows parameter to shrink the result set. Is your
   issue with the ranking of documents or the number of documents? Please
   give examples of results that you don't want a query to fetch.
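
One reading of the scheme in point 2, written in mm's conditional syntax, is `2<-1 7<70%`. That string is an interpretation of the description above, not a value quoted from a real configuration. The sketch below evaluates such a spec following the documented mm semantics (percentages rounded down, negative values counted from the total); it is not Solr's own code and assumes thresholds in ascending order with no spaces around `<`.

```python
def required_clauses(clause_count: int, spec: str) -> int:
    """Number of optional clauses that must match under an mm spec."""

    def simple(expr: str) -> int:
        if expr.endswith("%"):
            pct = int(expr[:-1])
            part = abs(clause_count * pct) // 100  # rounded down
            need = part if pct >= 0 else clause_count - part
        else:
            n = int(expr)
            need = n if n >= 0 else clause_count + n
        return max(0, min(need, clause_count))

    if "<" not in spec:
        return simple(spec.strip())

    need = clause_count  # at or below the lowest threshold: all required
    for cond in spec.split():
        bound, expr = cond.split("<", 1)
        if clause_count <= int(bound):
            break
        need = simple(expr)
    return need


for n in (2, 3, 7, 8, 10):
    print(n, required_clauses(n, "2<-1 7<70%"))
```

For the six-token address discussed earlier, this spec would require five matching tokens, which admits "123 main CA 90201 US" but not "US Street 123".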


On Tue, 3 Dec 2019 at 10:13, <[hidden email]> wrote:


--
Regards,

*Paras Lehana* [65871]
Development Engineer, Auto-Suggest,
IndiaMART Intermesh Ltd.

8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
Noida, UP, IN - 201303

Mob.: +91-9560911996
Work: 01203916600 | Extn:  *8173*


Re: Is it possible to have different Stop words depending on the value of a field?

Emir Arnautović
Hi,
I’ve spent quite a lot of time working on a similar issue, but I have not thought about it much since (at the time it was Solr 1.3), so some newer features could push me in a different direction. Here is what I remember:

You cannot rely on users entering a standardised address format, even within one country. Users will use both abbreviations and full names. If you need to support Japan - good luck. India is a similar story. You might want to preprocess the input and do some entity extraction and parsing at both index time and query time. Solr scoring is not good enough for addresses: it is good for giving you candidates, but after that you need to apply a custom scoring function on either the Solr side or the client side. If you are able to use a full-blown geocoder, use it at both index and query time. You can even store multiple geocoding results with scores and use those scores to calculate the final score. The good thing is that Solr has many extension points, and I’ve used almost all of them. Unfortunately, those were proprietary plugins, and I was not able to persuade the client to open-source them.
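
As an illustration of the client-side option only, a re-scoring pass over Solr's candidates could look like the sketch below. The weights, the `GENERIC` token list, and the weighted-overlap formula are all assumptions chosen for the example; they are not the proprietary scoring mentioned above.

```python
# Hypothetical low-value tokens; a real list would be per country.
GENERIC = {"street", "st", "road", "rd", "avenue", "ave"}


def token_weight(token: str) -> float:
    """Weight numbers (house numbers, postal codes) above generic words."""
    if any(ch.isdigit() for ch in token):
        return 3.0
    if token in GENERIC:
        return 0.5
    return 1.0


def address_score(query: str, candidate: str) -> float:
    """Weighted overlap of the two token sets, between 0 and 1."""
    q = set(query.lower().split())
    c = set(candidate.lower().split())
    total = sum(token_weight(t) for t in q | c)
    if total == 0:
        return 0.0
    return sum(token_weight(t) for t in q & c) / total


def rerank(query: str, candidates: list) -> list:
    """Order candidates returned by the search engine by the custom score."""
    return sorted(candidates, key=lambda c: address_score(query, c), reverse=True)


hits = ["US Street 123", "main street CA", "123 main CA 90201 US"]
print(rerank("123 main street CA 90201 US", hits))
```

Here Solr would be queried with a loose mm to collect candidates, and the client would keep only the top few after re-scoring.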

Good Luck,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 4 Dec 2019, at 08:13, Paras Lehana <[hidden email]> wrote: