Odd Edge Case for SpellCheck

Moyer, Brett
Hello, we have spellcheck running, using the index as the dictionary. An odd use case came up today, and I wanted to get your thoughts and see if what we determined is correct. Use case: a user sends a query for q=brokerage; spellcheck fires and returns "brokerage". Looking at the output, I see that Solr must have reduced the query to the root word "brokage", and then spellcheck said, hey, I need to fix that. Is that correct? There's no issue; it's just an unexpected outcome. Thanks!

"q":"brokerage",
"spellcheck":{
    "suggestions":
    [
      {"name":"brokage",{
        "type":"str","value":"numFound":1,
        "startOffset":0,
        "endOffset":9,
        "suggestion":["brokerage"]}}],
    "collations":
    [
      {"name":"collation","type":"str","value":"brokerage"}]}}

Brett Moyer
*************************************************************************
This e-mail may contain confidential or privileged information.
If you are not the intended recipient, please notify the sender immediately and then delete it.

TIAA
*************************************************************************

Re: Odd Edge Case for SpellCheck

Jörn Franke
Stemming involved?

> On 22.11.2019 at 14:23, Moyer, Brett <[hidden email]> wrote:
> [...]

RE: Odd Edge Case for SpellCheck

Moyer, Brett
Yes, we are stemming. Ahh, so we shouldn't stem the words we want spellchecked?

Brett Moyer

-----Original Message-----
From: Jörn Franke <[hidden email]>
Sent: Friday, November 22, 2019 8:34 AM
To: [hidden email]
Subject: Re: Odd Edge Case for SpellCheck

Stemming involved?

Re: Odd Edge Case for SpellCheck

Erick Erickson
If you’re using direct spell checking, it looks at the _indexed_ terms. This means that if you’re stemming, the spellchecker works with the stemmed terms. Usually you should use a copyField to a field with minimal analysis and use that field for spellchecking.

Another way to think about it: if you run a term through the admin/analysis page for a field, the terms in the dictionary are what comes out at the end of the index side of the page.

Best,
Erick
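
A minimal sketch of the copyField approach described above, as it might look in the schema and in solrconfig.xml. The field names (content, content_spell), the type name (text_spell), and the choice of tokenizer and filter are assumptions for illustration only; they are not taken from the original poster's configuration, and the sketch assumes DirectSolrSpellChecker is the spellchecker in use.

```xml
<!-- Schema: a field type with minimal analysis (tokenize and lowercase, no stemming). -->
<!-- "text_spell" is a hypothetical type name. -->
<fieldType name="text_spell" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical names: "content" stands for the existing stemmed search field, -->
<!-- "content_spell" for the unstemmed copy used only as the spellcheck dictionary. -->
<field name="content_spell" type="text_spell" indexed="true" stored="false" multiValued="true"/>
<copyField source="content" dest="content_spell"/>

<!-- solrconfig.xml: point the spellchecker at the unstemmed field, and analyze -->
<!-- the incoming query terms with the same unstemmed type. -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text_spell</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="field">content_spell</str>
  </lst>
</searchComponent>
```

Because copyField copies the raw source text before analysis, the terms indexed in content_spell stay unstemmed. Documents need to be reindexed before the new field contains any terms.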

> On Nov 25, 2019, at 4:02 PM, Moyer, Brett <[hidden email]> wrote:
>
> Yes, we are stemming. Ahh, so we shouldn't stem the words we want spellchecked?

RE: Odd Edge Case for SpellCheck

Moyer, Brett
This is a great help, thank you!

Brett Moyer

-----Original Message-----
From: Erick Erickson <[hidden email]>
Sent: Monday, November 25, 2019 4:12 PM
To: [hidden email]
Subject: Re: Odd Edge Case for SpellCheck

[...]