Spellchecker -File based vs Index based


Ashish Bisht
Hi,

I am seeing a difference between the file-based and index-based spellcheck
implementations.

Using index-based:
http://<Box>:8983/solr/SCSpell/spell?q=intnet%20of%20things&defType=edismax&qf=spellcontent&wt=json&rows=0&spellcheck=true&spellcheck.dictionary=default&q.op=AND


  "suggestions":[
      "intnet",{
        "numFound":10,
        "startOffset":0,
        "endOffset":6,
        "origFreq


Suggestions are built only for the misspelled word.


But when using file-based, suggestions are also built for correctly spelled
words, which messes up the collations:

http://<Box>:8983/solr/SCSpell/spell?q=intnet%20of%20things&defType=edismax&qf=spellcontent&wt=json&rows=0&spellcheck=true&spellcheck.dictionary=file&q.op=AND

        "suggestion":["internet",
          "contnet",
          "intel",
          "intent",
          "intert",
          "intelect",
          "intended",
          "intented",
          "interest",
          "botnets"]},
      "of",{
        "numFound":8,
        "startOffset":7,
        "endOffset":9,
        "suggestion":["ofc",
          "off",
          "ohf",
         .....
          "soft"]},
      "things",{
        "numFound":10,
        "startOffset":10,
        "endOffset":16,
        "suggestion":["thing",
          "brings",
          "think",
          "thinkers",
          .....



Is there any property in the file-based spellchecker that I can use to fix this?
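
For anyone comparing the two responses programmatically: the `suggestions` value shown above is a flat list that alternates between a term and its details object. A minimal Python sketch, assuming that layout (the helper name `pair_suggestions` and the shortened sample data are illustrative, not part of Solr), lists which query terms received suggestions:

```python
def pair_suggestions(flat):
    """Turn a flat [term, details, term, details, ...] list into a dict."""
    if len(flat) % 2 != 0:
        raise ValueError("expected alternating term/details entries")
    # Note: a term that occurs twice in the query would overwrite itself here.
    return {flat[i]: flat[i + 1] for i in range(0, len(flat), 2)}


# Shortened sample modeled on the file-based response in this post.
sample = [
    "intnet", {"numFound": 10, "startOffset": 0, "endOffset": 6,
               "suggestion": ["internet", "contnet", "intel"]},
    "of", {"numFound": 8, "startOffset": 7, "endOffset": 9,
           "suggestion": ["ofc", "off", "ohf"]},
    "things", {"numFound": 10, "startOffset": 10, "endOffset": 16,
               "suggestion": ["thing", "brings", "think"]},
]

by_term = pair_suggestions(sample)
for term, details in by_term.items():
    print(term, "->", details["suggestion"])
```

With the index-based dictionary the same helper would be expected to yield only the `intnet` key, which makes the difference between the two dictionaries easy to spot.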



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Spellchecker -File based vs Index based

Erick Erickson
Two guesses:
1> you have something different in your spellcheck config vs. your index config.
2> you don’t have the word in your file for the file-based spellcheck, thus
     Solr has no way of knowing the word is correctly spelled.
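
The second guess can be checked outside Solr. The following is a rough Python sketch, assuming the source file holds one word per line; the helper names `load_words` and `missing_terms` are made up for illustration, and the comparison is exact and case-sensitive, so it does not model any analysis Solr may apply to the dictionary or the query:

```python
def load_words(lines):
    """Build a set of dictionary words from an iterable of lines, skipping blanks."""
    return {line.strip() for line in lines if line.strip()}


def missing_terms(query, words):
    """Return the whitespace-separated query terms that are absent from the word set."""
    return [term for term in query.split() if term not in words]


# Stand-in for the file contents; with the real file one would pass
# open("spellings.txt", encoding="utf-8") instead of this list.
words = load_words(["internet\n", "of\n", "things\n", "\n"])
print(missing_terms("intnet of things", words))  # prints ['intnet']
```

If a correctly spelled query term shows up in the output when run against the real file, the dictionary is missing that word.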

> On Mar 17, 2019, at 11:56 PM, Ashish Bisht <[hidden email]> wrote:

Re: Spellchecker -File based vs Index based

Ashish Bisht
The spellcheck configuration is the default one:

<lst name="spellchecker">
    <str name="classname">solr.FileBasedSpellChecker</str>
    <str name="name">file</str>
    <str name="sourceLocation">spellings.txt</str>
    <str name="characterEncoding">UTF-8</str>
    <str name="spellcheckIndexDir">./spellcheckerFile</str>
</lst>


<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <str name="spellcheck.dictionary">default</str>
      <str name="spellcheck.dictionary">jkdefault</str>
      <str name="spellcheck.dictionary">file</str>
      <str name="spellcheck">on</str>
      <str name="spellcheck.extendedResults">true</str>
      <str name="spellcheck.count">10</str>
      <str name="spellcheck.alternativeTermCount">5</str>
      <str name="spellcheck.maxResultsForSuggest">5</str>
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.maxCollations">10</str>
      <str name="spellcheck.collateExtendedResults">true</str>
      <str name="spellcheck.maxCollationTries">10</str>
      <str name="spellcheck.maxCollations">5</str>
    </lst>
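
One detail in the handler parameters above is that `spellcheck.maxCollations` is listed twice, with the values 10 and 5. A small Python sketch for spotting such repeats follows; the `<lst name="defaults">` wrapper in the fragment is an assumption added so the excerpt parses, and `spellcheck.dictionary` is treated here as intentionally repeated:

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Well-formed excerpt of the parameters posted above; the enclosing
# <lst name="defaults"> wrapper is an assumption made so the snippet parses.
FRAGMENT = """
<lst name="defaults">
  <str name="spellcheck.dictionary">default</str>
  <str name="spellcheck.dictionary">jkdefault</str>
  <str name="spellcheck.dictionary">file</str>
  <str name="spellcheck.collate">true</str>
  <str name="spellcheck.maxCollations">10</str>
  <str name="spellcheck.maxCollationTries">10</str>
  <str name="spellcheck.maxCollations">5</str>
</lst>
"""

# Parameters assumed to be listed more than once on purpose.
MULTI_VALUED = {"spellcheck.dictionary"}

root = ET.fromstring(FRAGMENT)
counts = Counter(el.get("name") for el in root.iter("str"))
duplicates = sorted(
    name for name, count in counts.items()
    if count > 1 and name not in MULTI_VALUED
)
print(duplicates)  # prints ['spellcheck.maxCollations']
```

Which of the two values takes effect is not established in this thread, so keeping a single entry avoids the ambiguity.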

Also, the words are present in the file. For example, the word "things", which
gets corrected, is present in the file, and the suggestions offered for it are
present there as well.

*I don't want suggestions for the correctly spelled words (of, things). Is
there any problem with the request? I tried two combinations:*

1. /spell?spellcheck.q=intnet of things&spellcheck=true&spellcheck.collateParam.q.op=AND&df=spellcontent&spellcheck.dictionary=file

2. /spell?q=intnet of things&defType=edismax&qf=spellcontent&wt=json&rows=0&spellcheck=true&spellcheck.dictionary=file&q.op=AND
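
To rule out encoding problems in the request itself (unescaped spaces, a stray `&&`), the second request above can be built from a parameter map. This is only a sketch: the host `localhost` is a placeholder, since the thread gives the machine only as `<Box>`.

```python
from urllib.parse import quote, urlencode

# Placeholder host; the thread only gives "<Box>".
BASE = "http://localhost:8983/solr/SCSpell/spell"

params = {
    "q": "intnet of things",
    "defType": "edismax",
    "qf": "spellcontent",
    "wt": "json",
    "rows": 0,
    "spellcheck": "true",
    "spellcheck.dictionary": "file",
    "q.op": "AND",
}

# quote_via=quote encodes spaces as %20 rather than "+".
url = BASE + "?" + urlencode(params, quote_via=quote)
print(url)
```

Switching `spellcheck.dictionary` to `default` in the map gives the index-based request with otherwise identical parameters, which keeps the comparison fair.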

Please suggest.


