Spell check with data from database and not from english dictionary


seeteshh
Hello all,

Can the spell check feature be configured with words/data fetched from a
database and not from the English dictionary?

Regards,

Seetesh Hindlekar



Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Spell check with data from database and not from english dictionary

Alessandro Benedetti
Hi Seetesh,
As you can see from the Reference Guide [1], there are mainly two input sources for a
spellcheck dictionary:
1) a file
2) the index (in a couple of different forms)

If you prefer the file approach, producing the file is up to you, and you
can populate it from whatever source you like: the English dictionary or a
database.
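
To illustrate the file approach, here is a minimal sketch of exporting terms from a database into a plain-text dictionary, one term per line, which is the layout the file-based spellchecker reads. The table name `products`, the column `name`, the output file `spellings.txt` and the use of SQLite are all assumptions made for the example, not details from this thread; substitute your own database driver and query.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def export_spellcheck_dictionary(conn, out_path, query="SELECT name FROM products"):
    """Write the distinct, non-empty values returned by `query` to `out_path`,
    one term per line, and return the number of terms written.

    The default query is hypothetical; pass the one that fits your schema.
    """
    terms = set()
    for (value,) in conn.execute(query):
        if value is None:
            continue
        term = str(value).strip()
        if term:
            terms.add(term)
    ordered = sorted(terms)
    Path(out_path).write_text("".join(t + "\n" for t in ordered), encoding="utf-8")
    return len(ordered)


if __name__ == "__main__":
    # Hypothetical in-memory database standing in for the real one.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT)")
    conn.executemany(
        "INSERT INTO products VALUES (?)",
        [("sea",), ("the",), ("sea",), (None,)],
    )
    out_file = os.path.join(tempfile.mkdtemp(), "spellings.txt")
    count = export_spellcheck_dictionary(conn, out_file)
    print(f"wrote {count} terms to {out_file}")
```

The resulting file can then be referenced from the spellcheck component configuration described in [1].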


[1] https://lucene.apache.org/solr/guide/8_4/spell-checking.html
--------------------------
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
www.sease.io


On Thu, 23 Jan 2020 at 06:06, seeteshh <[hidden email]> wrote:

> Can the spell check feature be configured with words/data fetched from a
> database and not from the English dictionary?

Re: Spell check with data from database and not from english dictionary

seeteshh
Hello Alessandro,

Thanks for your post.

That is exactly my concern with generating a file or reindexing: the data
in the database keeps changing (records are added and updated).

Can you share any links that show how to configure this on Solr 8.4?

Regards,

Seetesh Hindlekar




Re: Spell check with data from database and not from english dictionary

Jan Høydahl / Cominvent
You could create a job in your application that generates a new spellcheck dictionary from the DB every day, uploads it to ZooKeeper, and then calls RELOAD on the collection(s). Alternatively, you could investigate writing a ManagedSpellchecker implementation of the spellcheck component that exposes a REST API for adding to the dictionary. Either way, there is no need for SQL complexity inside Solr.
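
A rough sketch of the upload-and-reload part of such a job, assuming the dictionary file already exists locally. It shells out to Solr's `bin/solr zk cp` command and then calls the Collections API RELOAD action. The configset name, ZooKeeper address, Solr base URL, collection name and the target file name `spellings.txt` are placeholders, not values from this thread.

```python
import subprocess
import urllib.parse
import urllib.request


def zk_upload_command(local_file, config_name, zk_host, solr_bin="bin/solr"):
    """Build the `bin/solr zk cp` command that copies the dictionary into
    the named configset in ZooKeeper (target file name is an assumption)."""
    dest = f"zk:/configs/{config_name}/spellings.txt"
    return [solr_bin, "zk", "cp", f"file:{local_file}", dest, "-z", zk_host]


def reload_url(solr_base_url, collection):
    """Build the Collections API URL that reloads `collection`."""
    params = urllib.parse.urlencode({"action": "RELOAD", "name": collection})
    return f"{solr_base_url.rstrip('/')}/admin/collections?{params}"


def refresh_dictionary(local_file, config_name, zk_host, solr_base_url, collections):
    """Upload the file, then reload each collection so Solr sees the new copy."""
    subprocess.run(zk_upload_command(local_file, config_name, zk_host), check=True)
    for name in collections:
        with urllib.request.urlopen(reload_url(solr_base_url, name)) as resp:
            resp.read()  # raises on HTTP errors; the body is not inspected here


if __name__ == "__main__":
    # Print what would be executed, using placeholder values.
    print(" ".join(zk_upload_command("/tmp/spellings.txt", "myconf", "localhost:2181")))
    print(reload_url("http://localhost:8983/solr", "products"))
```

Scheduling (for example with cron) is left to the application. Depending on how the spellchecker is configured, its index may also need to be rebuilt after the reload, for example with a request that sets spellcheck.build=true.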

Jan

> On 27 Jan 2020 at 10:49, seeteshh <[hidden email]> wrote:
>
> That is exactly my concern with generating a file or reindexing: the data
> in the database keeps changing (records are added and updated).
>
> Can you share any links that show how to configure this on Solr 8.4?


Re: Spell check with data from database and not from english dictionary

seeteshh
Hello Jan

Let me work on your suggestions too.

I also have one question.

While working with the spell check component, I don't get any suggestions for
a misspelled word.

Example: in spellcheck.q, I type "Teh" instead of "The", or "saa" instead of
"sea".

{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "spellcheck.q":"Teh",
      "spellcheck":"on",
      "spellcheck.reload":"true",
      "spellcheck.build":"true",
      "_":"1580287370193",
      "spellcheck.collate":"true"}},
  "command":"build",
  "response":{"numFound":0,"start":0,"docs":[]
  },
  "spellcheck":{
    "suggestions":[],
    "collations":[]}}

As a workaround, I have to add an entry to the synonyms.txt file (teh => The).

Does Solr require at least four characters in spellcheck.q before it returns a
suggestion for a misspelled word? Is this documented in any section of the
Reference Guide? These are my own observations, and I would like to understand
the rationale behind them.

Regards,

Seetesh Hindlekar




