[jira] Created: (LUCENE-760) Spellchecker could/should use n-gram tokenizers instead of rolling its own n-gramming

JIRA jira@apache.org
Spellchecker could/should use n-gram tokenizers instead of rolling its own n-gramming
-------------------------------------------------------------------------------------

                 Key: LUCENE-760
                 URL: http://issues.apache.org/jira/browse/LUCENE-760
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Analysis
            Reporter: Otis Gospodnetic
         Assigned To: Otis Gospodnetic
            Priority: Minor


The SpellChecker.java under contrib/spellchecker currently creates its own n-grams while it builds the index used to search for alternative spelling suggestions, and then creates the corresponding n-grams again when it receives a query string/word to look up alternative spelling suggestions for.  Very clear sentence, I know.

I think it might be better if n-gram chomping could be outsourced to n-gram tokenizers that just made their way into contrib/analyzers via LUCENE-759.

If I see nods or if I don't get any nays I'll go and refactor SpellChecker.java a little bit to allow this.
SpellChecker has a page on the Wiki: http://wiki.apache.org/jakarta-lucene/SpellChecker

Thoughts?
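
For readers who have not met the term, the n-gramming described above can be sketched roughly as below. This is only an illustration under stated assumptions: the class name NGramSketch, the method ngrams, and the minGram/maxGram parameters are hypothetical, and the code is neither what SpellChecker.java actually contains nor the API of the tokenizers from LUCENE-759. It only shows that one routine could serve both the indexing side and the query side.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not the code in SpellChecker.java
// and not the n-gram tokenizer API added by LUCENE-759.
class NGramSketch {

    /** Returns every substring of word whose length is between minGram and maxGram, shortest grams first. */
    static List<String> ngrams(String word, int minGram, int maxGram) {
        if (minGram < 1 || minGram > maxGram) {
            throw new IllegalArgumentException("require 1 <= minGram <= maxGram");
        }
        List<String> grams = new ArrayList<>();
        for (int n = minGram; n <= maxGram; n++) {
            // Slide a window of width n across the word.
            for (int start = 0; start + n <= word.length(); start++) {
                grams.add(word.substring(start, start + n));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // The same call could be used when indexing a dictionary word
        // and when breaking up a misspelled query word.
        System.out.println(ngrams("word", 2, 3)); // prints [wo, or, rd, wor, ord]
    }
}
```

Sharing a single routine (or a single tokenizer) between index time and query time is the point of the proposal: the two sides cannot drift apart in how they split words.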


--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: [jira] Created: (LUCENE-760) Spellchecker could/should use n-gram tokenizers instead of rolling its own n-gramming

Patrek
Here is a positive nod!

Keep up the excellent work!

Patrick

On 12/22/06, Otis Gospodnetic (JIRA) <[hidden email]> wrote:

>
> Spellchecker could/should use n-gram tokenizers instead of rolling its own
> n-gramming

[jira] Closed: (LUCENE-760) Spellchecker could/should use n-gram tokenizers instead of rolling its own n-gramming

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Otis Gospodnetic closed LUCENE-760.
-----------------------------------

    Resolution: Won't Fix

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

