[jira] [Commented] (LUCENE-4845) Add AnalyzingInfixSuggester

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607539#comment-13607539 ]

Michael McCandless commented on LUCENE-4845:
--------------------------------------------

bq. I think it's because your FreeDB has a lot more words than my place names?

I think so.  Song titles are longer than place names :)

bq. But really there must be an infixing limit for relevance reasons alone.

I think the app can decide this.

bq. Why is it so bad, but the edge-ngrams limit ok?

I don't think either limit is OK!  In the ideal world we wouldn't require such limits due to performance/RAM issues.

But no suggester is perfect; this is why we offer multiple options.  These two approaches have different tradeoffs...
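
For readers unfamiliar with the edge-ngrams limit mentioned above, here is a minimal plain-Java sketch of what such a cap does. It is an illustration only, not the code in the patch: the class name, method name, and parameters are hypothetical, and it does not use Lucene's own n-gram filters. It assumes the limit is a maximum prefix length per token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Lucene or the patch.
final class EdgeNGramSketch {

    // Returns the leading substrings of token whose lengths run from
    // minGram to maxGram, capped at the token's own length.
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxGram, token.length());
        for (int len = Math.max(1, minGram); len <= upper; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        // A low cap keeps the number of indexed terms per token small.
        System.out.println(edgeNGrams("lucene", 1, 3));
        // A high cap indexes every prefix of the token.
        System.out.println(edgeNGrams("lucene", 1, 10));
    }
}
```

Under these assumptions, a higher cap means more indexed terms per token, which is where the performance/RAM cost comes from, while a lower cap means longer typed prefixes need some other matching strategy.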
               

> Add AnalyzingInfixSuggester
> ---------------------------
>
>                 Key: LUCENE-4845
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4845
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/spellchecker
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 5.0, 4.3
>
>         Attachments: infixSuggest.png, LUCENE-4845.patch, LUCENE-4845.patch, LUCENE-4845.patch
>
>
> Our current suggester impls do prefix matching of the incoming text
> against all compiled suggestions, but in some cases it's useful to
> allow infix matching.  E.g., Netflix does infix suggestions in their
> search box.
> I did a straightforward impl, just using a normal Lucene index, and
> using PostingsHighlighter to highlight matching tokens in the
> suggestions.
> I think this likely only works well when your suggestions have a
> strong prior ranking (weight input to build), e.g. Netflix knows
> the popularity of movies.
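
The idea in the description above can be sketched in plain Java, without Lucene. This is a hedged illustration, not the implementation in the attached patch: the class InfixSuggestSketch and its methods are hypothetical, whitespace splitting plus lowercasing stands in for a real analyzer, a linear scan stands in for the index, and the <b> markers stand in for PostingsHighlighter output. A suggestion matches when every query token is a prefix of some token in it, and matches are ordered by their prior weight.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the suggester described above; not Lucene code.
final class InfixSuggestSketch {

    // A suggestion plus its prior ranking (e.g. popularity).
    private static final class Entry {
        final String text;
        final long weight;
        Entry(String text, long weight) { this.text = text; this.weight = weight; }
    }

    private final List<Entry> entries = new ArrayList<>();

    void add(String text, long weight) {
        entries.add(new Entry(text, weight));
    }

    // Crude substitute for an analyzer: lowercase and split on whitespace.
    private static String[] tokens(String s) {
        String t = s.trim().toLowerCase(Locale.ROOT);
        return t.isEmpty() ? new String[0] : t.split("\\s+");
    }

    // True if any of the given tokens starts with the prefix.
    private static boolean anyStartsWith(String[] toks, String prefix) {
        for (String t : toks) {
            if (t.startsWith(prefix)) return true;
        }
        return false;
    }

    // Returns up to n matching suggestions, highest weight first,
    // with each matching word wrapped in <b>...</b>.
    List<String> lookup(String query, int n) {
        String[] q = tokens(query);
        List<Entry> hits = new ArrayList<>();
        if (q.length == 0) return new ArrayList<>();
        for (Entry e : entries) {
            String[] st = tokens(e.text);
            boolean all = true;
            for (String qt : q) {
                if (!anyStartsWith(st, qt)) { all = false; break; }
            }
            if (all) hits.add(e);
        }
        // Rank purely by the prior weight, descending.
        hits.sort((a, b) -> Long.compare(b.weight, a.weight));
        List<String> out = new ArrayList<>();
        for (Entry e : hits.subList(0, Math.min(n, hits.size()))) {
            out.add(highlight(e.text, q));
        }
        return out;
    }

    // Wraps every word that starts with some query token in <b>...</b>.
    private static String highlight(String text, String[] q) {
        StringBuilder sb = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (sb.length() > 0) sb.append(' ');
            boolean hit = anyStartsWith(new String[] { word.toLowerCase(Locale.ROOT) }, q[0]);
            for (int i = 1; i < q.length && !hit; i++) {
                hit = word.toLowerCase(Locale.ROOT).startsWith(q[i]);
            }
            sb.append(hit ? "<b>" + word + "</b>" : word);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        InfixSuggestSketch s = new InfixSuggestSketch();
        s.add("The Matrix", 90);
        s.add("Mad Max", 70);
        s.add("Matrix Reloaded", 50);
        System.out.println(s.lookup("mat", 5));
    }
}
```

In this sketch a query of "mat" returns "The Matrix" ahead of "Matrix Reloaded", even though "The Matrix" does not begin with the typed text. A prefix-only suggester would miss it, and without the weights there would be little basis for ordering the two hits.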
