[jira] [Resolved] (NUTCH-2209) Improved Tokenization for Similarity Scoring plugin


JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/NUTCH-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Nagel resolved NUTCH-2209.
------------------------------------
    Resolution: Done

This has already been committed (pull request merged) for Nutch 1.12.

> Improved Tokenization for Similarity Scoring plugin
> ---------------------------------------------------
>
>                 Key: NUTCH-2209
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2209
>             Project: Nutch
>          Issue Type: Improvement
>          Components: scoring
>            Reporter: Sujen Shah
>            Assignee: Sujen Shah
>            Priority: Major
>              Labels: memex
>
> This patch would add Lucene-based tokenization to the cosine similarity plugin and clean up the existing code.
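
For readers unfamiliar with the idea, here is a minimal sketch of how tokenization feeds a cosine similarity score. It is not the plugin's code: a simple regex tokenizer with a hypothetical stop-word list stands in for a Lucene analyzer, and the function names `tokenize` and `cosine_similarity` are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in"}


def tokenize(text):
    """Lowercase the text, split on non-alphanumeric characters, drop stop words.

    This is a stand-in for a real analyzer (such as Lucene's), not a
    reimplementation of one.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-frequency vectors of two texts."""
    tf_a = Counter(tokenize(text_a))
    tf_b = Counter(tokenize(text_b))

    # Dot product over the terms of text_a; a Counter returns 0 for missing terms.
    dot = sum(count * tf_b[term] for term, count in tf_a.items())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))

    # An empty vector has no direction, so define the similarity as 0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

The quality of the tokenizer directly shapes the score: without stop-word removal and case folding, frequent function words would dominate both vectors and inflate the similarity of unrelated pages.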



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)