[jira] [Commented] (LUCENE-4656) Fix EmptyTokenizer


JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543017#comment-13543017 ]

Uwe Schindler commented on LUCENE-4656:
---------------------------------------

The problem here is that the attributes are initialized after the Tokenizer is constructed, before the consumer starts consuming tokens. The bug in IndexWriter is that it fails as soon as the initial getAttribute call fails. Maybe it should instead initialize the bytesRef attribute to null and fail later, only if tokens are actually emitted.

Lucene 3.x indexed empty terms.
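
To illustrate the suggestion above, here is a minimal, self-contained sketch of a consumer that tolerates a missing term attribute and only fails once a token is actually emitted. It does not use Lucene classes: AttributeBag, TermBytes and LazyConsumer are hypothetical stand-ins invented for this example, and the real IndexWriter code path may look quite different.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an attribute source; this is not a Lucene class. */
final class AttributeBag {
    private final Map<Class<?>, Object> attributes = new HashMap<>();

    <T> void add(Class<T> type, T instance) {
        attributes.put(type, instance);
    }

    boolean hasAttribute(Class<?> type) {
        return attributes.containsKey(type);
    }

    /** Strict lookup: throws when the attribute was never registered. */
    <T> T getAttribute(Class<T> type) {
        Object attribute = attributes.get(type);
        if (attribute == null) {
            throw new IllegalArgumentException("missing attribute: " + type.getSimpleName());
        }
        return type.cast(attribute);
    }
}

/** Hypothetical term attribute holding the bytes of the current token. */
final class TermBytes {
    byte[] bytes = new byte[0];
}

/** Hypothetical consumer that defers the failure until a token is really emitted. */
final class LazyConsumer {
    // Stays null when the stream never registered a term attribute.
    private final TermBytes termBytes;

    LazyConsumer(AttributeBag source) {
        // Check first instead of calling the strict getAttribute up front.
        this.termBytes = source.hasAttribute(TermBytes.class)
                ? source.getAttribute(TermBytes.class)
                : null;
    }

    /** Processes tokenCount tokens and returns how many were indexed. */
    int consume(int tokenCount) {
        int indexed = 0;
        for (int i = 0; i < tokenCount; i++) {
            if (termBytes == null) {
                throw new IllegalStateException(
                        "stream emitted a token but has no term attribute");
            }
            indexed++;
        }
        return indexed;
    }
}

final class LazyAttributeDemo {
    public static void main(String[] args) {
        // An empty stream without a term attribute is accepted.
        LazyConsumer consumer = new LazyConsumer(new AttributeBag());
        System.out.println(consumer.consume(0));
    }
}
```

With this shape, a stream that never emits a token passes through without a term attribute, while a stream that emits tokens without one still fails, just later.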
               

> Fix EmptyTokenizer
> ------------------
>
>                 Key: LUCENE-4656
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4656
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/analysis
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Trivial
>         Attachments: LUCENE-4656.patch, LUCENE-4656.patch
>
>
> TestRandomChains can fail because EmptyTokenizer doesn't have a CharTermAttribute and doesn't compute the end offset (if the offset attribute was added by a filter).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
