[jira] [Commented] (LUCENE-4656) Fix IndexWriter working together with EmptyTokenizer and EmptyTokenStream (without CharTermAttribute), fix BaseTokenStreamTestCase

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

[jira] [Commented] (LUCENE-4656) Fix IndexWriter working together with EmptyTokenizer and EmptyTokenStream (without CharTermAttribute), fix BaseTokenStreamTestCase

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543125#comment-13543125 ]

Robert Muir commented on LUCENE-4656:
-------------------------------------

I would also say that we dont need EmptyTokenizer in test-framework.
Its only there because 2 places use it, and both in a bogus way (in my opinion):
1. core/TestDocument
2. queryparsers

we should first fix TestDocument, its test does not care if the tokenstream is empty or anything:
{noformat}
Index: src/test/org/apache/lucene/document/TestDocument.java
===================================================================
--- src/test/org/apache/lucene/document/TestDocument.java (revision 1428441)
+++ src/test/org/apache/lucene/document/TestDocument.java (working copy)
@@ -20,7 +20,7 @@
 import java.io.StringReader;
 import java.util.List;
 
-import org.apache.lucene.analysis.EmptyTokenizer;
+import org.apache.lucene.analysis.CannedTokenStream;
 import org.apache.lucene.analysis.MockAnalyzer;
 import org.apache.lucene.index.DirectoryReader;
 import org.apache.lucene.index.IndexReader;
@@ -318,7 +318,7 @@
   // LUCENE-3616
   public void testInvalidFields() {
     try {
-      new Field("foo", new EmptyTokenizer(new StringReader("")), StringField.TYPE_STORED);
+      new Field("foo", new CannedTokenStream(), StringField.TYPE_STORED);
       fail("did not hit expected exc");
     } catch (IllegalArgumentException iae) {
       // expected
{noformat}

The queryparser test looks outdated, like its some test about when an Analyzer returns null?
Maybe the test can just be removed, but if we apply this patch, we could move EmptyTokenizer
from test-framework/src/java to queryparser/src/test at least as an improvement, since it is kinda funky.

               

> Fix IndexWriter working together with EmptyTokenizer and EmptyTokenStream (without CharTermAttribute), fix BaseTokenStreamTestCase
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-4656
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4656
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/analysis
>            Reporter: Adrien Grand
>            Assignee: Uwe Schindler
>            Priority: Trivial
>         Attachments: LUCENE-4656_bttc.patch, LUCENE-4656-IW-bug.patch, LUCENE-4656-IW-fix.patch, LUCENE-4656-IW-fix.patch, LUCENE-4656.patch, LUCENE-4656.patch, LUCENE-4656.patch, LUCENE-4656.patch, LUCENE-4656.patch
>
>
> TestRandomChains can fail because EmptyTokenizer doesn't have a CharTermAttribute and doesn't compute the end offset (if the offset attribute was added by a filter).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]