[jira] Created: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()


Igor Motov (Jira)
Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()
------------------------------------------------------------------------------------------------------------------------------

                 Key: LUCENE-2110
                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Search
    Affects Versions: Flex Branch
            Reporter: Uwe Schindler
             Fix For: Flex Branch


FilteredTermsEnum is confusing because it is initially positioned on the first term. It should instead behave like an uninitialized TermsEnum for a field, i.e. be unpositioned before the first call to next() or seek().
Also document that not all FilteredTermsEnums may implement seek(), as e.g. NRQ (NumericRangeQuery) or Automaton are not able to support it. Seeking is also not needed for MTQ (MultiTermQuery) at all, so seek() can just throw UOE (UnsupportedOperationException).
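
To make the proposed contract concrete, here is a minimal sketch in plain Java. It is not Lucene code: the class name UnpositionedEnumSketch, the use of String terms, and the IllegalStateException for an unpositioned term() are assumptions made only for illustration. It shows an enum that is unpositioned until the first next() and whose seek() throws UOE.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Toy model (not Lucene's API): an enum that is unpositioned until next() is called. */
class UnpositionedEnumSketch {
    private final Iterator<String> terms;
    private String current; // null while unpositioned or exhausted

    UnpositionedEnumSketch(List<String> sortedTerms) {
        this.terms = sortedTerms.iterator();
    }

    /** Advances to the next term; returns null when the enum is exhausted. */
    String next() {
        current = terms.hasNext() ? terms.next() : null;
        return current;
    }

    /** Only valid after next() has returned a non-null term. */
    String term() {
        if (current == null) {
            throw new IllegalStateException("not positioned: call next() first");
        }
        return current;
    }

    /** A filtering enum does not have to support seeking. */
    String seek(String target) {
        throw new UnsupportedOperationException("seek() is not supported");
    }

    public static void main(String[] args) {
        UnpositionedEnumSketch e =
            new UnpositionedEnumSketch(Arrays.asList("apple", "banana"));
        // Iterator-style consumption: next() is always called first.
        for (String t = e.next(); t != null; t = e.next()) {
            System.out.println(t);
        }
    }
}
```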

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785645#action_12785645 ]

Uwe Schindler commented on LUCENE-2110:
---------------------------------------

I will work on this tomorrow and provide a patch. I will also update the patch in LUCENE-1606 to move the initial seek out of the ctor (it's easy, see below).

The setEnum method should be renamed to something like setInitialTermRef(). The default impl of next() would then seek to that term on the first call; if no initial term is set, it does not seek and simply iterates all terms of the field.


[jira] Updated: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2110:
----------------------------------

    Attachment: LUCENE-2110.patch

Here is my patch for this.

I rewrote the whole FilteredTermsEnum and made it natively support the seeking needed for NRQ and Automaton.

This initial patch is for review only, but all tests pass. I will try to modify Robert's patch as soon as he has provided me an updated patch for Automaton on the flex branch.

The enum works differently than before:
It is positioned before the first term (as it should be). seek() is no longer supported, as it is not needed for MTQ and is not implementable for internally seeking enums like NRQ or Automaton.

In the constructor you pass the index reader and the field name. As a TermsEnum can only iterate one field in flex, this is no limitation.

For non-seeking enums you can set the initial term to seek to with setInitialSeekTerm(TermRef) in the ctor. The rest of the enum then behaves as before.

For seeking enums like Automaton/NRQ you override a secondary iterator method, nextSeekTerm(), that returns the next TermRef the underlying iterator should seek to. This method is called when accept() returns END (and also on the first next() call, of course). The default impl of this method returns the initial seek term (as explained above) once and then null.

Everything else is in the javadocs.
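
For readers without the patch at hand, the control flow described above can be sketched roughly as follows. This is a toy model, not the patch: terms are plain Strings in a TreeSet instead of TermRefs from an index reader, and the class names FilteredEnumSketch and PrefixEnumSketch are hypothetical. Only setInitialSeekTerm(), nextSeekTerm(), accept() and the END status are taken from the description; everything else is an assumption.

```java
import java.util.TreeSet;

/** Toy stand-in for a filtered enum over one field's sorted terms (not Lucene code). */
abstract class FilteredEnumSketch {
    enum AcceptStatus { YES, NO, END }

    private final TreeSet<String> dict;     // assumed stand-in for the terms dict
    private String initialSeekTerm = "";    // "" sorts before every term
    private boolean initialHandedOut = false;
    private boolean needSeek = true;        // the first next() always seeks
    private boolean done = false;
    private String current = null;          // unpositioned until next()

    FilteredEnumSketch(TreeSet<String> dict) { this.dict = dict; }

    protected final void setInitialSeekTerm(String term) { initialSeekTerm = term; }

    /** Default: hand out the initial seek term once, then null. */
    protected String nextSeekTerm() {
        if (initialHandedOut) return null;
        initialHandedOut = true;
        return initialSeekTerm;
    }

    protected abstract AcceptStatus accept(String term);

    final String next() {
        while (!done) {
            String candidate;
            if (needSeek) {
                needSeek = false;
                String target = nextSeekTerm();
                candidate = (target == null) ? null : dict.ceiling(target);
            } else {
                candidate = dict.higher(current);
            }
            if (candidate == null) { done = true; break; } // nothing left
            current = candidate;
            switch (accept(candidate)) {
                case YES: return candidate;
                case NO:  break;                  // skip, keep iterating
                case END: needSeek = true; break; // ask for the next seek term
            }
        }
        return null;
    }
}

/** Non-seeking case: seek once to the prefix, stop at the first non-match. */
class PrefixEnumSketch extends FilteredEnumSketch {
    private final String prefix;

    PrefixEnumSketch(TreeSet<String> dict, String prefix) {
        super(dict);
        this.prefix = prefix;
        setInitialSeekTerm(prefix);
    }

    @Override
    protected AcceptStatus accept(String term) {
        return term.startsWith(prefix) ? AcceptStatus.YES : AcceptStatus.END;
    }
}
```

With the default nextSeekTerm(), an END from accept() leads to a null seek term and therefore ends the enumeration, which matches the "behaves as before" case. A seeking enum would override nextSeekTerm() to keep supplying targets.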


[jira] Assigned: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler reassigned LUCENE-2110:
-------------------------------------

    Assignee: Uwe Schindler


[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785990#action_12785990 ]

Robert Muir commented on LUCENE-2110:
-------------------------------------

Uwe, I will look at re-porting automaton to flex so you can test this. (now it has good tests and sort order/unicode crap is fixed and they should all pass).



[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785996#action_12785996 ]

Michael McCandless commented on LUCENE-2110:
--------------------------------------------

This is a great improvement Uwe... I like it.

Is an MTQ allowed to return nextSeekTerm's out of order?  (I know NRQ/automaton don't need to do so, but, if it's fine we should maybe call that out in the javadocs...).  Though, FilteredTermsEnum, being a "TermsEnum", is "supposed" to return terms in getTermComparator() order... however its consumers (the rewrite methods for MTQ) usually don't in fact care.  Hmm I wonder if it should even subclass TermsEnum?  It doesn't seek and it's free to return terms in a different order...


[jira] Updated: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2110:
----------------------------------

    Attachment: LUCENE-2110.patch

Updated patch, also incorporating the changes needed for SingleTermsEnum to make it work with the new API. Now it is at least a five-liner :-)

I also fixed a place in TermRangeTermsEnum that used a method call instead of the parameter. Also added Mike's comment. In my opinion, we should keep it as a TermsEnum, even when seeking does not work, which is documented. In my code I often use PrefixTerm(s)Enum for autocomplete cases (works well), and there it is only handled as a Term(s)Enum for iterating, which makes it simpler to reuse code working on Term(s)Enums. Also made some members final; I forgot this while restructuring the code.

What I forgot to mention: I made the abstract methods in FilteredTermsEnum also throw IOException, so that subclasses doing strange things would still compile.


[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786072#action_12786072 ]

Michael McCandless commented on LUCENE-2110:
--------------------------------------------

bq. In my opinion, we should keep it as TermsEnum, even when seeking does not work, which is documented

OK, let's keep it as subclassing TermsEnum.  Maybe we should relax the docs for TermsEnum to state that each subclass determines order.  Nothing in TermsEnum itself requires a particular order.


[jira] Updated: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2110:
----------------------------------

    Attachment: LUCENE-2110.patch

New patch with the attribute support of LUCENE-2109.

- Also fixes a bug in the BW compatibility layer of MTQ (if clause wrong).
- Some code cleanup in FilteredTermsEnum (now easier to read, as next() and nextSeekTerm() are complicated).
- Added EmptyTermsEnum for shortcuts (used by NRQ and TRQ on inverse ranges). This enum never does any disk I/O on the terms dict; it is just empty. EmptyTermsEnum again supports seeking (although it is a subclass of FilteredTermsEnum), but that is simple there: it just returns END :-)
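
As a rough illustration of the EmptyTermsEnum idea, the following toy class never touches any dictionary and answers every seek with END. The class name EmptyEnumSketch, the String-based signature, and the three SeekStatus values are assumptions for this sketch, not the patch's code.

```java
/** Toy model of an always-empty enum (hypothetical names, not the patch's code). */
class EmptyEnumSketch {
    enum SeekStatus { FOUND, NOT_FOUND, END }

    /** There is never a next term, and no dictionary lookup is needed. */
    String next() {
        return null;
    }

    /** Seeking is trivially supported: every target is past the end. */
    SeekStatus seek(String target) {
        return SeekStatus.END;
    }
}
```

A query that detects an inverse range (lower bound above upper bound) could return such an enum up front instead of opening the real terms dict.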

I will now port Automaton and will provide a combined patch there.


[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786411#action_12786411 ]

Mark Miller commented on LUCENE-2110:
-------------------------------------

Hey Uwe, since you're editing this code anyway, wanna add a comment fix for the reference to TermInfo here?

{code}
+          // Loading the TermInfo from the terms dict here
+          // should not be costly, because 1) the
+          // query/filter will load the TermInfo when it
+          // runs, and 2) the terms dict has a cache:
{code}


[jira] Updated: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2110:
----------------------------------

    Attachment: LUCENE-2110.patch

After porting Automaton, I realized that the seeking code should be changed and made a little more flexible.

accept() can now return five AcceptStatus values:
- YES, NO: accept / do not accept the term and go forward. This is the simple linear case that iterates until the end and filters terms (FuzzyQuery case, linear Automaton).
- YES_AND_SEEK, NO_AND_SEEK: the same as above, but instead of simply going forward, nextSeekTerm() is called to retrieve a new term to seek to. This method is now supposed to always return a greater term than before; if not, the enumeration can end too early (see below).
- END: end the enumeration, no seeking. This status is used by TermRangeQuery and PrefixQuery as before.

nextSeekTerm() should always return a greater term than the last one before seeking. This is asserted by NRQ. It is not fatal to violate this, but after that the enum is no longer correctly sorted. Also, if the consumer reaches the last term of the underlying enum, calling next() will end the enumeration, and so further terms in the nextSeekTerm() iteration will not be consulted (the same happens when END is returned in accept(), of course).

If nextSeekTerm() returns null, the enumeration also ends, so it is not required to return AcceptStatus.END instead of X_AND_SEEK.
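
The five statuses can be sketched with a small toy loop. Again this is not the patch: terms are Strings in a TreeSet, the filter is a hypothetical list of inclusive [lo, hi] ranges (loosely in the spirit of NRQ), and the queue of pending ranges plays the role of nextSeekTerm(). Only the status names and the rules for seeking and ending come from the description above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.TreeSet;

/** Toy model of the five accept statuses; not code from the patch. */
class AcceptStatusSketch {
    enum AcceptStatus { YES, NO, YES_AND_SEEK, NO_AND_SEEK, END }

    /** Collects all terms of dict that fall into the given inclusive [lo, hi] ranges. */
    static List<String> enumerate(TreeSet<String> dict, String[][] ranges) {
        Deque<String[]> pending = new ArrayDeque<>(Arrays.asList(ranges));
        List<String> out = new ArrayList<>();
        String[] range = null;
        String current = null;
        boolean seek = true; // the first step always asks for a seek term
        while (true) {
            String candidate;
            if (seek) {
                range = pending.poll();   // plays the role of nextSeekTerm()
                if (range == null) break; // a null seek term ends the enumeration
                candidate = dict.ceiling(range[0]);
                seek = false;
            } else {
                candidate = dict.higher(current);
            }
            if (candidate == null) break; // underlying terms exhausted
            current = candidate;
            switch (accept(candidate, range)) {
                case YES:          out.add(candidate); break;
                case NO:           break;
                case YES_AND_SEEK: out.add(candidate); seek = true; break;
                case NO_AND_SEEK:  seek = true; break;
                case END:          return out;
            }
        }
        return out;
    }

    /** This filter only needs three of the statuses; NO and END are handled by the loop anyway. */
    static AcceptStatus accept(String term, String[] range) {
        int cmp = term.compareTo(range[1]);
        if (cmp < 0) return AcceptStatus.YES;
        if (cmp == 0) return AcceptStatus.YES_AND_SEEK; // upper bound hit: accept, then jump
        return AcceptStatus.NO_AND_SEEK;                // past the range: reject, then jump
    }
}
```

With terms a..g and the ranges [b, c], [e, f], the loop visits b, c, e, f and skips d by seeking. With the out-of-order ranges [f, z], [b, c], the underlying terms run out after g and the second range is never consulted, which is the "ends too early" effect mentioned above.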


[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786426#action_12786426 ]

Uwe Schindler commented on LUCENE-2110:
---------------------------------------

Mark: I do not know what you are talking about (sorry, my brain is fuming after automaton).


[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786433#action_12786433 ]

Mark Miller commented on LUCENE-2110:
-------------------------------------

No problem, we can get it after. It's not really related; I just figured, since you were patching here anyway and I happened to notice it while taking a look at the patch:

TermInfo is no longer used in flex, but it's referenced in the above comment, in MTQ.




[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786439#action_12786439 ]

Robert Muir commented on LUCENE-2110:
-------------------------------------

Uwe, I really like what you have done here (as commented on LUCENE-1606)

Seeking around in a filteredtermsenum is even simpler here. (in my opinion, this thing is very tricky with trunk and it is good to simplify)



[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786451#action_12786451 ]

Michael McCandless commented on LUCENE-2110:
--------------------------------------------

bq. nextSeekTerm() should always return a greater term than the last one before seeking.

Uwe, why was this constraint needed?  What goes wrong if we allow terms to be returned out of order?  The consumers of this (MTQ's rewrite methods) don't mind if terms are out of order, right?


[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786452#action_12786452 ]

Uwe Schindler commented on LUCENE-2110:
---------------------------------------

It will work (theoretically) but can fail:
if you seek to the last term and accept it, the next call to next() will end the enum, even if there may be more positions to seek. You cannot rely on all seek terms being visited. Because of that it *should* be forward-only; if you do it differently, you must know what you are doing.
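
To illustrate the failure mode described above, here is a minimal toy sketch. It is not the flex-branch API: the class ForwardOnlySeekSketch, the method collect(), the TreeSet standing in for the underlying TermsEnum, and the "same first letter" accept rule are all assumptions made up for this example. It only shows why seek terms handed out in descending order can be skipped when the enum ends as soon as the underlying terms are exhausted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.TreeSet;

/** Toy model only; names and accept rule are hypothetical, not Lucene's API. */
public class ForwardOnlySeekSketch {

    /**
     * Seeks to each target in turn and accepts the terms that share the
     * target's first letter. The loop ends as soon as the underlying
     * (sorted) term set is exhausted, even if targets are still pending.
     */
    public static List<String> collect(TreeSet<String> index, List<String> seekTargets) {
        Deque<String> pending = new ArrayDeque<>(seekTargets);
        List<String> accepted = new ArrayList<>();
        String target = pending.poll();
        // "seek": position on the first indexed term >= target
        String current = (target == null) ? null : index.ceiling(target);
        while (current != null) {
            if (current.charAt(0) == target.charAt(0)) {
                accepted.add(current);
                current = index.higher(current); // "next()"
            } else {
                // left the accepted range: ask for the next seek target
                target = pending.poll();
                current = (target == null) ? null : index.ceiling(target);
            }
        }
        // current == null: underlying terms exhausted, pending targets are dropped
        return accepted;
    }
}
```

With the terms apple, avocado, banana, cherry, the ascending targets "a", "c" yield apple, avocado, cherry. The descending targets "c", "a" yield only cherry, because next() after the last term ends the loop before "a" is ever requested.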


[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786457#action_12786457 ]

Uwe Schindler commented on LUCENE-2110:
---------------------------------------

I have a solution for this problem: if the end of the enum is reached, it just asks for a new term, as if seek==true (that is how it was before). But nextPrefixTerm() gets the information that the end was already reached and *could* return null then. This is important for automaton, because otherwise it would loop endlessly (it would produce terms and terms and terms... in nextSeekTerm).


[jira] Updated: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2110:
----------------------------------

    Attachment: LUCENE-2110.patch

Attached is a patch that allows the TermsEnum to go backwards and not break if the end of the underlying TermsEnum is reached after next() or seek().

The method nextSeekTerm() gets a boolean indicating whether the underlying TermsEnum is exhausted. Enums that work in order can then simply return null to break iteration, but they are free to reposition to an earlier term.
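
A rough sketch of that contract, under stated assumptions: this is a toy model and not the patch itself. The class ExhaustedFlagSketch, the SeekTermSource interface, the helpers inOrder() and anyOrder(), the TreeSet standing in for the underlying TermsEnum, and the "same first letter" accept rule are all hypothetical. Only the idea of passing an "exhausted" boolean to nextSeekTerm() comes from the description above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

/** Toy model only; names and accept rule are hypothetical, not Lucene's API. */
public class ExhaustedFlagSketch {

    /** Hypothetical hook: returns the next term to seek to, or null to stop. */
    public interface SeekTermSource {
        String nextSeekTerm(boolean exhausted);
    }

    /** Drives the source; the loop ends only when the source returns null. */
    public static List<String> collect(TreeSet<String> index, SeekTermSource source) {
        List<String> accepted = new ArrayList<>();
        String target = source.nextSeekTerm(false);
        while (target != null) {
            String current = index.ceiling(target); // "seek"
            while (current != null && current.charAt(0) == target.charAt(0)) {
                accepted.add(current);
                current = index.higher(current);    // "next()"
            }
            // current == null means the underlying terms are exhausted
            target = source.nextSeekTerm(current == null);
        }
        return accepted;
    }

    /** In-order source: returns null as soon as it is told the end was reached. */
    public static SeekTermSource inOrder(List<String> targets) {
        Iterator<String> it = targets.iterator();
        return exhausted -> (exhausted || !it.hasNext()) ? null : it.next();
    }

    /** Any-order source: ignores the flag and may reposition to an earlier term. */
    public static SeekTermSource anyOrder(List<String> targets) {
        Iterator<String> it = targets.iterator();
        return exhausted -> it.hasNext() ? it.next() : null;
    }
}
```

With the terms apple, avocado, banana, cherry, anyOrder("c", "a") now collects cherry, apple, avocado instead of stopping after cherry, while inOrder("a", "c", "d") stops after cherry without ever asking for "d". The second case is the one that matters for a source that could otherwise keep producing seek terms forever.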


[jira] Updated: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2110:
----------------------------------

    Attachment:     (was: LUCENE-2110.patch)


[jira] Updated: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

Igor Motov (Jira)
In reply to this post by Igor Motov (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2110:
----------------------------------

    Attachment: LUCENE-2110.patch

Fixed patch. I have to stop for today.
