[jira] Created: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

classic Classic list List threaded Threaded
39 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

ASF GitHub Bot (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786480#action_12786480 ]

Uwe Schindler commented on LUCENE-2110:
---------------------------------------

Stop everything I get a collaps!!!!! Again wrong patch.

> Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> Also document that not all FilteredTermsEnums may implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Updated: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2110:
----------------------------------

    Attachment:     (was: LUCENE-2110.patch)

> Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> Also document that not all FilteredTermsEnums may implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Updated: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2110:
----------------------------------

    Attachment: LUCENE-2110.patch

Now the final one.

I somehow need a test enum which does very strange things like seeking forward and backwards and returning all strange stati.

Will think about one tomorrow.

> Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> Also document that not all FilteredTermsEnums may implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786486#action_12786486 ]

Uwe Schindler commented on LUCENE-2110:
---------------------------------------

Robert and me analyzed the latest change. It is so complex and I am really not sure, if we should do this. It is impossible to maintain this.

We should enforce only seeking forwards (even if MTQ could accept terms out of order). Violating TermsEnums order stupid, so we should use the patch before. NRQ and also Automaton enforce stepping forwards only.

Mike?

> Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> Also document that not all FilteredTermsEnums may implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786487#action_12786487 ]

Robert Muir commented on LUCENE-2110:
-------------------------------------

yeah compared to the last patch, the backwards seeking makes the code more complex in my opinion.

i do not understand why a MTQ would need to backwards seek? Can we say instead if you want to do such a thing with flexible indexing, that the way is to instead define custom sort order in your codec?



> Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> Also document that not all FilteredTermsEnums may implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786494#action_12786494 ]

Uwe Schindler commented on LUCENE-2110:
---------------------------------------

+1, I reverted it here completely. It is not senseful to support unordered filtered enums. If somebody wants to implement that he should do it otherwise by overriding next() himself and not use nextSeekTerm() and accept().

> Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> Also document that not all FilteredTermsEnums may implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Updated: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2110:
----------------------------------

    Attachment:     (was: LUCENE-2110.patch)

> Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> Also document that not all FilteredTermsEnums may implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786587#action_12786587 ]

Michael McCandless commented on LUCENE-2110:
--------------------------------------------

bq. We should enforce only seeking forwards (even if MTQ could accept terms out of order). Violating TermsEnums order stupid, so we should use the patch before. NRQ and also Automaton enforce stepping forwards only.

OK, I hadn't realized this was adding so much complexity to the code, so I agree let's revert it and require FTE to always move forward according to getTermComparator its actualEnum.  We don't have a need for this today, anyway.  Design for today ;)

> Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> Also document that not all FilteredTermsEnums may implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786610#action_12786610 ]

Uwe Schindler commented on LUCENE-2110:
---------------------------------------

From our IRC chat between Mike and me:
{quote}
[10:19] ThetaPhi_: mike: any further comments for 2110?
[10:19] mikemccand: am just trying to look at it -- waiting on a sloooooow svn up.
[10:19] ThetaPhi_: this out of order is not trivial and makes code for automaton and NRQ more complicated
[10:19] mikemccand: the change is a great step forward
[10:19] mikemccand: yeah forget it
[10:19] ThetaPhi_: yes
[10:19] ThetaPhi_: you understood the problem
[10:19] mikemccand: i had thought it was nearly free
[10:19] mikemccand: i don't fully understand the problem
[10:19] mikemccand: impossible to keep up with you!!
[10:20] ThetaPhi_: when the underlying enum is exhausted (I was exhausted yesterday, too), it stops processing, even if there are more nextSeelkTerms
[10:20] mikemccand: yes I saw your exhausted pun ;)
[10:20] mikemccand: wait -- if a TermsEnum becomes exhaused, you're unable to seek it again?  is that the issue?
[10:20] ThetaPhi_: waht you could do is, that after the underlying enum is exhausted, that you call nextSeekTerm and reposition
[10:21] ThetaPhi_: yes
[10:21] ThetaPhi_: you could seek it
[10:21] ThetaPhi_: and this worked at the beginning (first patch without the additional enum constants)
[10:21] ThetaPhi_: what it did was: when the underlying enum was exhausted it simply called nextSeekTerm and seeked
[10:22] mikemccand: so TermsEnum API allows seeks after next()'ing to exhaustion, right?  (I think it should but maybe it' doesn't somewhere?)
[10:22] ThetaPhi_: which was ok
[10:22] ThetaPhi_: mikemccand: your enums works perfect
[10:22] ThetaPhi_: they can be seeked everytime even when exhausted
[10:22] mikemccand: ok
[10:23] ThetaPhi_: the problem is logic in filteredtermsenum
[10:23] mikemccand: ok
[10:23] ThetaPhi_: it gets very complicated when you want to support two things:
[10:23] ThetaPhi_: a) seeking on request (NO_AND_SEEK, YES_AND_SEEK)
[10:24] ThetaPhi_: b) and want to consume all nextSeekTerms()
[10:24] ThetaPhi_: if somebody seeks after _AND_SEEK to e.g. the last term of the enum
[10:24] ThetaPhi_: in the current impl it will read that term
[10:25] ThetaPhi_: call termCompare and so on
[10:25] ThetaPhi_: if this accepts the term or not accepts the term
[10:25] ThetaPhi_: it will call next() on the underlying enum again
[10:25] ThetaPhi_: which then returns null
[10:26] ThetaPhi_: with the current patch it will then also end the filteredtermsenum
[10:26] mikemccand: ok -- so the termination logic inside FTE.next() got really hairy if on getting null from the actualEnum you had to consider going back for another seek term?
[10:26] ThetaPhi_: yes
[10:26] ThetaPhi_: i tried it yesterday
[10:26] mikemccand: we certainly don't need to support thist oday
[10:26] ThetaPhi_: and I had no good test
[10:27] ThetaPhi_: yes exactly
[10:27] mikemccand: if somehow this ever becomes needed then we add it then
[10:27] mikemccand: the new FTE looks great
[10:27] ThetaPhi_: the idea was to provide a param to nextSeekTerm which denotes if the underlying enum ended
[10:27] mikemccand: NRQ is soooo much simpler :)
[10:28] ThetaPhi_: the idea was to call even on exit nextSeekTerm(true), where triue meant exhausted
[10:28] mikemccand: i see -- so it involved different API
[10:28] ThetaPhi_: TEnums like NRQ or automaton only going forward then know, not to provide seek terms again
[10:28] ThetaPhi_: this was important for Automaton not to fall into endless loops
[10:29] mikemccand: this is hairy stuff :)
[10:29] ThetaPhi_: and good for NRQ to not provide any further terms
[10:30] ThetaPhi_: when I tried to implement there was always a problem and you were not able to correctly define waht should happen then
[10:30] mikemccand: how come the switch in FTE.next doesn't have "case NO" on the return from accept(term)?
[10:31] ThetaPhi_: thats not needed it just falls through and loops to next term
[10:31] ThetaPhi_: you could add it as NO: break;
[10:31] * mikemccand ahh got it -- maybe just add comment saying so
[10:31] ThetaPhi_: good idea, will do
[10:32] ThetaPhi_: the NO_AND_seek case is the same it just says doSeek = true
[10:32] ThetaPhi_: and for yes it returns the term in both cases, but records doSeek for the next time next() is called
[10:32] mikemccand: i wonder if we could simply add a seek term to AcceptStatus?
[10:33] mikemccand: vs calling the separate nextSeekTerm method
[10:33] ThetaPhi_: enum's contents are final
[10:33] mikemccand: not sure it'd be better... just wondering
[10:33] ThetaPhi_: because each constant is a singleton, they should be unmodifiable
[10:34] mikemccand: ie, fix AcceptStatus to be like TermsEnum.SeekResult
[10:34] mikemccand: so it returns status, but also an optional seekTerm which if null means no seeking
[10:34] ThetaPhi_: and how to do this, not with an enum constant
[10:34] ThetaPhi_: because they are constants
[10:35] ThetaPhi_: and createing a new object on each accept call seems not ideal
[10:35] mikemccand: sorry, i was confused -- SeekStatus used to be a class that had two attrs -- a TermRef, and the enum; i simplified that a while ago
[10:36] ThetaPhi_: aaaah
[10:36] ThetaPhi_: now its an enum
[10:36] ThetaPhi_: i do not think its a ood idea to return new objects
[10:36] mikemccand: right.  for a while I had no TermsEnum.term()
[10:36] mikemccand: the enums would reuse their status object
[10:36] mikemccand: ie API would require that this is fine
[10:37] ThetaPhi_: but enums are singleton
[10:37] ThetaPhi_: for the whole JVM
[10:37] mikemccand: if we did that... then we could go back to YES/NO/END, and, and seekTerm is then orthogonal
[10:37] ThetaPhi_: ok
[10:37] ThetaPhi_: so a protected setSeekTerm
[10:38] mikemccand: we'd make a new "AcceptResult" class.  has "AcceptStatus status" member, and also "TermRef seekTerm"
[10:38] ThetaPhi_: then we are back at the state before, we should ask robert
[10:38] mikemccand: ok
[10:38] ThetaPhi_: for him the separation of accept and nextSeekTerm was good
[10:38] mikemccand: ahh i see
[10:39] ThetaPhi_: but he could do the calculation of the next sring in accept, too
[10:39] mikemccand: though, can't he simply emulate?  ie, internally call a private nextSeekTerm, stick it on the returned AcceptResult, and return that?
[10:39] mikemccand: right
[10:39] ThetaPhi_: telephone...
[10:39] mikemccand: do we ever call nextSeekTerm, unless AcceptStatus was XXX_AND_SEEK?
[10:39] mikemccand: hah real world interrupts
[10:40] ThetaPhi_: thats what is done currentl
[10:40] ThetaPhi_: brb
[10:41] mikemccand: man all of infra seems slow right now... jira giving me intermittent Internal Server Error... svn really slow
[10:41] mikemccand: git is tempting
[11:04] *** jwtoddii has joined #lucene.
[11:15] *** jwtoddii has signed off IRC ().
[11:15] ThetaPhi_: re
[11:15] mikemccand: hello
[11:15] ThetaPhi_: for me infra works normal
[11:16] mikemccand: hmmm
[11:16] ThetaPhi_: but maybe because i am on the euorpe svn
[11:16] mikemccand: you use the eu mirrors?
[11:16] mikemccand: ok
[11:16] ThetaPhi_: but for jira i do not know
[11:16] mikemccand: jira is on/off for me now
[11:16] ThetaPhi_: do not think there is a mirror
[11:17] ThetaPhi_: [10:39] mikemccand: do we ever call nextSeekTerm, unless AcceptStatus was XXX_AND_SEEK?
[11:17] ThetaPhi_: no
[11:18] mikemccand: ok
[11:18] ThetaPhi_: and doSeek is always reset to false whenever a seek occured
[11:19] ThetaPhi_: the global doSeek is just for the YES_AND_SEEK case
[11:19] ThetaPhi_: because it exits the loop and returns the term
[11:19] ThetaPhi_: so on the next call to next() doSeek is true and we seek
[11:20] mikemccand: got it
[11:21] ThetaPhi_: an idea about the nextSeekTerm:
[11:21] ThetaPhi_: we add a protected setSeekTerm(TermRef)
[11:21] ThetaPhi_: this sets a private var
[11:22] ThetaPhi_: accept can set this and return YES or no wahtever
[11:22] ThetaPhi_: in the case of END it is ignored
[11:22] ThetaPhi_: but when next() is then called again, if this internal seek varaible is != null it seeks, else it goes forward with next
[11:22] ThetaPhi_: and after seek it is set to null
[11:23] ThetaPhi_: and we could add an assert inside this method to check if only seeking forward
[11:24] ThetaPhi_: to inform implementors of errors
[11:24] ThetaPhi_: assert termComp.compare(seekTerm, term()) > 0;
[11:24] mikemccand: why not simply return the optional seek term along w/ AcceptStatus?  it'd require no additional methods, and makes it clear that seeking part of accepting.
[11:25] mikemccand: ie the existince of new methods in the api (setSeekTerm, nextSeekTerm) enrich the api -- make you wonder when you can call them.  eg can ctor call setSeekTerm?
[11:25] ThetaPhi_: because accept status is an enum constant which is final, singleton and
[11:25] ThetaPhi_: yes ctor can do it at the moment
[11:25] ThetaPhi_: its already there its only called setinitialseekterm
[11:26] ThetaPhi_: (see docs)
[11:26] mikemccand: i mean make a new class (AcceptResult), like i described.  it contains an AcceptResult.Status (the enum), and a TermRef.
[11:26] ThetaPhi_: the problem is lots of object creation
[11:26] mikemccand: there would be no object creation -- we'd reuse?
[11:26] ThetaPhi_: in my opinion setSeekTerm is simplier to use
[11:27] ThetaPhi_: it would be the same like the reuse case
[11:27] ThetaPhi_: and setseekterm can also be used in ctor
[11:27] mikemccand: yeah the fact that ctor needs to set initial term does seem to require the extra method
[11:28] mikemccand: though i also don't like side-effect methods -- you change the internal state of the class, vs returning the seek term
[11:28] mikemccand: ie it makes the api stateful, which except for the initial case, is overkill
[11:29] ThetaPhi_: its an iterator it always has an internal state
[11:29] ThetaPhi_: i understand your problem
[11:29] ThetaPhi_: you call in a method that should only accept something a method that changes state
[11:30] mikemccand: the internal state is handled by FTE; my subclass is mostly stateless, if i return the seek term
[11:30] ThetaPhi_: so you cannot call it from outside (which would not wrok, because it is protected)
[11:30] ThetaPhi_: NRQ also has a state and also automaton
[11:31] ThetaPhi_: in NRQ it is the linkedlist with seek terms
[11:31] ThetaPhi_: in Automaton the nextString()
[11:31] mikemccand: right, they have their own state because of how they iterate.  it's just that seek term need not be stateful.
[11:32] ThetaPhi_: i think we should ask robert, at the moment i would know how to change his code to support both cases
[11:33] mikemccand: in fact, instead of setInitialSeekTerm, could we have getInitialSeekTerm?  ie FTE would invoke that once on start
[11:33] mikemccand: ok :)
[11:33] mikemccand: automaton is scary
[11:33] mikemccand: and, powerful
[11:33] ThetaPhi_: the getInitialSeek term in the current api is hidden behind the iterator
[11:34] ThetaPhi_: setInitialSeekTerm for the ctor is just a convenience
[11:34] ThetaPhi_: to prevent subclasses like prefixtermsenum from overrideing and implementing the singleton itertor
[11:34] * mikemccand w/ getInitialSeekTerm, there is no sneaky shared state w/ FTE.  i mean, FTE keeps track of its state, and each subclass tracks its own state.  subclassing wouldn't even be necessary anymore -- one could provdie a standonalone TermsEnumFilter, that just has getInitialSeekTerm and accept
[11:34] ThetaPhi_: (see docs)
[11:36] mikemccand: in general I don't like subclassing APIs -- it's a bigger surface area (how child works w/ parent, stateful methods, when can i call each method, etc.)
[11:36] ThetaPhi_: we could remove that also in the current api
[11:37] ThetaPhi_: only abstract classes should be subclassable
[11:37] ThetaPhi_: thats asked very often
[11:37] mikemccand: gonna be a cold run this AM -- 28F out there
[11:37] ThetaPhi_: *calculating*
[11:37] mikemccand: :)
[11:37] mikemccand: invert 32 + 9/5 * C
[11:38] ThetaPhi_: or ask google
[11:38] ThetaPhi_: -2.22 °C
[11:39] ThetaPhi_: (Link: http://www.google.de/search?hl=de&safe=off&q=28+fahrenheit+in+celsius&meta=&aq=f&oq=)http://www.google.de/search?hl=de&safe=off&q=28+fahrenheit+in+celsius&meta=&aq=f&oq=
[11:39] mikemccand: if i jump in and see that, to use FTE i only have to implement to an interface (TermsEnumFilter) that has getInitialTerm/accept... I think that's more approachable than figuring out the relationship  (methods, state) to an abstract parent class
[11:39] mikemccand: nice
[11:40] mikemccand: ugh, zillions of conflicts on backporting thread safe spellchecker
[11:40] ThetaPhi_: if the abstract parent class makes all other methods final its the same
[11:41] ThetaPhi_: but ok, you could change MTQ to just return the interface
[11:41] mikemccand: not really the same -- you have to define when each overridden method is allowed/supposed to invoke the methods from the parent
[11:42] mikemccand: once (if) we make the interaction stateless,we suddenly have the freedom to make it a separate interface...
[11:43] ThetaPhi_: you mean calling next() inside accept() -> loops forever
[11:43] mikemccand: yeah :(
[11:43] mikemccand: of course... that freedom of the orig FilteredTermEnum is what made NRQ sneakiness possible in the first place ;)
[11:44] ThetaPhi_: (which was a hack, you have seen the comment: something like: this relys on how setEnum() works, if this changes in the superclass this enum will no longer work)
[11:45] ThetaPhi_: this was a comment before i removed the recursion in one of this issues
[11:45] mikemccand: right -- but it was the right hting to do at the time
[11:45] mikemccand: sure was tricky to undertsand :)
[11:45] ThetaPhi_: (ore remove the recursion, like it is now in trunk)
[11:45] mikemccand: you were bound by FTE
[11:45] mikemccand: yes, that's better
[11:45] mikemccand: but i like separate interface best -- it removes all shared state -- one thing describes what's filtered, the other implements according to that
[11:46] mikemccand: then MTQ could almost simply accept a TermsEnum.  the only difference is difference() -- ha
[11:46] mikemccand: it's only fuzzy that uses difference() right?
[11:46] ThetaPhi_: yes
[11:46] ThetaPhi_: i was thinking about that
[11:47] ThetaPhi_: because a empty prefix filter could simply return the termsenum
[11:47] mikemccand: ahh yes nice opto
[11:47] ThetaPhi_: PrefixQuery("")
[11:47] ThetaPhi_: would match all documents that have at least any term
[11:47] mikemccand: actually how come contrib/queries' TermsFilter isn't a query filter wrapper around an MTQ?
[11:47] ThetaPhi_: in automaton we need this for the catch all .* case
[11:47] mikemccand: ahh
[11:48] ThetaPhi_: at the moment it generates a PrefixTermsEnum("")
[11:48] mikemccand: so maybe AcceptResult has a float difference?
[11:48] ThetaPhi_: which is slower than just returning the TermsEnum of the reader, just because of difference
[11:49] ThetaPhi_: because it calls startsWith for each term (which in fact does nothing because termref.length==0
[11:49] mikemccand: ugh
[11:49] ThetaPhi_: so not that bad
[11:49] mikemccand: much better to simple return .terms() for field.  or, rewrite to MatchAllDocs
[11:49] ThetaPhi_: or a termrangequery(null,null) (often used in solr)
[11:49] ThetaPhi_: no
[11:49] ThetaPhi_: MatchAllDocs is different
[11:50] ThetaPhi_: because the Prefix("") case only returns document that have at least one term of that field
[11:50] mikemccand: ahhh yes
[11:50] ThetaPhi_: this is needed for facetting i think in Solr
[11:50] ThetaPhi_: not sure
[11:50] mikemccand: got it
[11:50] ThetaPhi_: in NRQ its fast, it just enumerates all low-prec terms
[11:51] ThetaPhi_: 16 for precStep=4
[11:51] mikemccand: nice
[11:51] mikemccand: NRQ is a great step forward for lucene
[11:51] mikemccand: would be nice if MTQ could simply accept a TermsEnum
[11:52] mikemccand: problem is difference()
[11:52] ThetaPhi_: yes
[11:52] ThetaPhi_: and is only used by fuzzy
[11:52] mikemccand: yes, annoying for just that one case
[11:52] mikemccand: contrib's TermsFilter really ought to be an MTQ
[11:52] ThetaPhi_: because robert asked for a alltermsenum, which is stupied
[11:53] mikemccand: yeah
[11:53] ThetaPhi_: i only said, user prefix("") whoich is ok, the overhead is minimal, but in this case it would be zero
[11:53] mikemccand: we could allow MTQ to accept either a TersmsEnum, or, a TermsEnumFilter (this new stateless interface, with AcceptResult also having a float difference field)
[11:53] ThetaPhi_: aaaah
[11:54] ThetaPhi_: but that looks like attributesource
[11:54] ThetaPhi_: differenceattribute
[11:54] mikemccand: yeah that's true
[11:54] ThetaPhi_: termsenums are already with attributes
[11:54] mikemccand: curious.  so then MTQ could accept only TermsEnum, but, ask for its attrs, and if that's non-null, as for differenceattr, and if htat's non null, use it
[11:55] ThetaPhi_: fuzzyenum just add this attribute
[11:55] mikemccand: i like that!
[11:55] ThetaPhi_: and fuzzyquery requests it
[11:55] mikemccand: and MTQ respects it too
[11:55] ThetaPhi_: and the default of this attribute is 1.0
[11:55] mikemccand: ye
[11:55] mikemccand: yes
[11:55] ThetaPhi_: so mtq just asks always for this attribute in scoring boolean rewrite
[11:56] mikemccand: yes
[11:56] mikemccand: and, MTQ is much simplified to accept a TermsEnum for its terms
[11:56] ThetaPhi_: yes
[11:56] mikemccand: and we can make a stateless API for filtering a TermsEnum
[11:57] ThetaPhi_: by the way, I did not check ever<ything in flex
[11:57] mikemccand: that's ok -- it's an immense number of changes!
[11:57] ThetaPhi_: but if you wrap an enum, you must override attributes() to return the attributes of the delegate
[11:57] mikemccand: i still keep finding bugs in the emulation layers.
[11:57] mikemccand: hmm you're right
[11:58] ThetaPhi_: so UnionTermsEnum or like so must override attributes() and call delegate.attributes()
[11:58] mikemccand: or, fuzzy query could be the only one that does htis
[11:58] ThetaPhi_: yes
[11:58] ThetaPhi_: no
[11:58] ThetaPhi_: let it in MTQ
[11:58] ThetaPhi_: we yesterday talked, normally rewrite should be final in MTQ
[11:58] mikemccand: that sounds good
[11:58] ThetaPhi_: you should only change behaviour in getEnum
[11:58] mikemccand: ok
[11:59] ThetaPhi_: in trunk we currently rewrite WildCardQuery to PrefixQuery which is itsself an MTQ
[11:59] ThetaPhi_: totally useless
[11:59] ThetaPhi_: just return PrefixEnum in getEnum
[11:59] mikemccand: ahh right
[12:00] ThetaPhi_: in flex i changed
[12:00] ThetaPhi_: the problem are bw tests in trunk, because they check the rewritten thing, but we could simply remove the tests
[12:01] mikemccand: tests shouldn't check internals like that
[12:01] ThetaPhi_: yes
[12:01] ThetaPhi_: or these checks should be marked
[12:01] mikemccand: yes
[12:01] ThetaPhi_: in Junit4 with an annotation
[12:01] mikemccand: ahh what annotation?
[12:01] ThetaPhi_: and the bw tests only run tests without that annotation
[12:02] mikemccand: that sounds great
[12:02] ThetaPhi_: the problem may then still compilation
[12:03] ThetaPhi_: if it checks package protected fields and so on
[12:03] ThetaPhi_: brrr
[12:03] ThetaPhi_: we have such tests
[12:03] mikemccand: sigh.
[12:03] mikemccand: yes
[12:03] mikemccand: i remember committing them ;)
[12:03] *** mikemccand has left #lucene.
[12:03] ThetaPhi_: you always have to remove them
[12:22] *** mikemccand has joined #lucene.
[12:22] mikemccand: sorry, dropped off
[12:25] ThetaPhi_: ok i said
[12:25] ThetaPhi_: i will try out a little bit
[12:25] ThetaPhi_: and post in the automaton
[12:25] mikemccand: ok
[12:25] ThetaPhi_: its hard to always generate both pacthes
[12:26] ThetaPhi_: when we are happy, i will post a patch to 2110 and then we commit
[12:26] ThetaPhi_: in automaton there is more special cases to test
{quote}

> Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> Also document that not all FilteredTermsEnums may implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786613#action_12786613 ]

Robert Muir commented on LUCENE-2110:
-------------------------------------

Hey guys, in my opinion it would make it easier if we could somehow iterate in the flex branch under this issue.

Do we really need a monster patch right now that is 100% perfect or can we exploit having this branch available to make discussions and review easier?

For example, it seems everyone agrees the current patch here is a good "step forward".

> Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> Also document that not all FilteredTermsEnums may implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Updated: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2110:
----------------------------------

    Attachment: LUCENE-2110.patch

New patch that implements the Attributes implementation to power scoring BooleanQuery searches. It changes:

- MTQ.getTermsEnum now returns TermsEnum no longer a filtered one
- As in flex, MTQ can only work on one field (because TermsEnum is limited to one Field), it gets a getField() method (highlighter is happy) and a ctor to set this final field (protected). Removed all thoise methods from sub classes. Old classes that still use TermEnum and call the deprec default ctor, still work (because the field is returned in the FilteredTermEnum), but highligter gots no field then (null). Whne BW is removed, highlighter do not need to use FakeIndexReader anymore.
- WildcardQuery/PrefixQuery/TermRangeQuery was optimized to no longer rewrite, but just return the correct enum in getEnum/getTermsEnum(). The special cases for these queries are match all (directly return the enum from reader - new!!!, nothing EmptyTermsEnum, or SingleTermsEnum)
- Fixed Highlighter for MTQ.getField()
- FuzzyQuery now uses an attribute for the term difference: MTQ.TermDifferenceAttribute. For scoring booolean rewrite it is added by the consumer and keeps 1 per default, but fuzzy enum also adds it in its ctor and sets in accept()). Also the PQ boolean rewrite mode is now a RewriteMode.

I think thats all. Happy reviewing.

> Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> Also document that not all FilteredTermsEnums may implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786663#action_12786663 ]

Michael McCandless commented on LUCENE-2110:
--------------------------------------------

Patch looks good Uwe -- just TestWildcard.testPrefixTerm, on back-compat tests, that's failing.  This is a nice simplification of MTQs...

I'm wondering if we can further simplify the "seek" case in FilteredTermsEnum so that it's not a 2 step process (FTE.accept returns YES|NO_AND_SEEK, then, FTE.next calls nextSeekTerm), ie instead accept would return the new seek term in the returned status.  This would also make the interaction between FTE and its subclass stateless.  But let's take that up under a separate issue.  I think this one is ready to go into flex branch?

> Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> Also document that not all FilteredTermsEnums may implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Updated: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2110:
----------------------------------

    Attachment: LUCENE-2110.patch

New patch:
- better error message in NPE
- change TermDifferenceAttribute to BoostAttribute and also its method names.
- added Javadocs to FilteredTermEnum to deprec difference and explian that you should use BoostAttribute

I think this is committable :-)

> Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> Also document that not all FilteredTermsEnums may implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (LUCENE-2110) Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786705#action_12786705 ]

Uwe Schindler commented on LUCENE-2110:
---------------------------------------

I'll commit this soon after I added some javadocs to BoostAttribute.

> Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty()
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> Also document that not all FilteredTermsEnums may implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Resolved: (LUCENE-2110) Refactoring of FilteredTermsEnum and MultiTermQuery

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler resolved LUCENE-2110.
-----------------------------------

       Resolution: Fixed
    Lucene Fields: [New, Patch Available]  (was: [New])

Committed revision: 887779

> Refactoring of FilteredTermsEnum and MultiTermQuery
> ---------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> FilteredTermsEnums cannot implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.
> This issue changes some of the internal behaviour of MTQ and FilteredTermsEnum to allow also seeking in NRQ and Automaton (see comments below).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Updated: (LUCENE-2110) Refactoring of FilteredTermsEnum and MultiTermQuery

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2110:
----------------------------------

    Description:
FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
FilteredTermsEnums cannot implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.
This issue changes some of the internal behaviour of MTQ and FilteredTermsEnum to allow also seeking in NRQ and Automaton (see comments below).

  was:
FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
Also document that not all FilteredTermsEnums may implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.

        Summary: Refactoring of FilteredTermsEnum and MultiTermQuery  (was: Change FilteredTermsEnum to work like Iterator, so it is not positioned and next() must be always called first. Remove empty())

> Refactoring of FilteredTermsEnum and MultiTermQuery
> ---------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> FilteredTermsEnums cannot implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.
> This issue changes some of the internal behaviour of MTQ and FilteredTermsEnum to allow also seeking in NRQ and Automaton (see comments below).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (LUCENE-2110) Refactoring of FilteredTermsEnum and MultiTermQuery

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786717#action_12786717 ]

Uwe Schindler commented on LUCENE-2110:
---------------------------------------

I forgot: Thank you all, Mike, Robert, Mark for all your suggestions in chat!

> Refactoring of FilteredTermsEnum and MultiTermQuery
> ---------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> FilteredTermsEnums cannot implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.
> This issue changes some of the internal behaviour of MTQ and FilteredTermsEnum to allow also seeking in NRQ and Automaton (see comments below).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Updated: (LUCENE-2110) Refactoring of FilteredTermsEnum and MultiTermQuery

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2110:
----------------------------------

    Attachment: LUCENE-2110-nextSeekTermUpd.patch

Here a small update that changed nextSeekTerm to also provide the current TermRef. It will be null on first call. This removes the ugly if (unitialized) code in Automaton.

I will now work on a better approach with only accept using an extended AcceptStatus with next seek term). The current patch will now be committed.

> Refactoring of FilteredTermsEnum and MultiTermQuery
> ---------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110-nextSeekTermUpd.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> FilteredTermsEnums cannot implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.
> This issue changes some of the internal behaviour of MTQ and FilteredTermsEnum to allow also seeking in NRQ and Automaton (see comments below).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

[jira] Issue Comment Edited: (LUCENE-2110) Refactoring of FilteredTermsEnum and MultiTermQuery

ASF GitHub Bot (Jira)
In reply to this post by ASF GitHub Bot (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787401#action_12787401 ]

Uwe Schindler edited comment on LUCENE-2110 at 12/8/09 10:42 AM:
-----------------------------------------------------------------

Here a small update that changed nextSeekTerm to also provide the current TermRef. It will be null on first call. This removes the ugly if (unitialized) code in Automaton.

Also the delegate enum is now private, as its state should be unchangeable by subclasses.

I will now work on a better approach with only accept using an extended AcceptStatus with next seek term). The current patch will now be committed.

      was (Author: thetaphi):
    Here a small update that changed nextSeekTerm to also provide the current TermRef. It will be null on first call. This removes the ugly if (unitialized) code in Automaton.

I will now work on a better approach with only accept using an extended AcceptStatus with next seek term). The current patch will now be committed.
 

> Refactoring of FilteredTermsEnum and MultiTermQuery
> ---------------------------------------------------
>
>                 Key: LUCENE-2110
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2110
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2110-nextSeekTermUpd.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch, LUCENE-2110.patch
>
>
> FilteredTermsEnum is confusing as it is initially positioned to the first term. It should instead work like an uninitialized TermsEnum for a field before the first call to next() or seek().
> FilteredTermsEnums cannot implement seek() as eg. NRQ or Automaton are not able to support this. Seeking is also not needed for MTQ at all, so seek can just throw UOE.
> This issue changes some of the internal behaviour of MTQ and FilteredTermsEnum to allow also seeking in NRQ and Automaton (see comments below).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

12