[jira] [Created] (LUCENE-3893) TermsFilter should use AutomatonQuery

[jira] [Created] (LUCENE-3893) TermsFilter should use AutomatonQuery

JIRA jira@apache.org
TermsFilter should use AutomatonQuery
-------------------------------------

                 Key: LUCENE-3893
                 URL: https://issues.apache.org/jira/browse/LUCENE-3893
             Project: Lucene - Java
          Issue Type: Improvement
            Reporter: Michael McCandless


I think we could see perf gains if TermsFilter sorted the terms, built a minimal automaton, and used TermsEnum.intersect to visit the terms...

This idea came up on the dev list recently.
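
A minimal, library-free sketch of the three steps named above (sort the terms, build a minimal automaton, intersect it with a sorted term dictionary) might look like the following. It is an illustration under stated assumptions, not Lucene code: the class `SortedTermsAutomaton` and all of its methods are hypothetical, a `TreeSet<String>` stands in for the index's term dictionary, and the construction is the incremental one for sorted input that is discussed later in this thread.

```java
import java.util.*;

// Hypothetical, library-free sketch; none of these names are Lucene API.
final class SortedTermsAutomaton {

    static final class State {
        final TreeMap<Character, State> next = new TreeMap<>();
        boolean accept;

        // Two states are equivalent when they agree on acceptance and point
        // at the very same (already canonical) target states.
        @Override public boolean equals(Object o) {
            if (!(o instanceof State)) return false;
            State s = (State) o;
            if (accept != s.accept || next.size() != s.next.size()) return false;
            for (Map.Entry<Character, State> e : next.entrySet()) {
                if (s.next.get(e.getKey()) != e.getValue()) return false;
            }
            return true;
        }

        @Override public int hashCode() {
            int h = accept ? 1 : 0;
            for (Map.Entry<Character, State> e : next.entrySet()) {
                h = 31 * h + 17 * e.getKey() + System.identityHashCode(e.getValue());
            }
            return h;
        }
    }

    // Steps 1 and 2: sort (and dedupe) the terms, then add them one by one,
    // merging equivalent suffix states as soon as they can no longer change.
    static State build(Collection<String> terms) {
        State root = new State();
        Map<State, State> registry = new HashMap<>();
        for (String term : new TreeSet<>(terms)) {
            State s = root;
            int i = 0;
            while (i < term.length() && s.next.containsKey(term.charAt(i))) {
                s = s.next.get(term.charAt(i++)); // follow the shared prefix
            }
            if (!s.next.isEmpty()) replaceOrRegister(s, registry);
            for (; i < term.length(); i++) {      // append the new suffix
                State n = new State();
                s.next.put(term.charAt(i), n);
                s = n;
            }
            s.accept = true;
        }
        if (!root.next.isEmpty()) replaceOrRegister(root, registry);
        return root;
    }

    private static void replaceOrRegister(State s, Map<State, State> registry) {
        Map.Entry<Character, State> last = s.next.lastEntry();
        State child = last.getValue();
        if (!child.next.isEmpty()) replaceOrRegister(child, registry);
        State canonical = registry.putIfAbsent(child, child);
        if (canonical != null) s.next.put(last.getKey(), canonical);
    }

    // Step 3: walk the automaton and the sorted dictionary together,
    // pruning any branch whose prefix no dictionary term starts with.
    static List<String> intersect(State root, NavigableSet<String> dict) {
        List<String> out = new ArrayList<>();
        visit(root, new StringBuilder(), dict, out);
        return out;
    }

    private static void visit(State s, StringBuilder prefix,
                              NavigableSet<String> dict, List<String> out) {
        String p = prefix.toString();
        String c = dict.ceiling(p);
        if (c == null || !c.startsWith(p)) return;
        if (s.accept && c.equals(p)) out.add(p);
        for (Map.Entry<Character, State> e : s.next.entrySet()) {
            prefix.append(e.getKey());
            visit(e.getValue(), prefix, dict, out);
            prefix.setLength(prefix.length() - 1);
        }
    }

    static int countStates(State root) {
        Set<State> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<State> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            State s = todo.pop();
            if (seen.add(s)) todo.addAll(s.next.values());
        }
        return seen.size();
    }

    public static void main(String[] args) {
        State root = build(Arrays.asList("cars", "bar", "car", "bars"));
        NavigableSet<String> dict =
            new TreeSet<>(Arrays.asList("bar", "bat", "car", "cars", "cart", "dog"));
        System.out.println(countStates(root));     // 5 states; a plain trie would need 9
        System.out.println(intersect(root, dict)); // [bar, car, cars]
    }
}
```

The point of the sketch is the pruning in `visit`: a whole branch of filter terms is skipped with one dictionary lookup when no indexed term shares its prefix, which is roughly where the hoped-for gain over probing each term separately would come from. Whether that shows up in practice would still need benchmarking.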

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

[jira] [Commented] (LUCENE-3893) TermsFilter should use AutomatonQuery

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233522#comment-13233522 ]

Michael McCandless commented on LUCENE-3893:
--------------------------------------------

LUCENE-3832 should also be done for this...
               

> TermsFilter should use AutomatonQuery
> -------------------------------------
>
>                 Key: LUCENE-3893
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3893
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>              Labels: gsoc2012, lucene-gsoc-12
>
> I think we could see perf gains if TermsFilter sorted the terms, built a minimal automaton, and used TermsEnum.intersect to visit the terms...
> This idea came up on the dev list recently.

[jira] [Commented] (LUCENE-3893) TermsFilter should use AutomatonQuery

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233575#comment-13233575 ]

Uwe Schindler commented on LUCENE-3893:
---------------------------------------

I already implemented something like this when Dawid ported the Dahiwikwukblabla automaton builder to Lucene core. We don't need a separate Filter or Query for that; just use (for the Filter):

new MultiTermQueryWrapperFilter(new AutomatonQuery(unionAutomaton))

To keep TermsFilter backwards-compatible, we can simply do it like PrefixFilter (subclass MultiTermQueryWrapperFilter) and wrap the above automaton.
               

[jira] [Commented] (LUCENE-3893) TermsFilter should use AutomatonQuery

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233711#comment-13233711 ]

Dawid Weiss commented on LUCENE-3893:
-------------------------------------

> Dahiwikwukblabla

Daciuk, the name is Jan Daciuk :) Although the same algorithm has been discovered independently by Stoyan Mihov and (I think) Bruce W. Watson.
               

[jira] [Commented] (LUCENE-3893) TermsFilter should use AutomatonQuery

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233722#comment-13233722 ]

Uwe Schindler commented on LUCENE-3893:
---------------------------------------

Thanks! I was on my mobile phone when I wrote that comment and did not have the names in mind :-)
               

[jira] [Commented] (LUCENE-3893) TermsFilter should use AutomatonQuery

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233727#comment-13233727 ]

Dawid Weiss commented on LUCENE-3893:
-------------------------------------

np. I see that Janek even has an audio file with the pronunciation, LOL ;)
http://www.eti.pg.gda.pl/katedry/kiw/pracownicy/Jan.Daciuk/personal/j_daciuk.au
               
