[jira] [Commented] (LUCENE-5611) Simplify the default indexing chain

Hudson (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984299#comment-13984299 ]

Robert Muir commented on LUCENE-5611:

I think another follow-up for this issue should be to do something about all the conflicting term vector option possibilities. Maybe term vectors should have a single setting, more like IndexOptions. Just something to think about.
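
To make the idea concrete, here is a rough sketch of what an IndexOptions-style setting for term vectors might look like. Everything in it is hypothetical: the enum `TermVectorOptions`, its constants, and the `of(...)` helper are invented for illustration and are not part of Lucene or of the attached patch. It only shows how a single enum can rule out conflicting combinations (for example, positions requested while term vectors themselves are disabled) that independent boolean flags allow. Payloads are left out to keep the sketch short.

```java
// Hypothetical sketch: this enum does not exist in Lucene or in the patch.
// One value replaces several independent boolean term vector flags.
enum TermVectorOptions {
    NONE(false, false, false),
    TERMS(true, false, false),
    TERMS_AND_POSITIONS(true, true, false),
    TERMS_AND_OFFSETS(true, false, true),
    TERMS_AND_POSITIONS_AND_OFFSETS(true, true, true);

    private final boolean stored;
    private final boolean positions;
    private final boolean offsets;

    TermVectorOptions(boolean stored, boolean positions, boolean offsets) {
        this.stored = stored;
        this.positions = positions;
        this.offsets = offsets;
    }

    boolean stored() { return stored; }
    boolean positions() { return positions; }
    boolean offsets() { return offsets; }

    // Maps three boolean flags onto one enum value, rejecting the
    // conflicting case where positions or offsets are requested
    // without term vectors being stored at all.
    static TermVectorOptions of(boolean stored, boolean positions, boolean offsets) {
        if (!stored) {
            if (positions || offsets) {
                throw new IllegalArgumentException(
                    "positions/offsets requested but term vectors are not stored");
            }
            return NONE;
        }
        if (positions && offsets) return TERMS_AND_POSITIONS_AND_OFFSETS;
        if (positions) return TERMS_AND_POSITIONS;
        if (offsets) return TERMS_AND_OFFSETS;
        return TERMS;
    }
}
```

With something along these lines, a caller picks exactly one value, so an invalid flag combination cannot be expressed once the enum value exists.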

Anyway, I did benchmarking and reviewing; +1 to commit the change. It's way simpler and easier to work with.

> Simplify the default indexing chain
> -----------------------------------
>                 Key: LUCENE-5611
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5611
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.9, 5.0
>         Attachments: LUCENE-5611.patch, LUCENE-5611.patch
> I think Lucene's current indexing chain has too many classes /
> hierarchy / abstractions, making it look much more complex than it
> really should be, and discouraging users from experimenting/innovating
> with their own indexing chains.
> Also, if it were easier to understand/approach, then new developers
> would more likely try to improve it ... it really should be simpler.
> So I'm exploring a pared back indexing chain, and have a starting patch
> that I think is looking ok: it seems more approachable than the
> current indexing chain, or at least has fewer strange classes.
> I also thought this could give some speedup for tiny documents (a more
> common use of Lucene lately), and it looks like, with the evil
> optimizations, this is a ~25% speedup for Geonames docs.  Even without
> those evil optos it's a bit faster.
> This is very much a work in progress / nocommits, and there are some
> behavior changes e.g. the new chain requires all fields to have the
> same TV options (rather than auto-upgrading all fields by the same
> name that the current chain does)...
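
The behavior change described at the end of the quoted description can be illustrated with a small sketch. It assumes the requirement means that every instance of a field with a given name must carry identical term vector settings, and that a mismatch is rejected instead of being silently upgraded. The class `TermVectorConsistencyCheck` and its `check` method are hypothetical names made up for this example; they are not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not code from the patch: remembers the term vector
// settings first seen for each field name and rejects later instances of
// the same field name whose settings differ.
final class TermVectorConsistencyCheck {
    private final Map<String, String> seenByFieldName = new HashMap<String, String>();

    void check(String fieldName, boolean storeTermVectors,
               boolean positions, boolean offsets) {
        // Encode the three flags as a short string so they can be compared.
        String options = "tv=" + storeTermVectors
            + ",positions=" + positions
            + ",offsets=" + offsets;
        String previous = seenByFieldName.get(fieldName);
        if (previous == null) {
            // First time this field name is seen: record its settings.
            seenByFieldName.put(fieldName, options);
        } else if (!previous.equals(options)) {
            // Strict behavior: fail instead of auto-upgrading.
            throw new IllegalArgumentException(
                "field \"" + fieldName + "\" changed term vector options from ["
                + previous + "] to [" + options + "]");
        }
    }
}
```

Calling `check("body", true, true, false)` twice passes, while a later `check("body", true, false, false)` throws, which is the strict alternative to auto-upgrading.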

This message was sent by Atlassian JIRA
