[jira] [Commented] (LUCENE-8496) Explore selective dimension indexing in BKDReader/Writer


JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645761#comment-16645761 ]

Steve Rowe commented on LUCENE-8496:
------------------------------------

FYI two other failing tests on branch_7x from [https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/2891/] (before the commit was reverted):

{noformat}
ant test -Dtestcase=TestLucene60PointsFormat -Dtests.seed=B5A28E6677965A99 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=fr-CA -Dtests.timezone=Asia/Irkutsk -Dtests.asserts=true -Dtests.file.encoding=UTF-8
{noformat}

{noformat}
ant test -Dtestcase=TestAssertingPointsFormat -Dtests.seed=F280908F18AE1657 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=dz -Dtests.timezone=Etc/GMT-10 -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
{noformat}

> Explore selective dimension indexing in BKDReader/Writer
> --------------------------------------------------------
>
>                 Key: LUCENE-8496
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8496
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Nicholas Knize
>            Priority: Major
>         Attachments: LUCENE-8496.patch, LUCENE-8496.patch, LUCENE-8496.patch, LUCENE-8496.patch, LUCENE-8496.patch, LatLonShape_SelectiveEncoding.patch
>
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> This issue explores adding a new feature to BKDReader/Writer that lets users select fewer dimensions for building the BKD index than the total number of dimensions specified for field encoding. This is useful for dimensional data that is needed to interpret the encoded field data but is unnecessary (or inefficient) for building the index structure. One such example is {{LatLonShape}} encoding. The first 4 dimensions may be used to efficiently search/index the triangle using its precomputed bounding box as a 4D point, and the remaining dimensions can be used to encode the vertices of the tessellated triangle. This causes BKD to act much like an R-Tree for shape data, where search is distilled into a 4D point (instead of a more expensive 6D point) and the triangle is encoded using a portion of the remaining (non-indexed) dimensions. Fields that use the full data range for indexing are not impacted and behave as they normally would.
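
To make the idea in the description concrete, here is a minimal, self-contained sketch of the "indexed dimensions first, data-only dimensions after" layout. It is not the patch's code and does not use Lucene's BKDWriter/BKDReader API: the class name, method names, the dimension order, and the choice of 6 trailing data dimensions are all assumptions made for illustration only.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch of selective dimension indexing.
 * Not Lucene's API and not the layout used by the LUCENE-8496 patch.
 */
final class SelectiveDimsSketch {
    // Assumption: the first 4 values are the only ones an index would be built on.
    static final int NUM_INDEX_DIMS = 4; // minY, minX, maxY, maxX

    /** Packs a triangle as 4 bounding-box dimensions followed by its 6 vertex coordinates. */
    static int[] encode(int ax, int ay, int bx, int by, int cx, int cy) {
        int minX = Math.min(ax, Math.min(bx, cx));
        int maxX = Math.max(ax, Math.max(bx, cx));
        int minY = Math.min(ay, Math.min(by, cy));
        int maxY = Math.max(ay, Math.max(by, cy));
        return new int[] {minY, minX, maxY, maxX, ax, ay, bx, by, cx, cy};
    }

    /** Coarse filter that reads only the indexed (bounding-box) dimensions. */
    static boolean bboxIntersects(int[] packed, int qMinX, int qMinY, int qMaxX, int qMaxY) {
        return packed[0] <= qMaxY && packed[2] >= qMinY
            && packed[1] <= qMaxX && packed[3] >= qMinX;
    }

    /** Returns the non-indexed dimensions, i.e. the triangle's vertex coordinates. */
    static int[] vertices(int[] packed) {
        return Arrays.copyOfRange(packed, NUM_INDEX_DIMS, packed.length);
    }
}
```

In this sketch a query would first call `bboxIntersects` on the 4 indexed dimensions and only decode `vertices` for an exact triangle test on the candidates that pass, which mirrors the R-Tree-like behavior the description mentions.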



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
