[jira] [Commented] (SOLR-12768) Determine how _nest_path_ should be analyzed to support various use-cases


JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-12768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16739026#comment-16739026 ]

David Smiley commented on SOLR-12768:

This patch has the PR changes that mosh and I worked out.  I fixed one bug in creating the implicit field type: I had simply forgotten to set the type's name, which must be done with an additional setter.  The main change mosh made was to have the path always start with a '/', which is both more consistent with paths in general and allows a straightforward manipulation of a relative path to find it at any ancestor: add an asterisk and then a slash at the front of the query.
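To illustrate the leading-slash convention, here is a minimal sketch of how a query string might be built from a path. It is not code from the patch: the class name, the method name, and the manual escaping of '/' (which the classic Lucene query syntax otherwise treats as a regex delimiter) are assumptions made for the example.

```java
// Hypothetical helper, not part of the patch: builds a query on _nest_path_
// from either an absolute path ("/a/b") or a relative one ("a/b").
final class NestPathQuerySketch {
    private static final String NEST_PATH_FIELD = "_nest_path_";

    static String toNestPathQuery(String path) {
        // An absolute path is matched as-is; a relative path gets "*/" in front
        // so that it can match under any ancestor.
        String pattern = path.startsWith("/") ? path : "*/" + path;
        // Assumption: '/' is escaped so the query parser does not read it as
        // the start of a regular expression.
        return NEST_PATH_FIELD + ":" + pattern.replace("/", "\\/");
    }

    public static void main(String[] args) {
        System.out.println(toNestPathQuery("/comments/author"));
        System.out.println(toNestPathQuery("comments/author"));
    }
}
```

Because every indexed path starts with '/', the relative form needs only the "*/" prefix and no special case for a top-level child.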

There is more work to do, though.  This implicit field type can get persisted out, but for this particular type it's not working correctly because the code that does that (FieldType.getAnalyzerProperties) only works with a TokenizerChain subclass of Analyzer.  The result was that no analyzers were printed:
<fieldType name="_nest_path_" class="org.apache.solr.schema.SortableTextField" omitTermFreqAndPositions="true" omitNorms="true" maxCharsForDocValues="-1" stored="false"/>
I should both fix this and ensure we have a test that'd break if I didn't catch that.
Or alternatively, maybe such implicit types shouldn't be serialized at all?
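
As a rough idea of the kind of check such a test could make, the sketch below parses a persisted fieldType element and reports whether it contains any analyzer element. It is only an illustration using the JDK's XML parser; the class and method names are hypothetical, and a real test would go through Solr's own schema persistence code rather than a hard-coded string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical check, not from the patch: does a serialized <fieldType>
// carry at least one <analyzer> element?
final class SerializedFieldTypeCheck {
    static boolean hasAnalyzer(String fieldTypeXml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(fieldTypeXml)))
                .getDocumentElement();
            // getElementsByTagName searches all descendants of the root element.
            return root.getElementsByTagName("analyzer").getLength() > 0;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse fieldType XML", e);
        }
    }

    public static void main(String[] args) {
        // The output quoted above, which has no analyzer children.
        String persisted = "<fieldType name=\"_nest_path_\""
            + " class=\"org.apache.solr.schema.SortableTextField\""
            + " omitTermFreqAndPositions=\"true\" omitNorms=\"true\""
            + " maxCharsForDocValues=\"-1\" stored=\"false\"/>";
        System.out.println(hasAnalyzer(persisted));
    }
}
```

A test built around a check like this would fail on the output shown above and pass once the analyzers are written out.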

> Determine how _nest_path_ should be analyzed to support various use-cases
> -------------------------------------------------------------------------
>                 Key: SOLR-12768
>                 URL: https://issues.apache.org/jira/browse/SOLR-12768
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Blocker
>             Fix For: 8.0
>         Attachments: SOLR-12768.patch, SOLR-12768.patch
>          Time Spent: 2h
>  Remaining Estimate: 0h
> We know we need {{\_nest\_path\_}} in the schema for the new nested documents support, and we loosely know what goes in it.  From a DocValues perspective, we've got it down; though we might tweak it.  From an indexing (text analysis) perspective, we're not quite sure yet, though we've got a test schema, {{schema-nest.xml}} with a decent shot at it.  Ultimately, how we index it will depend on the query/filter use-cases we need to support.  So we'll review some of them here.
> TBD: Not sure if the outcome of this task is just a "decide" or whether we also potentially add a few tests for some of these cases, and/or if we also add a FieldType to make declaring it as easy as a one-liner.  A FieldType would have other benefits too once we're ready to make querying on the path easier.

This message was sent by Atlassian JIRA
