[jira] [Commented] (SOLR-1535) Pre-analyzed field type


Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-1535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253325#comment-13253325 ]

Jan Høydahl commented on SOLR-1535:
-----------------------------------

I wish I had time to do the Avro stuff now, but just go ahead with whatever you choose.

Since this format will potentially be adopted by many third-party frameworks, we should take multi-language support and back-compat seriously, so that we do not end up in a situation similar to the one with JavaBin v1/v2. Perhaps a JSON structure with Base64 for binaries and a mandatory version attribute would be a good generic start?
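
To make the JSON idea concrete, here is a rough sketch of what such a structure could look like, written in Python only for brevity. Everything in it is an assumption for illustration: the key names ("version", "stored", "tokens", "term", "start", "end", "posIncr", "type", "payload"), the version number 1, and the helper names serialize/deserialize are hypothetical and are not taken from any attached patch.

```python
import base64
import json

FORMAT_VERSION = 1  # hypothetical version number, not from the patch


def serialize(stored, tokens):
    """Encode a stored value and a list of token dicts as versioned JSON."""
    out = []
    for tok in tokens:
        item = dict(tok)  # copy so the caller's token is not modified
        payload = item.get("payload")
        if payload is not None:
            # Binary payloads are Base64-encoded so they survive as JSON text.
            item["payload"] = base64.b64encode(payload).decode("ascii")
        out.append(item)
    return json.dumps(
        {"version": FORMAT_VERSION, "stored": stored, "tokens": out}
    )


def deserialize(text):
    """Decode the JSON form back into (stored value, list of token dicts)."""
    doc = json.loads(text)
    # The version attribute is mandatory: reject a missing or unknown one.
    if doc.get("version") != FORMAT_VERSION:
        raise ValueError(
            "unsupported or missing version: %r" % (doc.get("version"),)
        )
    tokens = []
    for item in doc.get("tokens", []):
        tok = dict(item)
        if tok.get("payload") is not None:
            tok["payload"] = base64.b64decode(tok["payload"])
        tokens.append(tok)
    return doc.get("stored"), tokens
```

A reader in any other language only needs a JSON parser and a Base64 decoder, and the version check gives a clean failure path if the structure has to change later.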
               

> Pre-analyzed field type
> -----------------------
>
>                 Key: SOLR-1535
>                 URL: https://issues.apache.org/jira/browse/SOLR-1535
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.5
>            Reporter: Andrzej Bialecki
>             Fix For: 4.0
>
>         Attachments: SOLR-1535.patch, SOLR-1535.patch, preanalyzed.patch, preanalyzed.patch
>
>
> PreAnalyzedFieldType provides functionality to index (and optionally store) content that has already been processed and split into tokens by an external processing chain. This implementation defines a serialization format for sending tokens with any currently supported Attributes (e.g. type, posIncr, payload, ...). This data is deserialized into a regular TokenStream that is returned by Field.tokenStreamValue() and thus added to the index as index terms; an optional stored part is returned by Field.stringValue() and added as the stored value of the field.
> This field type is useful for integrating Solr with existing text-processing pipelines, such as third-party NLP systems.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira


