[jira] [Commented] (SOLR-13699) maxChars no longer working as designed on CopyField

Nick Burch (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909162#comment-16909162 ]

Jan Høydahl commented on SOLR-13699:
------------------------------------

In this particular case I guess we can replace
{code:java}
if( val instanceof String && cf.getMaxChars() > 0 ) {
{code}
with
{code:java}
if( val instanceof CharSequence && cf.getMaxChars() > 0 ) {
{code}
But how do we guard against other code locations expecting {{String}} explicitly? @[~noble.paul] any suggestions?
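
To illustrate the difference between the two checks, here is a minimal standalone sketch. It is not the actual DocumentBuilder code: the class {{MaxCharsSketch}} and the methods {{oldCheck}} and {{newCheck}} are hypothetical names, and a {{StringBuilder}} stands in for any non-String {{CharSequence}} such as ByteArrayUtf8CharSequence.

```java
public class MaxCharsSketch {

    // Mirrors the shape of the original check: only java.lang.String values are truncated.
    // Hypothetical helper, not Solr code.
    static Object oldCheck(Object val, int maxChars) {
        if (val instanceof String && maxChars > 0) {
            String s = (String) val;
            return s.length() > maxChars ? s.substring(0, maxChars) : s;
        }
        return val; // any other type passes through untouched
    }

    // Mirrors the shape of the proposed check: any CharSequence is truncated.
    // Hypothetical helper, not Solr code.
    static Object newCheck(Object val, int maxChars) {
        if (val instanceof CharSequence && maxChars > 0) {
            CharSequence cs = (CharSequence) val;
            if (cs.length() > maxChars) {
                return cs.subSequence(0, maxChars).toString();
            }
        }
        return val; // short values and non-text values pass through untouched
    }

    public static void main(String[] args) {
        // StringBuilder is a CharSequence but not a String (assumed stand-in).
        Object value = new StringBuilder("abcdefgh");
        System.out.println(oldCheck(value, 3)); // prints "abcdefgh": limit silently skipped
        System.out.println(newCheck(value, 3)); // prints "abc": limit applied
    }
}
```

Under these assumptions the old check lets the non-String value through at full length, while the widened check truncates it. The sketch says nothing about other call sites that may still expect {{String}}.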

> maxChars no longer working as designed on CopyField
> ---------------------------------------------------
>
>                 Key: SOLR-13699
>                 URL: https://issues.apache.org/jira/browse/SOLR-13699
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>    Affects Versions: 7.7, 7.7.1, 7.7.2, 8.0, 8.0.1, 8.1, 8.2, 7.7.3, 8.1.1, 8.1.2
>            Reporter: Chris Troullis
>            Assignee: Erick Erickson
>            Priority: Major
>
> We recently upgraded from Solr 7.3 to 8.1 and noticed that the maxChars property on a copy field no longer functions as designed when indexing via SolrJ. The most recent documentation shows no intentional change to this property's behavior, so I assume this is a bug.
>
> Debugging suggests the bug was introduced by SOLR-12992. In DocumentBuilder, where the maxChars limit is applied, the code first checks whether the value is an instanceof String. As of SOLR-12992, string values arrive as ByteArrayUtf8CharSequence (unless they exceed a certain size, defined by JavaBinCodec.MAX_UTF8_SZ), so they fail the instanceof String check and the maxChars truncation is not applied. I am not yet sure whether this issue is limited to indexing via SolrJ or applies to documents indexed by any means.


