[GitHub] [lucene-solr] msokolov commented on issue #862: LUCENE-8971: Enable constructing JapaneseTokenizer with custom dictio…


GitBox
msokolov commented on issue #862: LUCENE-8971: Enable constructing JapaneseTokenizer with custom dictio…
URL: https://github.com/apache/lucene-solr/pull/862#issuecomment-530494478
 
 
   > Should it be marked experimental, then? Shipping a single dictionary within the jar also ensures that it is built from the same version, but this change breaks that assumption. What kind of compatibility are we expecting here? Should we require users to rebuild the binary dictionary on each minor version?
   
   Yes, these are good questions. I think marking this experimental makes sense, given that we are not providing detailed documentation and really only experts with knowledge of NLP will ever use it. Expert features carry no compatibility guarantee, so I think rebuilding with each version would be the recommended policy. Users would be well advised to rebuild the dictionary whenever they build their software, treating the Kuromoji dictionary as a binary artifact produced from source code (the textual dictionary). Does that make sense?

