on getting back LUCENE-6212 behaviour ;) , Re: On LUCENE-5611 and 6.4.1


Ľuboš Koščo
D'oh !!!

OK, I see where this happened, we were leveraging this:

So, changing the subject: can I get the LUCENE-6212 behaviour back in the latest Lucene somehow?

Or rather, what I am doing now is using the field's setTokenStream
so that each doc gets its own TokenStream ...
Does the TokenStream need to be private to each doc, or can one instance be reused?
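
To make the workaround above concrete, here is a toy sketch in plain Java. It is not Lucene code: ToyFileAnalyzer, ToyJavaAnalyzer, ToyField and ToyWriter are hypothetical stand-ins invented for illustration. It only models the idea that a writer configured with one analyzer uses that analyzer for every doc, unless a field arrives with tokens already produced by a more specific analyzer. It says nothing about whether a real TokenStream instance may be shared between docs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these classes are Lucene classes.
class PerDocAnalyzerSketch {

    // Stand-in for the base analyzer: splits on whitespace.
    static class ToyFileAnalyzer {
        List<String> tokenize(String text) {
            return Arrays.asList(text.trim().split("\\s+"));
        }
    }

    // Deeper subclass that overrides tokenization: splits on non-identifier characters.
    static class ToyJavaAnalyzer extends ToyFileAnalyzer {
        @Override
        List<String> tokenize(String text) {
            List<String> out = new ArrayList<>();
            for (String t : text.split("[^A-Za-z0-9_]+")) {
                if (!t.isEmpty()) {
                    out.add(t);
                }
            }
            return out;
        }
    }

    // Field that may carry tokens produced ahead of time (hypothetical analogue
    // of attaching a pre-built token stream to a field).
    static class ToyField {
        final String text;
        List<String> presetTokens; // null means "let the writer analyze the text"

        ToyField(String text) {
            this.text = text;
        }

        void setTokens(List<String> tokens) {
            this.presetTokens = tokens;
        }
    }

    // Writer with a single analyzer fixed at construction time.
    static class ToyWriter {
        final ToyFileAnalyzer configured;

        ToyWriter(ToyFileAnalyzer configured) {
            this.configured = configured;
        }

        List<String> index(ToyField f) {
            // Pre-built tokens win; otherwise the writer's own analyzer is used.
            return f.presetTokens != null ? f.presetTokens : configured.tokenize(f.text);
        }
    }

    public static void main(String[] args) {
        ToyWriter writer = new ToyWriter(new ToyFileAnalyzer());

        // No per-doc tokens: the writer-level analyzer decides.
        ToyField plain = new ToyField("foo.bar(baz)");
        System.out.println(writer.index(plain)); // prints [foo.bar(baz)]

        // Per-doc tokens built by the more specific analyzer before indexing.
        ToyField perDoc = new ToyField("foo.bar(baz)");
        perDoc.setTokens(new ToyJavaAnalyzer().tokenize(perDoc.text));
        System.out.println(writer.index(perDoc)); // prints [foo, bar, baz]
    }
}
```

In this toy the writer never learns which analyzer produced the tokens; the per-doc choice happens entirely on the caller's side, before the doc is handed over.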


On 17 February 2017 at 09:27, Ľuboš Koščo <[hidden email]> wrote:
One more question before I can work on the tests:

How does recent Lucene pick the appropriate analyzer for a doc?
Have there been any changes in that area since 4.7.1?
(assuming we decide the indexing chain didn't influence this and still uses the properly picked analyzer)
(I checked the changelogs and didn't find any suspicious change in that area ...)


On 11 February 2017 at 00:47, Michael McCandless <[hidden email]> wrote:
Could you make a small standalone test case showing what used to work
and what no longer works?

I don't think that issue was supposed to alter how IndexWriter
interacts with the analysis chain.

Mike McCandless


On Fri, Feb 10, 2017 at 9:48 AM, Ľuboš Koščo <[hidden email]> wrote:
> Or rather, how do I make the doubly inherited analyzer (at the bottom of the
> inheritance chain) be used again, instead of being hidden by its parent, the
> direct descendant of Analyzer?
> (parent:
> https://github.com/OpenGrok/OpenGrok/blob/master/src/org/opensolaris/opengrok/analysis/FileAnalyzer.java
> child:
> https://github.com/OpenGrok/OpenGrok/blob/master/src/org/opensolaris/opengrok/analysis/java/JavaAnalyzer.java
> - looking at the above, the inheritance is even deeper: Analyzer ->
> FileAnalyzer -> ... -> JavaAnalyzer as the last child)
> (funny enough, the code on our side that creates docs didn't really change
> since 4.7.1, but the new Lucene now picks FileAnalyzer over any other
> analyzer for createComponents anyway)
> tia
> L
> On 10 February 2017 at 13:41, Ľuboš Koščo <[hidden email]> wrote:
>> Hi guys, Mike
>> is there any chance I can somehow get the indexing chain in 6.4.1 to behave
>> the way it did before LUCENE-5611?
>> We used to have analyzers that inherited from Analyzer through several
>> levels (e.g. a second child with a relaxed and overridden createComponents),
>> and Lucene used to run them properly for the appropriate docs,
>> but after LUCENE-5611 I can see the chain changed and only the first child
>> is ever taken into account, even though the document is handled by the
>> proper analyzer ...
>> (basically, between 4.7.1 and 6.4.1 something changed that makes Lucene
>> ignore the second child of Analyzer and always use the first one
>> (and its parent, the direct override of createComponents))
>> Some code pointers on what used to work and no longer does:
>> https://github.com/OpenGrok/OpenGrok/issues/1376
>> (I tried to dig through the changelogs, and the only thing I found is
>> really around 5611, hence this silly question)
>> any clues on how to get the old behaviour back?
>> thnx
>> L