Impossible to pass fieldName to analysers through TokenizerChain::getStream()


Egor Pahomov
    I have a different stop-word dictionary for each field, but all of these fields are matched by a single dynamic field, i.e. they share a single field type and therefore a single analyser.

    It seems I need an improved TokenFilter that is aware of the name of the field it is analysing. Currently fieldName is passed into TokenizerChain.getStream(), but it is not used there. How can I pass fieldName to the token filters?

    I'm thinking of adding a new method, TokenStream create(String field, TokenStream input), to the TokenFilterFactory interface and implementing it in BaseTokenFilterFactory by delegating to the single-argument create(TokenStream input). I would then be able to pass fieldName to each TokenFilterFactory from TokenizerChain.getStream(String fieldName, Reader reader). Alternatively, I could introduce a FieldAwareTokenFilterFactory interface with a two-argument create() and use "instanceof" in TokenizerChain.getStream().
    Is either of these a good solution for my problem?
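
    To make the second alternative concrete, here is a rough sketch. It uses simplified stand-ins rather than the real Lucene/Solr classes: a "token stream" is just a List<String>, getStream() takes a String instead of a Reader, and PerFieldStopFilterFactory and the field names title_txt and body_txt are hypothetical names invented for illustration. Only the shape of the FieldAwareTokenFilterFactory interface and the "instanceof" dispatch is meant to carry over.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified stand-ins, NOT the real Lucene/Solr API.
class FieldAwareChainSketch {

    /** Stand-in for the existing single-argument factory contract. */
    interface TokenFilterFactory {
        List<String> create(List<String> input);
    }

    /** Opt-in interface with a two-argument create(), as proposed in the post. */
    interface FieldAwareTokenFilterFactory extends TokenFilterFactory {
        List<String> create(String fieldName, List<String> input);
    }

    /** Hypothetical factory that selects a stop-word set by field name. */
    static class PerFieldStopFilterFactory implements FieldAwareTokenFilterFactory {
        private final Map<String, Set<String>> stopWordsByField;

        PerFieldStopFilterFactory(Map<String, Set<String>> stopWordsByField) {
            this.stopWordsByField = stopWordsByField;
        }

        @Override
        public List<String> create(List<String> input) {
            // No field name available: pass tokens through unchanged.
            return input;
        }

        @Override
        public List<String> create(String fieldName, List<String> input) {
            Set<String> stopWords = stopWordsByField.get(fieldName);
            if (stopWords == null) {
                stopWords = Collections.emptySet();
            }
            List<String> kept = new ArrayList<>();
            for (String token : input) {
                if (!stopWords.contains(token)) {
                    kept.add(token);
                }
            }
            return kept;
        }
    }

    /** Stand-in for TokenizerChain.getStream(): forwards fieldName to factories that accept it. */
    static List<String> getStream(String fieldName, String text, List<TokenFilterFactory> filters) {
        // Toy tokenizer: lower-case and split on whitespace.
        List<String> stream = new ArrayList<>(Arrays.asList(text.trim().toLowerCase().split("\\s+")));
        for (TokenFilterFactory factory : filters) {
            if (factory instanceof FieldAwareTokenFilterFactory) {
                stream = ((FieldAwareTokenFilterFactory) factory).create(fieldName, stream);
            } else {
                stream = factory.create(stream);
            }
        }
        return stream;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> stops = new HashMap<>();
        stops.put("title_txt", new HashSet<>(Arrays.asList("the")));
        List<TokenFilterFactory> chain =
                Collections.singletonList(new PerFieldStopFilterFactory(stops));

        System.out.println(getStream("title_txt", "The quick fox", chain)); // [quick, fox]
        System.out.println(getStream("body_txt", "The quick fox", chain));  // [the, quick, fox]
    }
}
```

    The first alternative would instead put the two-argument create() on TokenFilterFactory itself, with the base class delegating to create(input); that removes the "instanceof" check at the cost of changing an interface every existing factory implements.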

    Egor