request dependent analyzer

request dependent analyzer

Hendrik Haddorp
Hi,

Currently we use a lot of small collections that all basically have the
same schema. This does not scale too well, so we are looking into
combining multiple collections into one. We would, however, like some
analyzers to behave slightly differently depending on the logical
collection. For example, we would like to use different synonyms in the
different logical collections. Is there any clean way to do that,
such as somehow accessing request parameters from an analyzer?

regards,
Hendrik

RE: request dependent analyzer

Markus Jelsma-2
Hi - That is impossible. But you can construct many analyzers instead.
 
-----Original message-----

> From:Hendrik Haddorp <[hidden email]>
> Sent: Monday 18th December 2017 8:35
> To: solr-user <[hidden email]>
> Subject: request dependent analyzer

Re: request dependent analyzer

Hendrik Haddorp
Hi, how do multiple analyzers help?

On 18.12.2017 10:25, Markus Jelsma wrote:

> Hi - That is impossible. But you can construct many analyzers instead.


Re: request dependent analyzer

Stefan Matheis-3
In reply to this post by Hendrik Haddorp
Hendrik,

this doesn't exactly answer your question, but I do remember reading a
thread on the lucene-dev list, not that long ago, which eventually became
a Jira ticket.

Doug asked for something that sounds at least a little bit similar to what
you're asking: https://issues.apache.org/jira/browse/SOLR-11698

Hope it's worth reading
- Stefan


RE: request dependent analyzer

Markus Jelsma-2
In reply to this post by Hendrik Haddorp
Hi - for example, in edismax, where the query analyzer is retrieved, you can create your own analyzer with a custom SynonymsFilter that has its own synonyms file. Of course, keep a cache of already constructed analyzers.

We also create custom analyzers based on config but with per-request modifications on the fly, such as different backing config files, or disabling or adding filters on demand. Cache them and reuse them to spare the cost of construction, which can be very high in the case of large dictionaries or FSTs.
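
The caching advice above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from this thread: the class name PerKeyAnalyzerCache, the key "shopA", and the file naming "synonyms_<name>.txt" are hypothetical, and the type parameter V stands in for a real Lucene Analyzer so that the sketch runs with the JDK alone. It shows one instance being built per logical collection and then reused.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/**
 * Caches one expensive-to-build object per key, for example one analyzer
 * per logical collection. V is a stand-in for a real analyzer type.
 */
final class PerKeyAnalyzerCache<V> {
    private final Map<String, V> cache = new ConcurrentHashMap<>();
    private final Function<String, V> builder;

    PerKeyAnalyzerCache(Function<String, V> builder) {
        this.builder = builder;
    }

    /** Returns the cached instance for the key, building it at most once. */
    V get(String logicalCollection) {
        return cache.computeIfAbsent(logicalCollection, builder);
    }

    public static void main(String[] args) {
        AtomicInteger builds = new AtomicInteger();
        // Hypothetical builder: real code would load the synonyms file for
        // the logical collection and construct an analyzer from it.
        PerKeyAnalyzerCache<String> analyzers = new PerKeyAnalyzerCache<>(name -> {
            builds.incrementAndGet();
            return "analyzer using synonyms_" + name + ".txt";
        });
        System.out.println(analyzers.get("shopA"));
        System.out.println(analyzers.get("shopA")); // served from the cache
        System.out.println(analyzers.get("shopB"));
        System.out.println("builds: " + builds.get()); // prints "builds: 2"
    }
}
```

In a query parser, the key would presumably come from a request parameter that names the logical collection. How that parameter is read depends on the parser and is not shown here.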

Regards,
Markus

-----Original message-----

> From:Hendrik Haddorp <[hidden email]>
> Sent: Monday 18th December 2017 10:55
> To: [hidden email]
> Subject: Re: request dependent analyzer
>
> Hi, how do multiple analyzers help?

RE: request dependent analyzer

Markus Jelsma-2
In reply to this post by Hendrik Haddorp
Thanks, interesting ticket (I missed it, following it now). This is similar to what we use: construct an analyzer (or get it from the cache) with a field as argument, which supplies its base config, but also accept Java's variable arguments syntax, which you can use for adding or disabling filters, or for modifying a specific filter's parameters.
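
A rough sketch of that varargs idea, under heavy assumptions: none of the names below (AnalyzerVariants, Mod, disable, add) come from this thread or from Lucene/Solr, and a filter chain is modeled as a plain list of filter names instead of real token filter factories. It only illustrates deriving a variant from a field's base chain and caching it under a key built from the field and the requested modifications.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class AnalyzerVariants {

    /** Hypothetical per-request modification of a filter chain. */
    interface Mod {
        String id();                    // stable text used in the cache key
        void apply(List<String> chain); // edits a private copy of the base chain
    }

    static Mod disable(String filter) {
        return new Mod() {
            @Override public String id() { return "-" + filter; }
            @Override public void apply(List<String> chain) { chain.remove(filter); }
        };
    }

    static Mod add(String filter) {
        return new Mod() {
            @Override public String id() { return "+" + filter; }
            @Override public void apply(List<String> chain) { chain.add(filter); }
        };
    }

    private final Map<String, List<String>> baseChains; // field -> base filters
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

    AnalyzerVariants(Map<String, List<String>> baseChains) {
        this.baseChains = baseChains;
    }

    /** Returns the (cached) chain for a field with the given modifications. */
    List<String> get(String field, Mod... mods) {
        List<String> base = baseChains.get(field);
        if (base == null) throw new IllegalArgumentException("unknown field: " + field);
        StringBuilder key = new StringBuilder(field);
        for (Mod mod : mods) key.append('|').append(mod.id());
        return cache.computeIfAbsent(key.toString(), k -> {
            List<String> chain = new ArrayList<>(base);
            for (Mod mod : mods) mod.apply(chain);
            return Collections.unmodifiableList(chain);
        });
    }

    public static void main(String[] args) {
        Map<String, List<String>> base = new HashMap<>();
        base.put("title", Arrays.asList("lowercase", "synonyms", "stemmer"));
        AnalyzerVariants variants = new AnalyzerVariants(base);
        // prints [lowercase, synonyms, stemmer]
        System.out.println(variants.get("title"));
        // prints [lowercase, stemmer, synonyms_shopA]
        System.out.println(variants.get("title", disable("synonyms"), add("synonyms_shopA")));
    }
}
```

Note that add appends to the end of the chain in this sketch. A real implementation would need to control filter order, since the position of a synonym filter relative to other filters affects the result.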

Regards,
Markus
 
-----Original message-----

> From:Stefan Matheis <[hidden email]>
> Sent: Monday 18th December 2017 23:02
> To: [hidden email]
> Subject: Re: request dependent analyzer

Re: request dependent analyzer

Doug Turnbull
Yes, I would like to get around to implementing that.

You might find our match query parser useful for selecting analyzers at
query time:

https://github.com/o19s/match-query-parser


--
Consultant, OpenSource Connections. Contact info at
http://o19s.com/about-us/doug-turnbull/; Free/Busy (http://bit.ly/dougs_cal)