analyzer, indexAnalyzer and queryAnalyzer

Steven White
Hi Everyone,

Looking at Solr's schema.xml, there are three kinds of analyzers: analyzer,
indexAnalyzer and queryAnalyzer.  I have two questions about them:

1) If the contents of indexAnalyzer and queryAnalyzer are exactly the same,
that's the same as if I have an analyzer only, right?

2) Under the hood, all three are the same thing when it comes to what kind
of data and configuration attributes they can take, right?

What I'm trying to figure out is this: besides being able to configure a
fieldType to have different analyzer settings at index and query time, there
is nothing else that's unique about each.

Thanks

Steve

Re: analyzer, indexAnalyzer and queryAnalyzer

Doug Turnbull
> 1) If the content of indexAnalyzer and queryAnalyzer are exactly the same,
> that's the same as if I have an analyzer only, right?

1) Yes

> 2) Under the hood, all three are the same thing when it comes to what kind
> of data and configuration attributes can take, right?

2) Yes. Both take in text and output a token stream.

> What I'm trying to figure out is this: beside being able to configure a
> fieldType to have different analyzer setting at index and query time, there
> is nothing else that's unique about each.

The only thing to look out for in Solr land is the query parser. Most Solr
query parsers treat whitespace as meaningful.

For example, if a user searches for q=hot dogs&defType=edismax&qf=title
body, the *query parser*, *not* the *analyzer*, first turns the query into:

(title:hot title:dog) | (body:hot body:dog)

Each word is *then* analyzed on its own. This is because the query parser
tries to be smart and turns "hot dog" into hot OR dog, or more specifically
makes them two must clauses.

This trips quite a few folks up. One way around it is the field query
parser, which hands the whole string to the field's analyzer and builds a
phrase query from the result. Hope that helps
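
To make the two request styles concrete, here is a minimal Python sketch that
builds both sets of request parameters. The host, the core name (mycore) and
the helper build_url are assumptions made up for illustration;
{!field f=title} is Solr's local-params syntax for selecting the field query
parser.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host and core name for your setup.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"


def build_url(params):
    """Return a select URL with the given parameters URL-encoded."""
    return SOLR_SELECT + "?" + urlencode(params)


# edismax: the parser splits "hot dogs" on whitespace before analysis runs.
edismax_url = build_url(
    {"q": "hot dogs", "defType": "edismax", "qf": "title body"}
)

# field query parser: the whole string is handed to the analyzer of "title".
field_url = build_url({"q": "{!field f=title}hot dogs"})

if __name__ == "__main__":
    print(edismax_url)
    print(field_url)
```

Adding debugQuery=true to either request lets you compare the parsed queries
Solr actually produces.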


--
*Doug Turnbull **| *Search Relevance Consultant | OpenSource Connections,
LLC | 240.476.9983 | http://www.opensourceconnections.com
Author: Taming Search <http://manning.com/turnbull> from Manning
Publications
This e-mail and all contents, including attachments, is considered to be
Company Confidential unless explicitly stated otherwise, regardless
of whether attachments are marked as such.

Re: analyzer, indexAnalyzer and queryAnalyzer

Chris Hostetter-3
In reply to this post by Steven White

: 1) If the content of indexAnalyzer and queryAnalyzer are exactly the same,
: that's the same as if I have an analyzer only, right?

Effectively yes.  

Subtle nuance: if you declare one analyzer, there is one Analyzer object in
RAM.  If you declare both, then there are two Analyzer objects in RAM,
even if they are identical.  For some theoretical Analyzer, this might
cause slightly different behavior (e.g. an analyzer that maintains
long-term state).
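
A toy Python illustration of that nuance follows. It is not Solr code: the
CountingAnalyzer class and the make_field_type helper are invented here purely
to show how one shared object and two identical objects can diverge once an
analyzer keeps state.

```python
class CountingAnalyzer:
    """Toy analyzer that keeps long-term state: a count of analyze() calls."""

    def __init__(self):
        self.calls = 0

    def analyze(self, text):
        self.calls += 1
        return text.lower().split()


def make_field_type(separate):
    """Return (index_analyzer, query_analyzer) for a hypothetical field type.

    separate=False models declaring a single analyzer (one shared object);
    separate=True models declaring identical index and query analyzers.
    """
    if separate:
        return CountingAnalyzer(), CountingAnalyzer()
    shared = CountingAnalyzer()
    return shared, shared


if __name__ == "__main__":
    for separate in (False, True):
        index_a, query_a = make_field_type(separate)
        index_a.analyze("Hot Dogs")
        query_a.analyze("hot dogs")
        print(separate, index_a is query_a, index_a.calls, query_a.calls)
```

The token output is identical in both cases; only the internal state differs,
which is why this matters only for unusual analyzers.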

: 2) Under the hood, all three are the same thing when it comes to what kind
: of data and configuration attributes can take, right?

correct.

: What I'm trying to figure out is this: beside being able to configure a
: fieldType to have different analyzer setting at index and query time, there
: is nothing else that's unique about each.

nope.

-Hoss
http://www.lucidworks.com/

Re: analyzer, indexAnalyzer and queryAnalyzer

Steven White
In reply to this post by Doug Turnbull
Hi Doug,

I don't understand what you mean by the following:

> For example, if a user searches for q=hot dogs&defType=edismax&qf=title
> body the *query parser* *not* the *analyzer* first turns the query into:

If I have indexAnalyzer and queryAnalyzer in a fieldType that are 100%
identical, the example you provided, does it stand?  If so, why?  Or do you
mean something totally different by "query parser"?

Thanks

Steve



Re: analyzer, indexAnalyzer and queryAnalyzer

Doug Turnbull
Solr has the idea of a query parser. The query parser is a convenient
way of passing a search string to Solr and having Solr parse it into
underlying Lucene queries. You can see a list of query parsers here:
http://wiki.apache.org/solr/QueryParser

What this means is that the query parser does work to pull terms into
individual clauses *before* analysis is run. It's a parsing layer that sits
outside the analysis chain. This creates problems like the "sea biscuit"
problem, whereby we declare "sea biscuit" as a query-time synonym of
"seabiscuit". As you may know, synonyms are applied during analysis.
However, if the query parser splits "sea" from "biscuit" before running
analysis, the query-time synonym never fires. The string "sea" is brought by
itself to the query-time analyzer and of course won't match "sea biscuit".
The same goes for the string "biscuit" in isolation. If the full string "sea
biscuit" were brought to the analyzer, it would see [sea] next to [biscuit]
and declare it a synonym of seabiscuit. Because of the query parser, the
analyzer has lost the association between the terms: they are never
brought to the analyzer together.
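
The effect can be reproduced with a toy model in Python. Nothing below is Solr
or Lucene code: the synonym table, analyze and parse_then_analyze are
simplified stand-ins invented for illustration, and the toy synonym step
replaces tokens where a real synonym filter may instead expand them.

```python
# Toy multi-term synonym table: token pair -> replacement token.
SYNONYMS = {("sea", "biscuit"): "seabiscuit"}


def analyze(text):
    """Toy analyzer: lowercase, split on whitespace, apply two-token synonyms."""
    tokens = text.lower().split()
    out = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in SYNONYMS:
            out.append(SYNONYMS[pair])
            i += 2  # both tokens were consumed by the synonym
        else:
            out.append(tokens[i])
            i += 1
    return out


def parse_then_analyze(query):
    """Mimic a whitespace-splitting parser: analyze each chunk separately."""
    tokens = []
    for chunk in query.split():
        tokens.extend(analyze(chunk))
    return tokens


if __name__ == "__main__":
    print(analyze("sea biscuit"))             # analyzer sees both words
    print(parse_then_analyze("sea biscuit"))  # analyzer sees one word at a time
```

When the analyzer receives the whole string, the synonym matches; when each
word arrives alone, there is never a pair to match against.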

My colleague John Berryman wrote a pretty good blog post on this
http://opensourceconnections.com/blog/2013/10/27/why-is-multi-term-synonyms-so-hard-in-solr/

There are several solutions out there that attempt to address this problem.
One is from Ted Sullivan at Lucidworks:
https://lucidworks.com/blog/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/

Another popular one is the hon-lucene-synonyms plugin:
http://lucene.apache.org/solr/5_1_0/solr-core/org/apache/solr/search/FieldQParserPlugin.html

Yet another work-around is to use the field query parser:
http://lucene.apache.org/solr/5_1_0/solr-core/org/apache/solr/search/FieldQParserPlugin.html

I also tend to write my own query parsers. On the one hand it's annoying
that query parsers have the problems above; on the flip side, Solr makes it
very easy to implement whatever parsing you think is appropriate with a
small bit of Java/Lucene knowledge.

Hopefully that explanation wasn't too deep, but it's an important thing to
know about Solr. Are you asking out of curiosity, or do you have a specific
problem?

Thanks
-Doug


Re: analyzer, indexAnalyzer and queryAnalyzer

kaushik A
Hi Doug,

Nice explanation of the query parsers. If you get a chance, can you please
take a quick look at the issue I am facing with multi-term synonyms as
well?
http://lucene.472066.n3.nabble.com/Mutli-term-synonyms-tt4200960.html#none
is the problem I am facing. I am now able to perform multi-term searches on
most phrases, barring the ones which contain characters that are special in
Solr, i.e. [], etc.

Your help is much appreciated.

Thanks,
Kaushik


Re: analyzer, indexAnalyzer and queryAnalyzer

Dan Davis-2
In reply to this post by Doug Turnbull
Hi Doug, nice write-up and 2 questions:

- You write your own QParser plugins - can one keep the features of edismax
for field boosting/phrase-match boosting by subclassing edismax?   Assuming
yes...

- What do pf2 and pf3 do in the edismax query parser?

Corrected links for the hon-lucene-synonyms plugin:

http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
https://github.com/healthonnet/hon-lucene-synonyms



Re: analyzer, indexAnalyzer and queryAnalyzer

Doug Turnbull
> - You write your own QParser plugins - can one keep the features of edismax
> for field boosting/phrase-match boosting by subclassing edismax?   Assuming
> yes...

hon-lucene-synonyms does this, but largely by copy-pasting the code (sorry
about the broken link!)

pf2 and pf3 take the query "hello my name is doug" and chop it up into
two-word phrase searches and three-word phrase searches respectively.

For example, q=hello my name is doug&pf2=title body produces

title:"hello my" title:"my name" title:"name is" ... body:"hello my" and so
on

pf3 does the same for three-word phrases.
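
The chopping itself is just a sliding window over the words. Here is a small
Python sketch of it; phrase_shingles is a name made up for illustration, and
the way real edismax orders, groups and boosts these clauses will differ from
this flat list.

```python
def phrase_shingles(query, fields, size):
    """Return field:"phrase" clauses for every run of `size` adjacent words."""
    words = query.split()
    clauses = []
    for field in fields:
        # Slide a window of `size` words across the query.
        for i in range(len(words) - size + 1):
            phrase = " ".join(words[i:i + size])
            clauses.append(f'{field}:"{phrase}"')
    return clauses


if __name__ == "__main__":
    query = "hello my name is doug"
    print(phrase_shingles(query, ["title", "body"], 2))  # pf2-style pairs
    print(phrase_shingles(query, ["title", "body"], 3))  # pf3-style triples
```

A query shorter than the window size yields no clauses at all.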

-Doug





On Thu, Apr 30, 2015 at 10:58 AM, Dan Davis <[hidden email]> wrote:

> Hi Doug, nice write-up and 2 questions:
>
> - You write your own QParser plugins - can one keep the features of edismax
> for field boosting/phrase-match boosting by subclassing edismax?   Assuming
> yes...
>
> - What do pf2 and pf3 do in the edismax query parser?
>
> hon-lucene-synonyms plugin links corrections:
>
> http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
> https://github.com/healthonnet/hon-lucene-synonyms
>
>
> On Wed, Apr 29, 2015 at 9:24 PM, Doug Turnbull <
> [hidden email]> wrote:
>
> > So Solr has the idea of a query parser. The query parser is a convenient
> > way of passing a search string to Solr and having Solr parse it into
> > underlying Lucene queries: You can see a list of query parsers here
> > http://wiki.apache.org/solr/QueryParser
> >
> > What this means is that the query parser does work to pull terms into
> > individual clauses *before* analysis is run. It's a parsing layer that
> sits
> > outside the analysis chain. This creates problems like the "sea biscuit"
> > problem, whereby we declare "sea biscuit" as a query time synonym of
> > "seabiscuit". As you may know synonyms are checked during analysis.
> > However, if the query parser splits up "sea" from "biscuit" before
> running
> > analysis, the query time analyzer will fail. The string "sea" is brought
> by
> > itself to the query time analyzer and of course won't match "sea
> biscuit".
> > Same with the string "biscuit" in isolation. If the full string "sea
> > biscuit" was brought to the analyzer, it would see [sea] next to
> [biscuit]
> > and declare it a synonym of seabiscuit. Thanks to the query parser, the
> > analyzer has lost the association between the terms, and both terms
> aren't
> > brought together to the analyzer.
> >
> > My colleague John Berryman wrote a pretty good blog post on this
> >
> >
> http://opensourceconnections.com/blog/2013/10/27/why-is-multi-term-synonyms-so-hard-in-solr/
> >
> > There's several solutions out there that attempt to address this problem.
> > One from Ted Sullivan at Lucidworks
> >
> >
> https://lucidworks.com/blog/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> >
> > Another popular one is the hon-lucene-synonyms plugin:
> >
> >
> http://lucene.apache.org/solr/5_1_0/solr-core/org/apache/solr/search/FieldQParserPlugin.html
> >
> > Yet another work-around is to use the field query parser:
> >
> >
> http://lucene.apache.org/solr/5_1_0/solr-core/org/apache/solr/search/FieldQParserPlugin.html
> >
> > I also tend to write my own query parsers, so on the one hand its
> annoying
> > that query parsers have the problems above, on the flipside Solr makes it
> > very easy to implement whatever parsing you think is appropriatte with a
> > small bit of Java/Lucene knowledge.
> >
> > Hopefully that explanation wasn't too deep, but its an important thing to
> > know about Solr. Are you asking out of curiosity, or do you have a
> specific
> > problem?
> >
> > Thanks
> > -Doug




RE: analyzer, indexAnalyzer and queryAnalyzer

Davis, Daniel (NIH/NLM) [C]
Thank you.

-----Original Message-----
From: Doug Turnbull [mailto:[hidden email]]
Sent: Thursday, April 30, 2015 11:33 AM
To: [hidden email]; Dan Davis
Subject: Re: analyzer, indexAnalyzer and queryAnalyzer

> - You write your own QParser plugins - can one keep the features of edismax
> for field boosting/phrase-match boosting by subclassing edismax?   Assuming
> yes...

hon-lucene-synonyms does this, but largely by copy pasting the code (sorry about the broken link!)

pf2 and pf3 take the query "hello my name is doug" and chop it up into two word phrase searches and three word phrase searches respectively.

For example, q=hello my name is doug&pf2=title body produces phrase clauses like:

title:"hello my" title:"my name" title:"name is" ... body:"hello my" and so on

pf3 does the same for three word phrases.
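
As a rough illustration of the chopping described above, here is a small Python sketch. It assumes plain whitespace splitting and is not how edismax is implemented internally; the function names phrase_shingles and phrase_clauses are hypothetical, made up for this example.

```python
def phrase_shingles(query, size):
    """Return every run of `size` adjacent words in `query` (whitespace split)."""
    words = query.split()
    # range() is empty when the query has fewer than `size` words
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


def phrase_clauses(query, fields, size):
    """Build field:"phrase" clauses for each shingle, one field at a time."""
    return ['%s:"%s"' % (field, shingle)
            for field in fields
            for shingle in phrase_shingles(query, size)]


# pf2-style (two-word) and pf3-style (three-word) clauses for the example query
pf2 = phrase_clauses("hello my name is doug", ["title", "body"], 2)
pf3 = phrase_clauses("hello my name is doug", ["title", "body"], 3)
print(" ".join(pf2))
print(" ".join(pf3))
```

A query shorter than the shingle size simply yields no clauses in this sketch.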

-Doug





On Thu, Apr 30, 2015 at 10:58 AM, Dan Davis <[hidden email]> wrote:

> Hi Doug, nice write-up and 2 questions:
>
> - You write your own QParser plugins - can one keep the features of edismax
> for field boosting/phrase-match boosting by subclassing edismax?   Assuming
> yes...
>
> - What do pf2 and pf3 do in the edismax query parser?
>
> hon-lucene-synonyms plugin links corrections:
>
> http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/
> https://github.com/healthonnet/hon-lucene-synonyms




Re: analyzer, indexAnalyzer and queryAnalyzer

Steven White
In reply to this post by Doug Turnbull
Thanks Doug.  This is extremely helpful.  It is much appreciated that you
took the time to write it all.

Do we have a Solr / Lucene wiki with such "did you know?" write ups?  If
not, just having this kind of knowledge in an email isn't good enough as it
won't be as searchable as a wiki.

Steve

On Wed, Apr 29, 2015 at 9:24 PM, Doug Turnbull <
[hidden email]> wrote:

> So Solr has the idea of a query parser. The query parser is a convenient
> way of passing a search string to Solr and having Solr parse it into
> underlying Lucene queries: You can see a list of query parsers here
> http://wiki.apache.org/solr/QueryParser
>
> What this means is that the query parser does work to pull terms into
> individual clauses *before* analysis is run. It's a parsing layer that sits
> outside the analysis chain. This creates problems like the "sea biscuit"
> problem, whereby we declare "sea biscuit" as a query time synonym of
> "seabiscuit". As you may know synonyms are checked during analysis.
> However, if the query parser splits up "sea" from "biscuit" before running
> analysis, the query time analyzer will fail. The string "sea" is brought by
> itself to the query time analyzer and of course won't match "sea biscuit".
> Same with the string "biscuit" in isolation. If the full string "sea
> biscuit" was brought to the analyzer, it would see [sea] next to [biscuit]
> and declare it a synonym of seabiscuit. Thanks to the query parser, the
> analyzer has lost the association between the terms, and both terms aren't
> brought together to the analyzer.
>
> My colleague John Berryman wrote a pretty good blog post on this
>
> http://opensourceconnections.com/blog/2013/10/27/why-is-multi-term-synonyms-so-hard-in-solr/
>
> There are several solutions out there that attempt to address this problem.
> One from Ted Sullivan at Lucidworks
>
> https://lucidworks.com/blog/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
>
> Another popular one is the hon-lucene-synonyms plugin:
>
> http://lucene.apache.org/solr/5_1_0/solr-core/org/apache/solr/search/FieldQParserPlugin.html
>
> Yet another work-around is to use the field query parser:
>
> http://lucene.apache.org/solr/5_1_0/solr-core/org/apache/solr/search/FieldQParserPlugin.html
>
> I also tend to write my own query parsers, so on the one hand it's annoying
> that query parsers have the problems above; on the flip side, Solr makes it
> very easy to implement whatever parsing you think is appropriate with a
> small bit of Java/Lucene knowledge.
>
> Hopefully that explanation wasn't too deep, but it's an important thing to
> know about Solr. Are you asking out of curiosity, or do you have a specific
> problem?
>
> Thanks
> -Doug

Re: analyzer, indexAnalyzer and queryAnalyzer

Shawn Heisey-2
On 5/4/2015 6:29 AM, Steven White wrote:
> Thanks Doug.  This is extremely helpful.  It is much appreciated that you
> took the time to write it all.
>
> Do we have a Solr / Lucene wiki with such "did you know?" write ups?  If
> not, just having this kind of knowledge in an email isn't good enough as it
> won't be as searchable as a wiki.

There is a community-editable wiki.  If you want write permission, just
create an account on that wiki and let us know (either here or on the
#solr IRC channel) what your username is, and we can get you added to
the contributors group.

https://wiki.apache.org/solr

The Apache Solr Reference Guide is kept on another wiki system, but only
committers can edit that wiki, because it is released as official
documentation.  Community users can comment on its pages if they have
suggestions.

https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide

Everything that happens on both wikis is visible to anyone who
subscribes to the commits mailing list, so if there is good information
available that should go into the official documentation, editing the
community wiki or commenting on the reference guide is usually enough to
make the committers aware of it.

You can find information on the various mailing lists here:

https://lucene.apache.org/core/discussion.html
https://lucene.apache.org/solr/resources.html#mailing-lists

Thanks,
Shawn