Search for "library" returns 0 results, but search for "marion library" returns many results

Sean Adams-Hiett-2
This is cross-posted on Drupal.org: http://drupal.org/node/1515046

Summary: I have a fairly clean install of Drupal 7 with
Apachesolr-1.0-beta18. I have created a content type called document with a
number of fields. I am working with 30k+ records, most of which are related
to "Marion, IA" in some way. A search for "library" (without the quotes)
returns no results, while a search for "marion library" returns thousands
of results. That doesn't make any sense to me at all.

Details:
- Drupal 7 (latest stable version)
- Apachesolr-1.0-beta18
- Custom content type with many fields
- LAMP stack running on a CentOS Linode
- PHP 5.2.x

I also ran the same searches through the Solr admin interface and got
similar results, so I can't rule out that something is configured wrong.
Since I am using the solrconfig.xml and schema.xml files provided with the
module, the issue could lie in those files as well. I watched the logs, and
during the searches that should produce results but don't, there is no
output besides the regular [INFO] line about the query.

I am stumped and I am past a deadline with this project, so any help would
be greatly appreciated.

--
Sean Adams-Hiett
Director of Development
The Advantage Companies
[hidden email]
www.advantage-companies.com

RE: Search for "library" returns 0 results, but search for "marion library" returns many results

Joshua Sumali
Did you try to append &debugQuery=on to get more information?
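
For reference, a minimal sketch of building such a request in Python. `debugQuery` is a standard Solr request parameter; the host, port, and path below are assumptions for a default local install and the helper name is made up for illustration, so adjust them to your setup.

```python
from urllib.parse import urlencode

# Assumed location of the Solr select handler; change the host, port, and
# path to match your own install.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_debug_url(query, base_url=SOLR_SELECT_URL):
    """Return a select URL for `query` with debug output switched on."""
    params = {"q": query, "debugQuery": "on"}
    return base_url + "?" + urlencode(params)


# Print the two URLs to compare in a browser or with curl.
print(build_debug_url("library"))
print(build_debug_url("marion library"))
```

Fetching both URLs and comparing the `debug` sections side by side shows how each query string was parsed.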


Re: Search for "library" returns 0 results, but search for "marion library" returns many results

Ravish Bhagdev
Yes, can you check whether the results you get with "marion library" match
on marion or on library? By default Solr uses OR between words (specified in
solrconfig.xml). You can also easily check this by enabling highlighting.

Ravish


Re: Search for "library" returns 0 results, but search for "marion library" returns many results

Sean Adams-Hiett
Here are some of the XML results with the debug on:

<response>
<result name="response" numFound="0" start="0"/>
<lst name="highlighting"/>
<lst name="debug">
<str name="rawquerystring">library</str>
<str name="querystring">library</str>
<str name="parsedquery">
+DisjunctionMaxQuery((content:librari)~0.01)
DisjunctionMaxQuery((content:librari^2.0)~0.01)
</str>
<str name="parsedquery_toString">+(content:librari)~0.01
(content:librari^2.0)~0.01</str>
<lst name="explain"/>
<str name="QParser">DisMaxQParser</str>
<null name="altquerystring"/>
<null name="boostfuncs"/>
<lst name="timing">
<double name="time">0.0</double>
<lst name="prepare">
<double name="time">0.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.SpellCheckComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">0.0</double>
</lst>
</lst>
<lst name="process">
<double name="time">0.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.SpellCheckComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">0.0</double>
</lst>
</lst>
</lst>
</lst>
</response>

It looks like somehow the query is getting converted from "library" to
"librari". Any idea how that would happen?
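
A minimal sketch of pulling the interesting fields out of a debug response like the one above, using only the Python standard library. The sample is a trimmed copy of that response (timing section omitted), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the debug response shown above (timing section omitted).
SAMPLE = """<response>
<result name="response" numFound="0" start="0"/>
<lst name="debug">
<str name="rawquerystring">library</str>
<str name="parsedquery">
+DisjunctionMaxQuery((content:librari)~0.01)
DisjunctionMaxQuery((content:librari^2.0)~0.01)
</str>
<str name="QParser">DisMaxQParser</str>
</lst>
</response>"""


def summarize_debug(xml_text):
    """Return the hit count and the key debug strings from a Solr response."""
    root = ET.fromstring(xml_text)
    result = root.find("result[@name='response']")
    debug = root.find("lst[@name='debug']")

    def debug_str(name):
        # Look up one <str name="..."> entry and collapse its whitespace.
        node = debug.find("str[@name='%s']" % name)
        if node is None or node.text is None:
            return None
        return " ".join(node.text.split())

    return {
        "numFound": int(result.get("numFound")),
        "rawquerystring": debug_str("rawquerystring"),
        "parsedquery": debug_str("parsedquery"),
        "QParser": debug_str("QParser"),
    }


for key, value in summarize_debug(SAMPLE).items():
    print(key, "=", value)
```

Comparing `rawquerystring` with `parsedquery` makes the rewritten term and the field being searched easy to spot.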

Sean


--
Sean Adams-Hiett
Owner, Web Geeks For Hire
phone: (361) 433.5748
email: [hidden email]
twitter: @geekbusiness <http://twitter.com/geekbusiness>

Re: Search for "library" returns 0 results, but search for "marion library" returns many results

Erik Hatcher-4
> It looks like somehow the query is getting converted from "library" to
> "librari". Any idea how that would happen?

Yeah, that happens from having stemming involved in your query-time analysis (look at your field type, you've surely got Snowball in there).
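
To illustrate the rewrite, here is a sketch of just one rule from the English Snowball (Porter2) algorithm, the one that turns a terminal "y" into "i". It is a simplified illustration, not the full stemmer and not Solr's code; the function name is made up. If the field type applies the same filter at index time, the indexed terms are stemmed the same way, so the rewritten term on its own would not normally prevent a match.

```python
VOWELS = set("aeiouy")


def y_to_i(word):
    """Apply only the 'terminal y -> i' rule of the English Snowball stemmer.

    A final 'y' becomes 'i' when the letter before it is a non-vowel and
    that letter is not the first letter of the word.
    """
    if len(word) > 2 and word.endswith("y") and word[-2] not in VOWELS:
        return word[:-1] + "i"
    return word


for term in ("library", "cry", "say", "by"):
    print(term, "->", y_to_i(term))
```

The last two words are left unchanged because the "y" follows a vowel ("say") or follows the first letter of the word ("by").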

Also, you're using the dismax query parser, which has many knobs and dials, and this is why things aren't matching as you'd expect. You'll want to tinker with some of those settings, especially if you need to query multiple fields with varying weights.
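
As a rough sketch of those knobs: `defType`, `qf`, `pf`, `mm`, and `tie` are standard dismax request parameters. The values for `qf`, `pf`, and `tie` below are guesses read off the parsed query in the debug output, not the module's actual configuration, and the helper name, the `mm` value, and the `title` field are hypothetical.

```python
from urllib.parse import urlencode


def build_dismax_params(query, qf, pf=None, mm=None, tie=None):
    """Collect dismax request parameters, leaving out any that are unset."""
    params = {"q": query, "defType": "dismax", "qf": qf}
    if pf is not None:
        params["pf"] = pf    # phrase-boost fields
    if mm is not None:
        params["mm"] = mm    # minimum number of optional clauses that must match
    if tie is not None:
        params["tie"] = tie  # tie breaker between per-field scores
    return params


# Single-field setup resembling the parsed query in the debug output.
single = build_dismax_params("library", qf="content", pf="content^2.0", tie="0.01")

# Hypothetical multi-field setup with different weights per field.
multi = build_dismax_params("marion library", qf="title^2.0 content", mm="1")

print(urlencode(single))
print(urlencode(multi))
```

Changing `qf` to include the fields that actually hold the text, and adjusting `mm`, are usually the first settings to experiment with.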

        Erik




Re: Search for "library" returns 0 results, but search for "marion library" returns many results

Sean Adams-Hiett
Thanks for all the replies on this. It turns out that I wasn't getting the
expected results because one of the fields was not being indexed properly:
the content type display settings for that field were set to hidden in
Drupal. After I corrected this and re-indexed, I started getting the
expected results.

Thanks again for all the responses!

Sean
