Strange "the" when search with dismax

Strange "the" when search with dismax

marship
Hi all,
   I am using Solr dismax to search the books in my DB. They are all indexed in Solr.
   Today I noticed a problem.
It started when I searched for the book "The Girl Who Kicked the Hornet's Nest" and nothing was returned. I'm sure I have this book in the DB. So I stripped out keywords one at a time and finally found that when I search for "the girl who kicked hornet's nest", I get the book.
Then I tested some more.
When I search for "the first world war", Solr returns the book successfully.
But when I search for "the first world war the", Solr returns NOTHING!


So strange!
So the issue is: if "the" appears twice in the query keywords, Solr/dismax simply returns nothing!


Why is this happening?


Please help.
Thanks.
Regards.
Scott



Re: Strange "the" when search with dismax

Grijesh
This post has NOT been accepted by the mailing list yet.
This type of problem occurs when the analysis at index time differs from the analysis at query time. Please also post the analysis you are doing on that field.
Thanks,
Grijesh
www.gettinhahead.co.in
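
To illustrate the index-time versus query-time point, here is a toy Python sketch. It is not Solr code: both analyzer functions are hypothetical stand-ins for the analyzer chains in schema.xml, and the only difference between them (lowercasing) is an assumption chosen to show the effect.

```python
def index_analyzer(text):
    # Hypothetical index-time chain: split on whitespace, then lowercase.
    return [tok.lower() for tok in text.split()]


def query_analyzer(text):
    # Hypothetical query-time chain: split on whitespace only, no lowercasing.
    return text.split()


def matches(indexed_value, query):
    # A document matches only if every query token equals an indexed token.
    indexed = set(index_analyzer(indexed_value))
    return all(tok in indexed for tok in query_analyzer(query))


title = "The First World War"
lower_hit = matches(title, "first world war")
mixed_hit = matches(title, "First World War")
print(lower_hit, mixed_hit)
```

With these assumed chains the lowercase query matches and the capitalized one does not, because "First" was indexed as "first" and the query side never lowercases it.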
Re: Strange "the" when search with dismax

kenf_nc
In reply to this post by marship
Sounds like you want the 'text' fieldType (or equivalent) but are using 'string' or 'lowercase'. Those must match the whole field value exactly (well, case-insensitively in the case of 'lowercase'). The TextField-based field types (like 'text') tokenize, so matches will occur under many more conditions.
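
As a rough illustration of that difference, the following Python sketch models the three behaviors. It is a simplification under stated assumptions, not Solr's implementation: the function names are made up, the real 'text' type usually applies more filters than whitespace splitting and lowercasing, and requiring every query token to match is an assumption.

```python
def string_match(stored, query):
    # Models a 'string' field: the whole value must be identical.
    return stored == query


def lowercase_match(stored, query):
    # Models a 'lowercase' field: whole-value match, ignoring case.
    return stored.lower() == query.lower()


def text_match(stored, query):
    # Models a tokenized 'text' field: each query token must appear
    # somewhere among the stored tokens.
    stored_tokens = set(stored.lower().split())
    return all(tok in stored_tokens for tok in query.lower().split())


stored_title = "The First World War"
partial_query = "first world war"
results = {
    "string": string_match(stored_title, partial_query),
    "lowercase": lowercase_match(stored_title, partial_query),
    "text": text_match(stored_title, partial_query),
}
print(results)
```

Only the tokenized model matches the partial query. The whole-value models match only when the query is the complete title.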
Re: Strange "the" when search with dismax

Jonathan Rochkind
In reply to this post by marship
"the" sounds like it might be a stopword. Are you using stopwords in some
of the fields covered by the dismax search, but not in the other fields
covered by dismax? The combination of dismax and stopwords can result in
unexpected behavior if you aren't careful.

I wrote about this a bit here, you might find it helpful:
http://bibwild.wordpress.com/2010/04/14/solr-stop-wordsdismax-gotcha/
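
The following toy Python model shows one way this combination could produce the reported symptoms. It is a sketch under assumptions, not the poster's configuration and not Solr's code: the field names `title` and `author` are hypothetical, "the" is assumed to be a stopword on `title` only, and the `mm` value is assumed to be a conditional spec of the kind used in Solr's example dismax configuration. The `min_should_match` helper is a simplified model that handles only conditional specs of this form.

```python
STOPWORDS = {"the"}  # assumption: "the" is in the stopword list


def analyze_title(text):
    # Hypothetical 'title' field: lowercase, split, remove stopwords.
    return [t for t in text.lower().split() if t not in STOPWORDS]


def analyze_author(text):
    # Hypothetical 'author' field: lowercase and split, no stopword removal.
    return text.lower().split()


ANALYZERS = {"title": analyze_title, "author": analyze_author}
MM_SPEC = "2<-1 5<-2 6<90%"  # assumed mm value, not taken from the thread


def min_should_match(clause_count, spec):
    # Simplified model of a conditional mm spec ("N<value" parts only).
    result = clause_count
    for part in spec.split():
        upper, value = part.split("<")
        if clause_count <= int(upper):
            break
        if value.endswith("%"):
            result = clause_count * int(value[:-1]) // 100
        else:
            n = int(value)
            result = clause_count + n if n < 0 else n
    return max(0, min(result, clause_count))


def dismax_matches(doc, query, mm_spec=MM_SPEC):
    matched = clauses = 0
    for term in query.split():
        # One clause per query term; each field analyzes the term itself.
        per_field = {f: analyze(term) for f, analyze in ANALYZERS.items()}
        if not any(per_field.values()):
            continue  # every field dropped the term, so no clause is built
        clauses += 1
        if any(tok in ANALYZERS[f](doc[f])
               for f, toks in per_field.items() for tok in toks):
            matched += 1
    return matched >= min_should_match(clauses, mm_spec)


# Made-up sample documents; the author value is a placeholder.
girl = {"title": "The Girl Who Kicked the Hornet's Nest", "author": "Example Author"}
war = {"title": "The First World War", "author": "Example Author"}

print(dismax_matches(girl, "the girl who kicked the hornet's nest"))
print(dismax_matches(girl, "the girl who kicked hornet's nest"))
print(dismax_matches(war, "the first world war"))
print(dismax_matches(war, "the first world war the"))
```

In this model each "the" still produces a clause, because the `author` field keeps it, but that clause can never match: `title` has dropped the word and no author is named "the". One unmatched clause stays within what the assumed `mm` tolerates at these query lengths, while two do not, which is consistent with the behavior described in the original post. Whether this is the actual cause depends on the real schema and `mm` setting.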

Re: Strange "the" when search with dismax

Erick Erickson
If the other suggestions don't work, you need to show us the relevant
portions of your schema.xml, and probably the query output with
&debugQuery=on tacked on...

Here are some pointers for getting help...

http://wiki.apache.org/solr/UsingMailingLists

Best
Erick
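
For illustration, a request with debug output enabled could be built like this in Python. This is only a sketch: the host, port and path are assumed to be those of a default local Solr install, the `qf` field names are hypothetical, and the code only constructs the URL without sending it.

```python
from urllib.parse import urlencode

# Assumed location of the select handler on a default local install.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"

params = {
    "defType": "dismax",             # use the dismax query parser
    "q": "the first world war the",  # the failing query from this thread
    "qf": "title author",            # hypothetical field list
    "debugQuery": "on",              # ask Solr to include debug output
}

debug_url = SOLR_SELECT_URL + "?" + urlencode(params)
print(debug_url)
```

The debug section of the response shows the parsed query, which makes it possible to see which clauses were built for each field.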

2010/7/14 Jonathan Rochkind <[hidden email]>

> "the" sounds like it might be a stopword. Are you using stopwords in any
> of your fields covered by the dismax search? But not in some of the
> other fields covered by dismax? the combination of dismax and stopwords
> can result in unexpected behavior if you aren't careful.
>
> I wrote about this a bit here, you might find it helpful:
> http://bibwild.wordpress.com/2010/04/14/solr-stop-wordsdismax-gotcha/