Site search upsells & boosting by content type


Adoleo
Good afternoon!

I was in the IRC room earlier this morning with a problem, and I'm still having difficulty with it.  I'm trying to do a site search upsell so that sponsored results can be highlighted and boosted to the top of the results. I need to have my default operator set to AND, because if it is set to OR I get rather unpredictable results.  For example, out of an index of 300k items, a search for "old 97's" yields about 116k because they have the words "old" or "97" in them.  With the default operator set to AND, I get 54 results, which would be the expected behaviour for the user who wants to find articles and events about that band.

Unfortunately, I can't boost certain queries with the default operator set to AND, because it adds those terms as a required clause to the search.  I need the boosted terms to be an optional clause. I'm trying to do what the docs talk about here: http://wiki.apache.org/solr/SolrRelevancyCookbook#Boosting_Ranking_Terms

So, my example search is "+(old 97's) id:events.event.88468^100" - which should search for the old 97's and optionally boost that individual event if it is part of the search results. When I run that search with the default operator set to AND, it is parsed into '+(+text:old +PhraseQuery(text:"97 s")) +id:events.event.88468^100.0' - making the particular event a required component of the search and returning only that 1 result.

When I alter my search to "+(old 97's) || id:events.event.88468^100", it parses that to "(+text:old +text:"97 s") id:events.event.88468^100.0", which at first appeared to do what I wanted.  With just "old 97's", I get 54 results.  With  "+(old 97's) || id:events.event.88468^100", I get 54 results with that particular event on top.  However, if I try to boost another term, such as "+(old 97's) || granada^100" - I get over 300 results because it adds in all of the matches for the word "granada".  This is not what I want.  Instead of AND or OR, I want AND MAYBE.

This is supported by the Xapian backend that I'm switching from, so I'm really hoping there's a way to do this in Solr.  Thank you very much for any help you can provide!

-Brandon

Re: Site search upsells & boosting by content type

hossman


: 54 results with that particular event on top.  However, if I try to
: boost another term, such as "+(old 97's) || granada^100" - I get over
: 300 results because it adds in all of the matches for the word

In Solr/Lucene, the keywords of "AND" and "OR" are really just syntactic
sugar for making two clauses mandatory or optional -- which means that
something like this...

        +FOO || BAR

...causes a "FOO" clause to be created which is mandatory, and then a
BAR clause is created and both the FOO and BAR clause are set to optional.
(because of the binary OR specified by "||")

You can see all of this if you look at the parsedquery in the
debugQuery=true output.
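
For illustration, here is a minimal Python sketch of asking for that debug
output. The base URL, the wt=json setting, and the layout of the sample
response are assumptions, not something taken from this thread; the sample
dict is a hand-written stand-in that reuses the parsed query quoted
earlier, not real server output.

```python
from urllib.parse import urlencode

# Assumed location of the select handler; adjust for your installation.
BASE_URL = "http://localhost:8983/solr/select"

def debug_url(query, base_url=BASE_URL):
    """Build a select URL that asks Solr to include debug information."""
    params = {"q": query, "debugQuery": "true", "wt": "json"}
    return base_url + "?" + urlencode(params)

def parsed_query(response):
    """Pull the parsed query out of a decoded JSON response, if present.

    Assumes the debug section sits under a top-level "debug" key.
    """
    return response.get("debug", {}).get("parsedquery")

url = debug_url("+FOO || BAR")

# Hand-written stand-in for a decoded response (not real server output).
sample_response = {
    "debug": {
        "parsedquery": '(+text:old +text:"97 s") id:events.event.88468^100.0'
    }
}
print(url)
print(parsed_query(sample_response))
```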

The sucky part of overriding the "default operator" is that when you set
it to "AND" there is no syntax to force a clause to be "optional" .. which
is why I recommend *never* changing the default operator, and using "+"
to denote when you want to make things mandatory.

: "granada".  This is not what I want.  Instead of AND or OR, I want AND
: MAYBE.

In Lucene/Solr there is (really) no "AND" or "OR" or "AND MAYBE" .. there
are just "MANDATORY" "PROHIBITED" and "OPTIONAL" ... in the expression
"+FOO BAR" FOO becomes MANDATORY and BAR becomes optional, which is
equivalent to "AND MAYBE" in other parsers.
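
To make the three clause types concrete, here is a toy Python model. It is
only a sketch: the function names, the sample documents, and the additive
scoring are invented for illustration and are not how Lucene actually
scores. It shows that in "+FOO BAR" the optional BAR clause changes the
ranking but not the set of matching documents.

```python
def matches(doc_terms, mandatory=(), optional=(), prohibited=()):
    """Return True if a document (a set of terms) satisfies the clauses."""
    if any(t in doc_terms for t in prohibited):
        return False
    if mandatory:
        # Optional clauses never decide matching once a mandatory one exists.
        return all(t in doc_terms for t in mandatory)
    # With no mandatory clauses, at least one optional clause must match.
    return any(t in doc_terms for t in optional)

def score(doc_terms, mandatory=(), optional=(), boosts=None):
    """Toy score: sum the boost (default 1.0) of every clause that matches."""
    boosts = boosts or {}
    return sum(boosts.get(t, 1.0)
               for t in (*mandatory, *optional) if t in doc_terms)

# Hypothetical documents, each reduced to a set of terms.
docs = {"a": {"foo"}, "b": {"foo", "bar"}, "c": {"bar"}}

# "+FOO BAR^100": FOO is mandatory, BAR is optional and boosted.
hits = [d for d, terms in docs.items()
        if matches(terms, mandatory=["foo"], optional=["bar"])]
ranked = sorted(hits,
                key=lambda d: score(docs[d], ["foo"], ["bar"], {"bar": 100.0}),
                reverse=True)

# "FOO BAR": both clauses optional, so any document with either term matches.
either = [d for d, terms in docs.items()
          if matches(terms, optional=["foo", "bar"])]

print(hits, ranked, either)
```

In this model document "c" contains only BAR, so it never enters the
result set for "+FOO BAR", while document "b" is pushed above "a".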

Nine times out of ten, when people are asking questions like this, the
best answer is:

  1) use the dismax parser
  2) put the input from your user in the q param
  3) set the mm param to 100%
  4) put the boost query you want to use in the bq param
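
The four steps above could be sketched as the following request, written
in Python. The base URL and the choice of "text" for the qf parameter are
assumptions (the field name is taken from the parsed queries quoted
earlier in the thread), and selecting the parser through defType is
likewise an assumption about how your Solr version and request handler
are configured.

```python
from urllib.parse import urlencode

# Assumed location of the select handler; adjust for your installation.
BASE_URL = "http://localhost:8983/solr/select"

def build_dismax_url(user_input, boost_query, base_url=BASE_URL):
    """Build a dismax request: all user terms required, boost optional."""
    params = {
        "defType": "dismax",  # 1) use the dismax parser
        "q": user_input,      # 2) raw user input goes in q
        "qf": "text",         # field(s) to search; an assumption here
        "mm": "100%",         # 3) every term in q must match
        "bq": boost_query,    # 4) boost query, affects ranking only
    }
    return base_url + "?" + urlencode(params)

url = build_dismax_url("old 97's", "id:events.event.88468^100")
print(url)
```

Because the boost lives in bq rather than in q, it should influence
scoring without adding documents that fail to match the user's terms.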


-Hoss