Query time boosting with dismax

Grease
Hi,

Is it possible to weight specific query terms with the Dismax query parser?
Can I write queries of the sort ...
field1:(term1)^2.0 + (term2^3.0)
with dismax?

Thanks,
Girish Redekar
http://girishredekar.net

Re: Query time boosting with dismax

Otis Gospodnetic-2
Terms no, but fields (with terms) and phrases, yes.
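
A minimal sketch of what field and phrase boosting could look like as request parameters, assuming the standard dismax parameters qf (per-field boosts) and pf (phrase boosts). The field names and boost values are hypothetical and only serve as an illustration:

```python
from urllib.parse import urlencode

# Hypothetical field names and boost values, for illustration only.
params = {
    "defType": "dismax",
    "q": "term1 term2",             # plain user terms, no per-term boosts
    "qf": "field1^2.0 field2^0.5",  # boost per field for term matches
    "pf": "field1^3.0",             # extra boost when the terms match as a phrase
}

# URL-encode the parameters so they can be appended to a select URL.
query_string = urlencode(params)
print(query_string)
```

The boosts here attach to fields, not to individual terms of the user query, which is the distinction made in the reply above.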


Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



----- Original Message ----

> From: Girish Redekar <[hidden email]>
> To: [hidden email]
> Sent: Fri, December 4, 2009 11:42:16 PM
> Subject: Query time boosting with dismax

Re: Query time boosting with dismax

Uri Boness
You can actually define boost queries to do that (bq parameter). Boost
queries accept the standard Lucene query syntax and are eventually appended
to the user query. Just make sure that the default operator is set to OR,
otherwise these boost queries will not only influence the boosts but
also filter out some of the results.
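
As a rough sketch of the idea, a boost query can be passed as an extra bq request parameter written in Lucene syntax. The helper name boost_clause and the fields and terms below are hypothetical:

```python
from urllib.parse import urlencode, parse_qs

def boost_clause(field, term, boost):
    """Format one Lucene-syntax clause such as field2:term2^3.0."""
    return f"{field}:{term}^{boost}"

# Hypothetical fields and terms, for illustration only.
params = [
    ("defType", "dismax"),
    ("q", "term1"),
    ("qf", "field1"),
    ("bq", boost_clause("field2", "term2", 3.0)),
]

# URL-encode the parameters so they can be appended to a select URL.
query_string = urlencode(params)
print(query_string)
```

Note that this raises the score of documents matching field2:term2. It is not the same thing as weighting a term inside the user's own query.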


Re: Query time boosting with dismax

Erik Hatcher-4
Are you sure about the default operator and bq?  I assume we're  
talking about the setting in schema.xml.

I think boosting queries are OR'd in automatically to the main query:

 From DismaxQParser#addBoostQuery()
   ... query.add(f, BooleanClause.Occur.SHOULD);...

There is one case where query.add((BooleanClause) c); is used though.

        Erik


On Dec 5, 2009, at 6:54 AM, Uri Boness wrote:

> You can actually define boost queries to do that (bq parameter).  
> Boost queries accept the standard Lucene query syntax and eventually  
> appended to the user query. Just make sure that the default operator  
> is set to OR other wise these boost queries will not only influence  
> the boosts but also filter out some of the results.

Re: Query time boosting with dismax

Uri Boness
Well, this is mainly based on some experiments I did (not on the
code base). It appeared as if the boost queries were appended to the
generated dismax query, and if the default operator (in the schema) was
set to AND, they actually filtered the results. For example, here's a
dismax config:

<requestHandler name="dismax" class="solr.SearchHandler" default="true">
    <lst name="defaults">
        <str name="defType">dismax</str>
        <str name="qf">
            text^0.5 name^1.0 category^1.2
        </str>
        <str name="bq">
            category:Audio name:black
        </str>
        <str name="fl">
            *,score
        </str>
        ...
    </lst>
</requestHandler>

When searching with a default OR operator, you receive more results than
with an AND operator. Checking the generated query using
debugQuery=true reveals the following:

Generated query with default OR operator:
+DisjunctionMaxQuery((category:black^1.2 | text:black^0.5 |
name:black)~0.01) DisjunctionMaxQuery((category:black^1.5 |
text:black^0.5 | name:black^1.2)~0.01) category:Audio name:black
FunctionQuery((product(sint(rating),const(-1.0)))^0.5)

Generated query with default AND operator:
+DisjunctionMaxQuery((category:black^1.2 | text:black^0.5 |
name:black)~0.01) DisjunctionMaxQuery((category:black^1.5 |
text:black^0.5 | name:black^1.2)~0.01) +category:Audio +name:black
FunctionQuery((product(sint(rating),const(-1.0)))^0.5)

So with AND, both bq clauses are marked as MUST in the overall
query, which in turn filters the results. I would instead expect these
queries to be added as a single SHOULD clause, so that the generated
query would look like:
+DisjunctionMaxQuery((category:black^1.2 | text:black^0.5 |
name:black)~0.01) DisjunctionMaxQuery((category:black^1.5 |
text:black^0.5 | name:black^1.2)~0.01) (+category:Audio +name:black)
FunctionQuery((product(sint(rating),const(-1.0)))^0.5)
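
To illustrate why MUST clauses filter while SHOULD clauses only influence scoring, here is a toy model of boolean clause matching. It is not Lucene code: the Clause type, the whitespace tokenisation, and the sample document are assumptions made for the sketch, and the dismax part of the query is reduced to a single required name:black clause.

```python
from typing import NamedTuple

class Clause(NamedTuple):
    occur: str  # "MUST" or "SHOULD"
    field: str
    term: str

def clause_matches(doc, clause):
    """True if the term occurs as a whitespace-separated token in the field."""
    tokens = doc.get(clause.field, "").lower().split()
    return clause.term.lower() in tokens

def matches(doc, clauses):
    """A document is kept only if every MUST clause matches.

    When at least one MUST clause exists, SHOULD clauses are optional
    and would only affect the score (scoring is not modelled here).
    """
    must = [c for c in clauses if c.occur == "MUST"]
    should = [c for c in clauses if c.occur == "SHOULD"]
    if must:
        return all(clause_matches(doc, c) for c in must)
    return any(clause_matches(doc, c) for c in should)

# Hypothetical document: matches the user query "black" but is not in category Audio.
doc = {"name": "black cable", "category": "Video"}
user_query = [Clause("MUST", "name", "black")]

# bq clauses as generated with a default OR operator (optional clauses).
with_or = user_query + [Clause("SHOULD", "category", "Audio"),
                        Clause("SHOULD", "name", "black")]

# bq clauses as generated with a default AND operator (required clauses).
with_and = user_query + [Clause("MUST", "category", "Audio"),
                         Clause("MUST", "name", "black")]

print(matches(doc, with_or))
print(matches(doc, with_and))
```

In this model the document survives when the bq clauses are optional and is dropped when they are required, which mirrors the difference in result counts described above.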

Cheers,
Uri


Re: Query time boosting with dismax

Uri Boness
In reply to this post by Erik Hatcher-4
Checking further in the code, it seems that in most cases the boost
queries are indeed added as SHOULD clauses. But if you define *one* bq
parameter that contains a boolean query, then each clause of that
boolean query is added to the main query as is. Therefore:

This setup will filter the results:
<str name="bq">
        +category:Audio +name:black
</str>

This setup will *not* filter the results:
<str name="bq">
        +category:Audio
</str>
<str name="bq">
        +name:black
</str>

So, in the first setup, the default operator as defined in the schema
plays a role.
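
The behaviour described above can be modelled roughly as follows. This is a simplified sketch of the logic as reported in this thread, not the actual Solr source: the function name add_boost_queries, the tuple representation of clauses, and the rule that a parsed bq with more than one clause counts as a boolean query are all assumptions.

```python
def add_boost_queries(main_clauses, boost_queries):
    """Combine main query clauses with parsed bq values.

    main_clauses: list of (occur, text) tuples.
    boost_queries: one entry per bq parameter; each entry is a list of
    (occur, text) clauses produced by parsing that parameter.
    """
    combined = list(main_clauses)
    if len(boost_queries) == 1 and len(boost_queries[0]) > 1:
        # Single bq holding a boolean query: its clauses are added as is,
        # so any MUST clause becomes a top-level requirement.
        combined.extend(boost_queries[0])
    else:
        # Otherwise each bq is wrapped as one optional (SHOULD) clause.
        for bq in boost_queries:
            combined.append(("SHOULD", bq))
    return combined

def required(clauses):
    """Return the top-level clauses that every result must match."""
    return [c for c in clauses if c[0] == "MUST"]

main = [("MUST", "dismax(black)")]

# One bq parameter containing two required clauses.
single = add_boost_queries(
    main, [[("MUST", "category:Audio"), ("MUST", "name:black")]])

# Two bq parameters with one clause each.
split = add_boost_queries(
    main, [[("MUST", "category:Audio")], [("MUST", "name:black")]])

print(len(required(single)))
print(len(required(split)))
```

In the single-parameter case the two bq clauses end up as top-level requirements next to the main query, while in the split case only the main query is required.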

Cheers,
Uri
