OR query on multiple fields causes low coord

M. Mokotov
Hi,
 
I have a question about an OR query over multiple fields.

It seems that the more fields I split my documents into, the lower the
coord gets.
As a result, when I query the string S on many fields (a query like
F1:(S) F2:(S) ... Fn:(S) ), I get close-to-zero coords, which leads to
poor matching scores.
I assume (and forgive me for assuming) that the reason is that, in the call
coord( overlap, maxOverlap ), maxOverlap=|S|*n (where |S| is the number of
terms in S and n is the number of fields in the query).
 
Is there any way to avoid that?
Can I have the coord computed per field?
 
Thanks a lot for the help!
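
A rough illustration of the arithmetic behind the assumption in this
message, written as plain Java rather than against the Lucene API. It
assumes the query F1:(S) ... Fn:(S) becomes a top-level BooleanQuery with
one nested BooleanQuery per field, and that coord is the default
overlap / maxOverlap ratio. The names CoordDemo and combinedCoord are made
up for this sketch.

```java
class CoordDemo {
    // Default coord formula: the fraction of query clauses that matched.
    static float defaultCoord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // Combined coord factor for a document that matches termsMatched of the
    // termsInS terms of S in exactly one of numFields fields (assumed setup).
    static float combinedCoord(int termsMatched, int termsInS, int numFields) {
        float inner = defaultCoord(termsMatched, termsInS); // per-field subquery
        float outer = defaultCoord(1, numFields);           // top-level OR over fields
        return inner * outer;
    }

    public static void main(String[] args) {
        // Both terms of a two-term S match, but only in one field.
        for (int n : new int[] {1, 2, 5, 10}) {
            System.out.println("fields=" + n + " coord=" + combinedCoord(2, 2, n));
        }
    }
}
```

Under these assumptions the combined factor behaves like
overlap / (|S|*n): a full match on S in a single field is scaled by 1.0
with one field but only by about 0.1 with ten fields.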
Re: OR query on multiple fields causes low coord

Paul Elschot
On Thursday 09 June 2005 13:14, M. Mokotov wrote:

> Hi,
>  
> I have a question with regards to an OR query on multiple fields.
>  
> It seems that the more fields I'm splitting the documents into, the lower
> the coord is getting.
> As a result when I want to query the string S on many fields (a query like
> F1:(S) F2:(S) ... Fn:(S) ) I'm getting close-to-zero coords, which causes a
> poor matching score.
> I assume (and forgive me for assuming) that the reason is when calling
> coord( overlap, maxOverlap ), maxOverlap=|S|*n (where n is the number of
> fields on the query)
>  
> Is there any way to avoid that?
> Can I have the coord computed per field?

Yes. For the query above, use a BooleanQuery with a Similarity whose
coord() method returns a constant. This is difficult to do with the
QueryParser, but it is easy to construct the query in your own code.
The subqueries on the individual fields can still use the default
similarity, as you see fit.

Have a look at the MultiFieldQueryParser in the source:
http://svn.apache.org/viewcvs.cgi/lucene/java/tags/lucene_1_4_3/src/java/org/apache/lucene/queryParser/
Instead of the BooleanQuery constructed there, use a BooleanQuery
that overrides getSimilarity().

Regards,
Paul Elschot
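
The effect of the suggestion in this reply can be modelled in plain Java.
This is only a sketch of the idea, not Lucene code: CoordFunction is a
hypothetical stand-in for the coord() part of Similarity, and
ConstantCoordDemo and coordFactor are made-up names. It assumes the
top-level query over the fields uses the constant coord while each
per-field subquery keeps the default overlap / maxOverlap ratio.

```java
// Hypothetical stand-in for the coord() method of a Similarity.
interface CoordFunction {
    float coord(int overlap, int maxOverlap);
}

class ConstantCoordDemo {
    // Default behaviour: the fraction of clauses that matched.
    static final CoordFunction DEFAULT_COORD =
            (overlap, maxOverlap) -> overlap / (float) maxOverlap;

    // Suggested behaviour for the top-level query: ignore the number of fields.
    static final CoordFunction CONSTANT_COORD =
            (overlap, maxOverlap) -> 1.0f;

    // Coord factor for a document matching termsMatched of termsInS terms in
    // each of fieldsMatched fields, out of numFields fields in the query.
    static float coordFactor(CoordFunction topLevel, int fieldsMatched, int numFields,
                             int termsMatched, int termsInS) {
        float perField = DEFAULT_COORD.coord(termsMatched, termsInS);
        return topLevel.coord(fieldsMatched, numFields) * perField;
    }

    public static void main(String[] args) {
        // One of two terms matched, in one of ten fields.
        System.out.println("default top-level coord:  "
                + coordFactor(DEFAULT_COORD, 1, 10, 1, 2));
        System.out.println("constant top-level coord: "
                + coordFactor(CONSTANT_COORD, 1, 10, 1, 2));
    }
}
```

With the constant top-level coord, the factor depends only on how well S
matches within a field (0.5 here), no matter how many fields the query
covers. In Lucene itself the constant version would go into the Similarity
returned by the overridden getSimilarity() of the top-level BooleanQuery.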


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

RE: OR query on multiple fields causes low coord

M. Mokotov
Hi Paul,

Thanks for the help.
