[jira] Created: (LUCENE-533) SpanQuery scoring: SpanWeight lacks a recursive traversal of the query tree

[jira] Created: (LUCENE-533) SpanQuery scoring: SpanWeight lacks a recursive traversal of the query tree

JIRA jira@apache.org
SpanQuery scoring: SpanWeight lacks a recursive traversal of the query tree
---------------------------------------------------------------------------

         Key: LUCENE-533
         URL: http://issues.apache.org/jira/browse/LUCENE-533
     Project: Lucene - Java
        Type: Bug
  Components: Search  
    Versions: 1.9    
    Reporter: Vincent Le Maout


I found that weights are computed differently depending on the query type (BooleanQuery versus SpanQuery):

org.apache.lucene.search.BooleanQuery.BooleanWeight:

public BooleanWeight(Searcher searcher)
    throws IOException {
  this.similarity = getSimilarity(searcher);
  for (int i = 0; i < clauses.size(); i++) {
    BooleanClause c = (BooleanClause) clauses.elementAt(i);
    weights.add(c.getQuery().createWeight(searcher));
  }
}

which looks like a recursive descent through the tree, taking into account the weights of all the nodes, whereas:

org.apache.lucene.search.spans.SpanWeight:

public SpanWeight(SpanQuery query, Searcher searcher)
    throws IOException {
  this.similarity = query.getSimilarity(searcher);
  this.query = query;
  this.terms = query.getTerms();

  idf = this.query.getSimilarity(searcher).idf(terms, searcher);
}

lacks any traversal. From what I have understood so far of the rest of the code,
it only takes into account the boost of the tree root in sumOfSquaredWeights(),
which is consistent with the resulting scores ignoring the boosts of the tree
leaves.

vintz
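
Editorial illustration of the report above: a minimal sketch of the difference between a root-only weight and a recursive one. `ToyQuery` and its methods are hypothetical names invented for this example; they are not Lucene classes, and idf and normalization are deliberately left out. The recursive variant is only loosely modelled on how a boolean tree multiplies boosts down to its leaves.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model, NOT a Lucene class: a query node with a boost and child clauses.
final class ToyQuery {
    final float boost;
    final List<ToyQuery> children = new ArrayList<>();

    ToyQuery(float boost, ToyQuery... kids) {
        this.boost = boost;
        for (ToyQuery k : kids) {
            children.add(k);
        }
    }

    // Root-only variant: mirrors the reported behaviour, where only the
    // boost of the tree root enters the sum of squared weights.
    float rootOnlySumOfSquares() {
        return boost * boost;
    }

    // Recursive variant: every leaf contributes the square of the product
    // of all boosts on the path from the root down to that leaf.
    float recursiveSumOfSquares() {
        return recursiveSumOfSquares(1.0f);
    }

    private float recursiveSumOfSquares(float inherited) {
        float effective = inherited * boost;
        if (children.isEmpty()) {
            return effective * effective;
        }
        float sum = 0f;
        for (ToyQuery child : children) {
            sum += child.recursiveSumOfSquares(effective);
        }
        return sum;
    }
}
```

For a root with boost 1 and two leaves with boosts 2 and 3, the root-only variant sees none of the leaf boosts, while the recursive variant sums their squares.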

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


[jira] Commented: (LUCENE-533) SpanQuery scoring: SpanWeight lacks a recursive traversal of the query tree

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660579#action_12660579 ]

Mark Miller commented on LUCENE-533:
------------------------------------

Hoping we can get this one worked out, Vincent. It is odd that the boosts are ignored, and it certainly seems incorrect to me.

I worked on a fix a few months ago, but unfortunately things got complicated fairly quickly. I think you end up needing something similar to what BooleanQuery does to compute the tree boosts, which adds a lot of complexity and could cost a lot of performance (span queries are already generally slower than standard queries).

I hope to get this resolved at some point in the future, though; it is just not as simple as I would have hoped. That is possibly why it was punted on to begin with.

> SpanQuery scoring: SpanWeight lacks a recursive traversal of the query tree
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-533
>                 URL: https://issues.apache.org/jira/browse/LUCENE-533
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 1.9
>            Reporter: Vincent Le Maout
>            Priority: Minor

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (LUCENE-533) SpanQuery scoring: SpanWeight lacks a recursive traversal of the query tree

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741730#action_12741730 ]

Paul Elschot commented on LUCENE-533:
-------------------------------------

One problem here is that the Spans interface does not have a property for a weight value.

So one way to start this could be to deprecate Spans and to define something like this:
{code}
public abstract class WeightedSpans implements Spans {
  // ... abstract methods as in the Spans interface ...

  // Implement getValue() here to allow WeightedSpans to replace Spans everywhere.
  public float getValue() {
    return 1.0f;
  }
}
{code}
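
Editorial illustration: a self-contained sketch of how the proposal above might look in use. `MiniSpans` is a simplified stand-in written for this example, not Lucene's actual Spans interface, and `MiniWeightedSpans` and `BoostedSingleSpan` are likewise hypothetical names. The point is only that a neutral default of 1.0f lets existing implementations ignore the weight, while a subclass can override it.

```java
// Simplified stand-in for a spans iterator (assumption: NOT Lucene's Spans interface).
interface MiniSpans {
    boolean next();   // advance to the next match, false when exhausted
    int doc();        // document of the current match
    int start();      // start position of the current match
    int end();        // end position (exclusive) of the current match
}

// Adds a weight value with a neutral default, as suggested in the comment above.
abstract class MiniWeightedSpans implements MiniSpans {
    public float getValue() {
        return 1.0f;
    }
}

// Hypothetical leaf holding exactly one match and carrying its own boost.
final class BoostedSingleSpan extends MiniWeightedSpans {
    private final int doc;
    private final int start;
    private final int end;
    private final float boost;
    private boolean consumed = false;

    BoostedSingleSpan(int doc, int start, int end, float boost) {
        this.doc = doc;
        this.start = start;
        this.end = end;
        this.boost = boost;
    }

    @Override public boolean next() {
        if (consumed) {
            return false;
        }
        consumed = true;
        return true;
    }

    @Override public int doc() { return doc; }
    @Override public int start() { return start; }
    @Override public int end() { return end; }

    // Overrides the neutral default so a scorer could read the leaf boost.
    @Override public float getValue() { return boost; }
}
```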





[jira] Commented: (LUCENE-533) SpanQuery scoring: SpanWeight lacks a recursive traversal of the query tree

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742045#action_12742045 ]

Mark Miller commented on LUCENE-533:
------------------------------------

Paul:

Spans is breaking back compat in this release. This is the opportunity to change anything.

Do you think we should add something like this? Do you think it might make sense to make Spans abstract?

It's an interface that's gaining methods this release, so back compat is gone anyway (Spans back compat was lost in the last release, and we are correcting things to a degree in 2.9). If we can help pave the way for the future in a reasonably small amount of time, this is likely the best opportunity.



[jira] Commented: (LUCENE-533) SpanQuery scoring: SpanWeight lacks a recursive traversal of the query tree

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742249#action_12742249 ]

Paul Elschot commented on LUCENE-533:
-------------------------------------

I see I missed the introduction of payloads into Spans. As back compat is broken anyway, one might as well get rid of the Spans interface completely and make Spans an abstract class.
Since it is only the interface that is in the way of changes, any way to get rid of Spans as an interface is OK with me.

Payloads can be yet another way to introduce a (term/spans) weight, so one might subclass these from WeightedSpans:
Spans -> WeightedSpans -> PayloadSpans.
That would also allow using WeightedSpans inside an object hierarchy for scoring nested span queries, and using PayloadSpans as leaves.

Scoring nested span queries is not trivial, and allowing a weight on each Spans does not make it simpler, but at least it would allow span queries to behave more like boolean queries.
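
Editorial illustration: a rough sketch of the hierarchy described above, with weighted inner nodes and payload-carrying leaves. All class names (`ValuedSpans`, `PayloadLeaf`, `NestedNode`) are hypothetical and stand in for WeightedSpans and PayloadSpans; the span iteration itself is omitted. The combination rule (node boost times the sum of the child values) is an assumption chosen for brevity, not something the thread settles on.

```java
import java.util.ArrayList;
import java.util.List;

// Plays the role of WeightedSpans: a node that can report a weight value.
abstract class ValuedSpans {
    float getValue() {
        return 1.0f; // neutral default
    }
}

// Plays the role of PayloadSpans: a leaf whose weight comes from a payload.
final class PayloadLeaf extends ValuedSpans {
    private final float payloadWeight;

    PayloadLeaf(float payloadWeight) {
        this.payloadWeight = payloadWeight;
    }

    @Override float getValue() {
        return payloadWeight;
    }
}

// Inner node of a nested span query with its own boost.
final class NestedNode extends ValuedSpans {
    private final float boost;
    private final List<ValuedSpans> children;

    NestedNode(float boost, List<? extends ValuedSpans> children) {
        this.boost = boost;
        this.children = new ArrayList<>(children);
    }

    // Assumed rule for this sketch: own boost times the sum of child values,
    // so that boosts at every level of the tree reach the final value.
    @Override float getValue() {
        float sum = 0f;
        for (ValuedSpans child : children) {
            sum += child.getValue();
        }
        return boost * sum;
    }
}
```

With this rule, a node with boost 2 over leaves weighted 1.5 and 0.5 reports 4.0, and nesting that node under another boosted node multiplies the boosts through.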

