Yonik Seeley wrote:
> On 1/23/07, Walter Lewis <[hidden email]> wrote:
>> This is quite possibly a Lucene question rather than a Solr one, so my
>> apologies if you think it's out of scope.
>>
>> Underlying the Solr search are some very useful Lucene constructs.
>>
>> One of the most powerful, imho, is the tilde number combination for a
>> "fuzzy" search.
>>
>> In one of my data sets
>> q=Sutherland returns 41 results
>> q=Sutherland~0.75 returns 275
>> q=Sutherland~0.70 returns 484
>> etc., all of which fits a pattern. Add a first name and
>> q=(James Sutherland) returns 13
>> q=(James~0.75 Sutherland~0.75) returns 1
>> q=(James~0.70 Sutherland~0.70) returns 97
>> Qualify only one term and the pattern is consistent. But qualifying
>> both terms routinely yields fewer results than the plain string match.
>> Trying
>> q=(James~0.75 AND Sutherland~0.75) returns the same record (the
>> schema has the default operator set to AND).
>>
>> Why would the ~0.75 *narrow* rather than broaden a search? Is there some
>> pattern in the Solr syntax I'm overlooking?
>
> That's a great question... that doesn't make sense.
> Could you post your debugQuery output (add debugQuery=on)?
My apologies for the delay and for the generally excessive top quoting
here. I thought it might save a bit of time to keep the alternatives
together. I should also note that I simplified the queries above.
Each ran with a searchSet constraint, which had the same value every time. The
"normal" queries also carry significant baggage of fields and facets,
which is likewise consistent across the whole set of them.
I ran the debug against the following two queries:
q=(James Sutherland) returns 13
q=(James~0.75 Sutherland~0.75) returns 1
I have attached the debug fragments below.
Walter
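
For anyone wanting to reproduce the two debug runs, the requests can be built along these lines. This is only a sketch: the host, port, and `/solr/select` path are assumptions about a default install, not details given in this thread, and `debug_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed endpoint of a default Solr install; adjust to the real server.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def debug_url(query):
    """Return a select URL for `query` with debugQuery=on appended."""
    params = {"q": query, "debugQuery": "on"}
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


# The two queries compared in this message, including the searchSet constraint.
for q in (
    "(James Sutherland) searchSet:testSet",
    "(James~0.75 Sutherland~0.75) searchSet:testSet",
):
    print(debug_url(q))
```

The real queries also carried extra field and facet parameters, which are left out here.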
====
<lst name="debug">
<str name="rawquerystring">(james sutherland) searchSet:testSet</str>
<str name="querystring">(james sutherland) searchSet:testSet</str>

<str name="parsedquery">
+(+text:jame +text:sutherland) +searchSet:testSet
</str>

<str name="parsedquery_toString">
+(+text:jame +text:sutherland) +searchSet:testSet
</str>

<lst name="explain">

<str name="id=MHGL.502,internal_docid=80313">
2.2928324 = (MATCH) sum of:
2.2204013 = (MATCH) sum of:
0.444597 = (MATCH) weight(text:jame in 80313), product of:
0.46986106 = queryWeight(text:jame), product of:
4.370453 = idf(docFreq=3085)
0.107508555 = queryNorm
0.94623077 = (MATCH) fieldWeight(text:jame in 80313), product of:
1.7320508 = tf(termFreq(text:jame)=3)
4.370453 = idf(docFreq=3085)
0.125 = fieldNorm(field=text, doc=80313)
1.7758043 = (MATCH) weight(text:sutherland in 80313), product of:
0.8738745 = queryWeight(text:sutherland), product of:
8.128418 = idf(docFreq=71)
0.107508555 = queryNorm
2.0321045 = (MATCH) fieldWeight(text:sutherland in 80313), product of:
2.0 = tf(termFreq(text:sutherland)=4)
8.128418 = idf(docFreq=71)
0.125 = fieldNorm(field=text, doc=80313)
0.072431125 = (MATCH) weight(searchSet:testSet in 80313), product of:
0.124795556 = queryWeight(searchSet:testSet), product of:
1.1607965 = idf(docFreq=76441)
0.107508555 = queryNorm
0.58039826 = (MATCH) fieldWeight(searchSet:testSet in 80313), product of:
1.0 = tf(termFreq(searchSet:testSet)=1)
1.1607965 = idf(docFreq=76441)
0.5 = fieldNorm(field=searchSet, doc=80313)
</str>

<str name="id=MHGL.503,internal_docid=80314">
2.1340907 = (MATCH) sum of:
2.0616596 = (MATCH) sum of:
0.43047923 = (MATCH) weight(text:jame in 80314), product of:
0.46986106 = queryWeight(text:jame), product of:
4.370453 = idf(docFreq=3085)
0.107508555 = queryNorm
0.91618407 = (MATCH) fieldWeight(text:jame in 80314), product of:
2.236068 = tf(termFreq(text:jame)=5)
4.370453 = idf(docFreq=3085)
0.09375 = fieldNorm(field=text, doc=80314)
1.6311804 = (MATCH) weight(text:sutherland in 80314), product of:
0.8738745 = queryWeight(text:sutherland), product of:
8.128418 = idf(docFreq=71)
0.107508555 = queryNorm
1.8666072 = (MATCH) fieldWeight(text:sutherland in 80314), product of:
2.4494898 = tf(termFreq(text:sutherland)=6)
8.128418 = idf(docFreq=71)
0.09375 = fieldNorm(field=text, doc=80314)
0.072431125 = (MATCH) weight(searchSet:testSet in 80314), product of:
0.124795556 = queryWeight(searchSet:testSet), product of:
1.1607965 = idf(docFreq=76441)
0.107508555 = queryNorm
0.58039826 = (MATCH) fieldWeight(searchSet:testSet in 80314), product of:
1.0 = tf(termFreq(searchSet:testSet)=1)
1.1607965 = idf(docFreq=76441)
0.5 = fieldNorm(field=searchSet, doc=80314)
</str>

<str name="id=MHGL.501,internal_docid=80312">
1.5031691 = (MATCH) sum of:
1.430738 = (MATCH) sum of:
0.32086027 = (MATCH) weight(text:jame in 80312), product of:
0.46986106 = queryWeight(text:jame), product of:
4.370453 = idf(docFreq=3085)
0.107508555 = queryNorm
0.68288326 = (MATCH) fieldWeight(text:jame in 80312), product of:
1.0 = tf(termFreq(text:jame)=1)
4.370453 = idf(docFreq=3085)
0.15625 = fieldNorm(field=text, doc=80312)
1.1098777 = (MATCH) weight(text:sutherland in 80312), product of:
0.8738745 = queryWeight(text:sutherland), product of:
8.128418 = idf(docFreq=71)
0.107508555 = queryNorm
1.2700653 = (MATCH) fieldWeight(text:sutherland in 80312), product of:
1.0 = tf(termFreq(text:sutherland)=1)
8.128418 = idf(docFreq=71)
0.15625 = fieldNorm(field=text, doc=80312)
0.072431125 = (MATCH) weight(searchSet:testSet in 80312), product of:
0.124795556 = queryWeight(searchSet:testSet), product of:
1.1607965 = idf(docFreq=76441)
0.107508555 = queryNorm
0.58039826 = (MATCH) fieldWeight(searchSet:testSet in 80312), product of:
1.0 = tf(termFreq(searchSet:testSet)=1)
1.1607965 = idf(docFreq=76441)
0.5 = fieldNorm(field=searchSet, doc=80312)
</str>

<str name="id=http://archeionaao.fis.utoronto.ca/cgibin/ifetch?DBRootName=ON&RecordKey=42&FieldKey=F&FilePath=E:\Documents\archeion\/ON00313f/ON00313f0000358.xml,internal_docid=12073">
0.6628341 = (MATCH) sum of:
0.5722952 = (MATCH) sum of:
0.1283441 = (MATCH) weight(text:jame in 12073), product of:
0.46986106 = queryWeight(text:jame), product of:
4.370453 = idf(docFreq=3085)
0.107508555 = queryNorm
0.2731533 = (MATCH) fieldWeight(text:jame in 12073), product of:
1.0 = tf(termFreq(text:jame)=1)
4.370453 = idf(docFreq=3085)
0.0625 = fieldNorm(field=text, doc=12073)
0.44395107 = (MATCH) weight(text:sutherland in 12073), product of:
0.8738745 = queryWeight(text:sutherland), product of:
8.128418 = idf(docFreq=71)
0.107508555 = queryNorm
0.5080261 = (MATCH) fieldWeight(text:sutherland in 12073), product of:
1.0 = tf(termFreq(text:sutherland)=1)
8.128418 = idf(docFreq=71)
0.0625 = fieldNorm(field=text, doc=12073)
0.090538904 = (MATCH) weight(searchSet:testSet in 12073), product of:
0.124795556 = queryWeight(searchSet:testSet), product of:
1.1607965 = idf(docFreq=76441)
0.107508555 = queryNorm
0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 12073), product of:
1.0 = tf(termFreq(searchSet:testSet)=1)
1.1607965 = idf(docFreq=76441)
0.625 = fieldNorm(field=searchSet, doc=12073)
</str>

<str name="id=http://archeionaao.fis.utoronto.ca/cgibin/ifetch?DBRootName=ON&RecordKey=42&FieldKey=F&FilePath=ON00313f/ON00313f0000358.xml,internal_docid=60185">
0.6628341 = (MATCH) sum of:
0.5722952 = (MATCH) sum of:
0.1283441 = (MATCH) weight(text:jame in 60185), product of:
0.46986106 = queryWeight(text:jame), product of:
4.370453 = idf(docFreq=3085)
0.107508555 = queryNorm
0.2731533 = (MATCH) fieldWeight(text:jame in 60185), product of:
1.0 = tf(termFreq(text:jame)=1)
4.370453 = idf(docFreq=3085)
0.0625 = fieldNorm(field=text, doc=60185)
0.44395107 = (MATCH) weight(text:sutherland in 60185), product of:
0.8738745 = queryWeight(text:sutherland), product of:
8.128418 = idf(docFreq=71)
0.107508555 = queryNorm
0.5080261 = (MATCH) fieldWeight(text:sutherland in 60185), product of:
1.0 = tf(termFreq(text:sutherland)=1)
8.128418 = idf(docFreq=71)
0.0625 = fieldNorm(field=text, doc=60185)
0.090538904 = (MATCH) weight(searchSet:testSet in 60185), product of:
0.124795556 = queryWeight(searchSet:testSet), product of:
1.1607965 = idf(docFreq=76441)
0.107508555 = queryNorm
0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 60185), product of:
1.0 = tf(termFreq(searchSet:testSet)=1)
1.1607965 = idf(docFreq=76441)
0.625 = fieldNorm(field=searchSet, doc=60185)
</str>

<str name="id=http://archeionaao.fis.utoronto.ca/cgibin/ifetch?DBRootName=ON&RecordKey=42&FieldKey=F&FilePath=E:\Documents\archeion\/ON00093f/ON00093f939.xml,internal_docid=10564">
0.48144954 = (MATCH) sum of:
0.39091066 = (MATCH) sum of:
0.11344123 = (MATCH) weight(text:jame in 10564), product of:
0.46986106 = queryWeight(text:jame), product of:
4.370453 = idf(docFreq=3085)
0.107508555 = queryNorm
0.24143569 = (MATCH) fieldWeight(text:jame in 10564), product of:
1.4142135 = tf(termFreq(text:jame)=2)
4.370453 = idf(docFreq=3085)
0.0390625 = fieldNorm(field=text, doc=10564)
0.27746943 = (MATCH) weight(text:sutherland in 10564), product of:
0.8738745 = queryWeight(text:sutherland), product of:
8.128418 = idf(docFreq=71)
0.107508555 = queryNorm
0.31751633 = (MATCH) fieldWeight(text:sutherland in 10564), product of:
1.0 = tf(termFreq(text:sutherland)=1)
8.128418 = idf(docFreq=71)
0.0390625 = fieldNorm(field=text, doc=10564)
0.090538904 = (MATCH) weight(searchSet:testSet in 10564), product of:
0.124795556 = queryWeight(searchSet:testSet), product of:
1.1607965 = idf(docFreq=76441)
0.107508555 = queryNorm
0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 10564), product of:
1.0 = tf(termFreq(searchSet:testSet)=1)
1.1607965 = idf(docFreq=76441)
0.625 = fieldNorm(field=searchSet, doc=10564)
</str>

<str name="id=http://archeionaao.fis.utoronto.ca/cgibin/ifetch?DBRootName=ON&RecordKey=42&FieldKey=F&FilePath=ON00093f/ON00093f939.xml,internal_docid=58676">
0.48144954 = (MATCH) sum of:
0.39091066 = (MATCH) sum of:
0.11344123 = (MATCH) weight(text:jame in 58676), product of:
0.46986106 = queryWeight(text:jame), product of:
4.370453 = idf(docFreq=3085)
0.107508555 = queryNorm
0.24143569 = (MATCH) fieldWeight(text:jame in 58676), product of:
1.4142135 = tf(termFreq(text:jame)=2)
4.370453 = idf(docFreq=3085)
0.0390625 = fieldNorm(field=text, doc=58676)
0.27746943 = (MATCH) weight(text:sutherland in 58676), product of:
0.8738745 = queryWeight(text:sutherland), product of:
8.128418 = idf(docFreq=71)
0.107508555 = queryNorm
0.31751633 = (MATCH) fieldWeight(text:sutherland in 58676), product of:
1.0 = tf(termFreq(text:sutherland)=1)
8.128418 = idf(docFreq=71)
0.0390625 = fieldNorm(field=text, doc=58676)
0.090538904 = (MATCH) weight(searchSet:testSet in 58676), product of:
0.124795556 = queryWeight(searchSet:testSet), product of:
1.1607965 = idf(docFreq=76441)
0.107508555 = queryNorm
0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 58676), product of:
1.0 = tf(termFreq(searchSet:testSet)=1)
1.1607965 = idf(docFreq=76441)
0.625 = fieldNorm(field=searchSet, doc=58676)
</str>

<str name="id=ECF.873,internal_docid=18553">
0.25359273 = (MATCH) sum of:
0.16305381 = (MATCH) sum of:
0.07981298 = (MATCH) weight(text:jame in 18553), product of:
0.46986106 = queryWeight(text:jame), product of:
4.370453 = idf(docFreq=3085)
0.107508555 = queryNorm
0.16986507 = (MATCH) fieldWeight(text:jame in 18553), product of:
3.3166249 = tf(termFreq(text:jame)=11)
4.370453 = idf(docFreq=3085)
0.01171875 = fieldNorm(field=text, doc=18553)
0.08324082 = (MATCH) weight(text:sutherland in 18553), product of:
0.8738745 = queryWeight(text:sutherland), product of:
8.128418 = idf(docFreq=71)
0.107508555 = queryNorm
0.0952549 = (MATCH) fieldWeight(text:sutherland in 18553), product of:
1.0 = tf(termFreq(text:sutherland)=1)
8.128418 = idf(docFreq=71)
0.01171875 = fieldNorm(field=text, doc=18553)
0.090538904 = (MATCH) weight(searchSet:testSet in 18553), product of:
0.124795556 = queryWeight(searchSet:testSet), product of:
1.1607965 = idf(docFreq=76441)
0.107508555 = queryNorm
0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 18553), product of:
1.0 = tf(termFreq(searchSet:testSet)=1)
1.1607965 = idf(docFreq=76441)
0.625 = fieldNorm(field=searchSet, doc=18553)
</str>

<str name="id=ECF.373,internal_docid=18055">
0.2336127 = (MATCH) sum of:
0.1430738 = (MATCH) sum of:
0.032086026 = (MATCH) weight(text:jame in 18055), product of:
0.46986106 = queryWeight(text:jame), product of:
4.370453 = idf(docFreq=3085)
0.107508555 = queryNorm
0.068288326 = (MATCH) fieldWeight(text:jame in 18055), product of:
1.0 = tf(termFreq(text:jame)=1)
4.370453 = idf(docFreq=3085)
0.015625 = fieldNorm(field=text, doc=18055)
0.11098777 = (MATCH) weight(text:sutherland in 18055), product of:
0.8738745 = queryWeight(text:sutherland), product of:
8.128418 = idf(docFreq=71)
0.107508555 = queryNorm
0.12700653 = (MATCH) fieldWeight(text:sutherland in 18055), product of:
1.0 = tf(termFreq(text:sutherland)=1)
8.128418 = idf(docFreq=71)
0.015625 = fieldNorm(field=text, doc=18055)
0.090538904 = (MATCH) weight(searchSet:testSet in 18055), product of:
0.124795556 = queryWeight(searchSet:testSet), product of:
1.1607965 = idf(docFreq=76441)
0.107508555 = queryNorm
0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 18055), product of:
1.0 = tf(termFreq(searchSet:testSet)=1)
1.1607965 = idf(docFreq=76441)
0.625 = fieldNorm(field=searchSet, doc=18055)
</str>

<str name="id=ECF.2476,internal_docid=20148">
0.2336127 = (MATCH) sum of:
0.1430738 = (MATCH) sum of:
0.032086026 = (MATCH) weight(text:jame in 20148), product of:
0.46986106 = queryWeight(text:jame), product of:
4.370453 = idf(docFreq=3085)
0.107508555 = queryNorm
0.068288326 = (MATCH) fieldWeight(text:jame in 20148), product of:
1.0 = tf(termFreq(text:jame)=1)
4.370453 = idf(docFreq=3085)
0.015625 = fieldNorm(field=text, doc=20148)
0.11098777 = (MATCH) weight(text:sutherland in 20148), product of:
0.8738745 = queryWeight(text:sutherland), product of:
8.128418 = idf(docFreq=71)
0.107508555 = queryNorm
0.12700653 = (MATCH) fieldWeight(text:sutherland in 20148), product of:
1.0 = tf(termFreq(text:sutherland)=1)
8.128418 = idf(docFreq=71)
0.015625 = fieldNorm(field=text, doc=20148)
0.090538904 = (MATCH) weight(searchSet:testSet in 20148), product of:
0.124795556 = queryWeight(searchSet:testSet), product of:
1.1607965 = idf(docFreq=76441)
0.107508555 = queryNorm
0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 20148), product of:
1.0 = tf(termFreq(searchSet:testSet)=1)
1.1607965 = idf(docFreq=76441)
0.625 = fieldNorm(field=searchSet, doc=20148)
</str>
</lst>
</lst>
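
As a sanity check on how to read the explain tree above, the top hit (MHGL.502) can be recomputed from its leaf values. This is a sketch that assumes only what the output itself shows: each term weight is queryWeight times fieldWeight, with queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and tf equal to the square root of termFreq. The function name `term_weight` is made up for illustration.

```python
import math


def term_weight(term_freq, idf, query_norm, field_norm):
    """Score contribution of one term clause, as laid out in the explain output."""
    query_weight = idf * query_norm            # queryWeight = idf * queryNorm
    tf = math.sqrt(term_freq)                  # tf(termFreq=3) = 1.7320508, etc.
    field_weight = tf * idf * field_norm       # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight


QUERY_NORM = 0.107508555  # queryNorm reported for the non-fuzzy query

# Leaf values copied from the MHGL.502 entry (internal_docid=80313).
jame = term_weight(3, 4.370453, QUERY_NORM, 0.125)
sutherland = term_weight(4, 8.128418, QUERY_NORM, 0.125)
search_set = term_weight(1, 1.1607965, QUERY_NORM, 0.5)
total = jame + sutherland + search_set

print(f"jame={jame:.6f} sutherland={sutherland:.6f} "
      f"searchSet={search_set:.6f} total={total:.6f}")
```

The results agree with the reported 0.444597, 1.7758043, 0.072431125 and 2.2928324 up to single-precision rounding.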
=========

<lst name="debug">

<str name="rawquerystring">
(james~0.75 AND sutherland~0.75) searchSet:testSet
</str>

<str name="querystring">
(james~0.75 AND sutherland~0.75) searchSet:testSet
</str>

<str name="parsedquery">
+(+text:james~0.75 +text:sutherland~0.75) +searchSet:testSet
</str>

<str name="parsedquery_toString">
+(+text:james~0.75 +text:sutherland~0.75) +searchSet:testSet
</str>

<lst name="explain">

<str name="id=ECF.2227,internal_docid=19900">
0.10142321 = (MATCH) sum of:
0.04733514 = (MATCH) sum of:
0.03207182 = (MATCH) sum of:
0.03207182 = (MATCH) weight(text:rames^0.20000005 in 19900), product of:
0.1452334 = queryWeight(text:rames^0.20000005), product of:
0.20000005 = boost
11.306472 = idf(docFreq=2)
0.06422576 = queryNorm
0.22082953 = (MATCH) fieldWeight(text:rames in 19900), product of:
1.0 = tf(termFreq(text:rames)=1)
11.306472 = idf(docFreq=2)
0.01953125 = fieldNorm(field=text, doc=19900)
0.015263321 = (MATCH) sum of:
0.015263321 = (MATCH) weight(text:netherland^0.20000005 in 19900), product of:
0.10019111 = queryWeight(text:netherland^0.20000005), product of:
0.20000005 = boost
7.799914 = idf(docFreq=99)
0.06422576 = queryNorm
0.15234207 = (MATCH) fieldWeight(text:netherland in 19900), product of:
1.0 = tf(termFreq(text:netherland)=1)
7.799914 = idf(docFreq=99)
0.01953125 = fieldNorm(field=text, doc=19900)
0.05408807 = (MATCH) weight(searchSet:testSet in 19900), product of:
0.07455304 = queryWeight(searchSet:testSet), product of:
1.1607965 = idf(docFreq=76441)
0.06422576 = queryNorm
0.72549784 = (MATCH) fieldWeight(searchSet:testSet in 19900), product of:
1.0 = tf(termFreq(searchSet:testSet)=1)
1.1607965 = idf(docFreq=76441)
0.625 = fieldNorm(field=searchSet, doc=19900)
</str>
</lst>
</lst>
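
One thing that stands out when the two parsed queries are set side by side: the plain query searches for the stemmed term text:jame, while the fuzzy query keeps text:james~0.75 unstemmed. The sketch below explores what that could mean. It rests on assumptions not confirmed anywhere in this thread: that the similarity is 1 - editDistance / (length of the shorter term), that a term only matches when its similarity is strictly greater than the threshold, and that the boost is (similarity - threshold) / (1 - threshold). The function names are invented for illustration.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def fuzzy_similarity(query_term, index_term):
    # Assumed formula: 1 - distance / length of the shorter term.
    shorter = min(len(query_term), len(index_term))
    return 1.0 - edit_distance(query_term, index_term) / shorter


def fuzzy_boost(similarity, min_similarity):
    # Assumed rule: the term matches only if strictly above the threshold;
    # the boost rescales the margin over the threshold into (0, 1].
    if similarity <= min_similarity:
        return None
    return (similarity - min_similarity) / (1.0 - min_similarity)


for query_term, index_term in (("james", "rames"),
                               ("sutherland", "netherland"),
                               ("james", "jame")):
    sim = fuzzy_similarity(query_term, index_term)
    for threshold in (0.75, 0.70):
        print(query_term, index_term, threshold, fuzzy_boost(sim, threshold))
```

Under these assumptions, rames and netherland both score 0.8 against the query terms, which gives a boost of about 0.2 at a threshold of 0.75 and is consistent with the ^0.20000005 in the output above. The indexed stem jame would score exactly 0.75 against james, so it would be excluded at ~0.75 but admitted at ~0.70. That might account for the narrowing, though it is only one possible reading.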