How to know the matched field?

How to know the matched field?

Paul Libbrecht

Hello list,

In an auto-completion task, I would like to show the user the field
that was matched against the query in the found document.

Typically, my documents have multiple fields for each field name, and I
would like the index's findings to tell me which field was used. How can
I do that?

It seems to me a task for the highlighter (or for the QueryScorer?), but
I would rather not extract the matching fragment just to learn which
exact field matched.

thanks in advance

paul

Re: How to know the matched field?

Erick Erickson
Try searching the mail archives; the searchable archive is linked from
the Wiki. This topic has been discussed multiple times, but I forget
the solutions...

Best
Erick

On Sun, Mar 22, 2009 at 4:30 PM, Paul Libbrecht <[hidden email]> wrote:

>
> Hello list,
>
> in an auto-completion task, I would like to show to the user the field
> that's been matched against the query in the found document.
>
> Typically, my documents have multiple fields for each field-name and I
> would like the index's findings to give me the field used. How can I do
> that?
>
> It seems to me a task of the highlighter (or of the QueryScorer?) but I am
> actually not interested in extracting the fragment found just to know the
> exact field found.
>
> thanks in advance
>
> paul

Re: How to know the matched field?

Paul Libbrecht
Thanks Erick,

I browsed the archives but found no full answer yet.

The closest seems to be the explain method, with which I could find the
exact term query or prefix query that matched, and from it the name of
the field. I am still left with iterating through the (stored) fields
and trying to find the individual fields that matched.

I could also make a token stream with all fields' contents and find
the field (the fragment) that gets the best score with
QueryScorer(query)?
(Provided the query is "rewritten" so that no PrefixQuery appears
anymore, right?)

Sounds doable, but please confirm this is a correct usage of
QueryScorer; I feel a bit unsure here.

paul
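
The rewriting step mentioned above can be pictured without Lucene at all. The sketch below is only an illustration, assuming the index's terms are available as a sorted set; PrefixExpansionSketch and expandPrefix are hypothetical names, not Lucene API. It shows why a prefix has to be expanded into the concrete terms it covers before a term-based scorer can compare tokens against it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class PrefixExpansionSketch {

    /** Returns every indexed term that starts with the prefix, in sorted order. */
    public static List<String> expandPrefix(SortedSet<String> indexedTerms, String prefix) {
        List<String> matches = new ArrayList<String>();
        // tailSet starts at the first term >= prefix; terms sharing the
        // prefix are contiguous in sorted order, so stop at the first miss.
        for (String term : indexedTerms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical index terms, for illustration only.
        SortedSet<String> terms = new TreeSet<String>();
        terms.add("oslo");
        terms.add("paris");
        terms.add("parma");
        terms.add("perth");
        System.out.println(expandPrefix(terms, "par")); // prints [paris, parma]
    }
}
```

A scorer that only knows the literal string "par" would match none of the tokens "paris" or "parma"; once the prefix is expanded, each concrete term can be compared directly.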

On 22 March 2009, at 22:22, Erick Erickson wrote:

> Try searching the mail archives, the searchable archive is linked to
> off the Wiki. This topic has been discussed multiple times but I forget
> the solutions...

Re: How to know the matched field?

Paul Libbrecht
Here's my first approach. Note, however, that I typically have fields
(which are not stored) that may be the matching field without being the
one I want to return. For example, I have a field "names in all
languages through the standard analyzer" which is not the one I want to
"see as matched".

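         // Rewrite first so that a prefix query is expanded into the
         // concrete terms the QueryScorer can compare tokens against.
         query = query.rewrite(this.getReader());
         QueryScorer scorer = new QueryScorer(query);
         String found = null;
         float maxScore = 0;
         for (Field f : (List<Field>) doc.getFields()) {
             String text = f.stringValue();
             if (text == null) continue; // no string value to score
             // Start a new fragment so the score is counted per field value.
             scorer.startFragment(
                     new TextFragment(new StringBuffer(text), 0, text.length()));
             TokenStream tok = analyzer
                     .tokenStream(f.name(), new StringReader(text));
             System.out.println("Field: " + f + ":: " + f.name() + ": " + text);
             // Feed each token to the scorer; the return value is not needed.
             Token t = new Token();
             while (tok != null && (t = tok.next(t)) != null) {
                 scorer.getTokenScore(t);
             }

             // Keep the field value with the highest fragment score.
             float score = scorer.getFragmentScore();
             if (score > maxScore) {
                 maxScore = score;
                 found = text;
             }
         }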


I still don't grasp why both the TextFragment(StringBuffer) and the
pass through the tokenizers are needed, but removing either of them
breaks my unit test. I guess this is the whole idea behind LUCENE-1522,
which I will take up later.

paul
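
The concern about fields that match but should not be shown can be sketched independently of Lucene. The following is a minimal illustration, not the highlighter's behaviour: it assumes each field value is available as plain text, scores a value by counting tokens that start with the typed prefix, and skips a caller-supplied set of field names. BestFieldSketch, FieldValue, score, bestMatch and the field name "allNames" are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class BestFieldSketch {

    /** A (field name, text) pair standing in for one field instance. */
    public static final class FieldValue {
        public final String name;
        public final String text;
        public FieldValue(String name, String text) {
            this.name = name;
            this.text = text;
        }
    }

    /** Counts the tokens of text that start with prefix, ignoring case. */
    public static int score(String text, String prefix) {
        String p = prefix.toLowerCase(Locale.ROOT);
        int hits = 0;
        // Split on anything that is neither a letter nor a digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty() && token.startsWith(p)) {
                hits++;
            }
        }
        return hits;
    }

    /** Best-scoring value among fields not in hiddenFields, or null if none matches. */
    public static FieldValue bestMatch(List<FieldValue> fields,
                                       Set<String> hiddenFields, String prefix) {
        FieldValue best = null;
        int bestScore = 0;
        for (FieldValue f : fields) {
            if (hiddenFields.contains(f.name)) {
                continue; // may match, but must never be shown as the match
            }
            int s = score(f.text, prefix);
            if (s > bestScore) { // ties keep the earlier field
                bestScore = s;
                best = f;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<FieldValue> fields = Arrays.asList(
                new FieldValue("name", "Paris"),
                new FieldValue("name", "Parma Ham"),
                new FieldValue("allNames", "paris parigi parijs"));
        Set<String> hidden = new HashSet<String>(Arrays.asList("allNames"));
        System.out.println(bestMatch(fields, hidden, "par").text); // prints Paris
    }
}
```

Filtering by field name before scoring keeps the catch-all field out of the result even when it would score highest, which is the situation described above.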


On 23 March 2009, at 11:35, Paul Libbrecht wrote:

> Thanks Erick,
>
> I browsed but no full answer yet.
>
> The closest seems to be the explain method with which I could find  
> the exact term-query or prefix-query that matched it, so I would be  
> able to find the name of the field. I am still left with iterating  
> through the (stored) fields and try to find the individual fields  
> that matched.
>
> I could also make a token-stream with all fields' contents and find  
> the field (the fragment) which gets the best score with  
> QueryScorer(query)?
> (provided query is "rewritten" so that no prefixquery appears  
> anymore, right?)
>
> Sounds doable but please confirm this is a correct usage of  
> QueryScorer, I am feeling a bit unsafe here.
>
> paul