question about highlight field


question about highlight field

Xuesong Luo
Hi there,

I have a question about how to use the highlight field parameter
(hl.fl); my test results are below. As you can see, if I don't use hl.fl
in the query, the highlighting element in the result only shows the id
information. I have to add the field name (hl.fl=TITLE) to the query to
see the field information. Is that the correct behavior? If there are
multiple fields that could contain the search string, do I have to add
all of them to hl.fl?

 

Thanks

Xuesong

 

http://localhost:8080/search/select/?q=Consultant&version=2.2&start=0&rows=10&indent=on&hl=true

  <lst name="highlighting">
    <lst name="id1" />
  </lst>

http://localhost:8080/search/select/?q=Consultant&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=TITLE

  <lst name="highlighting">
    <lst name="id1">
      <arr name="TITLE">
        <str><em>Senior</em> Event Manager</str>
      </arr>
    </lst>
    <lst name="id2" />
  </lst>

 


Re: question about highlight field

Mike Klaas

On 1-Jun-07, at 9:37 AM, Xuesong Luo wrote:

> Hi, there,
>
> I have a question about how to use the highlight field(hl.fl),  
> below is
> my test result. As you can see, if I don't use hl.fl in the query, the
> highlighting element in the result only shows the id information. I  
> have
> to add the field name (hl.fl=TITLE) to the query to see the field
> information. Is that the correct behavior? If there are multiple  
> fields
> that could contain the search string, I have to add all of them to
> hl.fl?

Highlighting uses the following fields:

1. hl.fl, if present, defines all fields to be highlighted.  You can
   highlight fields that were not part of the query (as your example
   demonstrates).
2. If hl.fl is absent and qt=standard, the default search field is
   highlighted (set in schema.xml or via the df= parameter).
3. If hl.fl is absent and qt=dismax, the query fields (qf=) are used.

Note that every field to be highlighted must be stored.  If it is not,
it will not be present in the output (perhaps that is what you are
seeing in your example).

Finally, all terms are highlighted in all highlight fields.  If your
query searches for different terms in different fields and you want
that distinction carried through to your highlighting, specify
hl.requireFieldMatch=true.

-Mike
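
The three rules in the message above can be sketched as a small Python
function. This is only a toy model of the behavior as described in this
thread, not Solr's actual code; the function name, the
schema_default_field argument, and the handling of space-separated
hl.fl values and ^boost suffixes in qf are assumptions made for
illustration.

```python
from typing import Optional


def fields_to_highlight(params: dict, schema_default_field: Optional[str] = None) -> list:
    """Toy model of the field-selection rules described in the thread."""
    hl_fl = params.get("hl.fl", "").strip()
    if hl_fl:
        # Rule 1: an explicit hl.fl wins. Commas are used in the thread;
        # also splitting on spaces is an assumption of this sketch.
        return hl_fl.replace(",", " ").split()
    if params.get("qt", "standard") == "dismax":
        # Rule 3: fall back to the dismax query fields (qf=),
        # dropping any ^boost suffix (assumed qf format).
        return [f.split("^")[0] for f in params.get("qf", "").split()]
    # Rule 2: the standard handler falls back to df=, then the schema default.
    default = params.get("df") or schema_default_field
    return [default] if default else []


print(fields_to_highlight({"hl.fl": "TITLE"}))
print(fields_to_highlight({}, schema_default_field="searchall"))
print(fields_to_highlight({"qt": "dismax", "qf": "f1 f2 f3"}))
```

Whichever field comes out of this selection still has to be stored,
otherwise nothing appears for it in the highlighting element.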

Re: question about highlight field

Chris Hostetter-3
In reply to this post by Xuesong Luo

: I have a question about how to use the highlight field(hl.fl), below is
: my test result. As you can see, if I don't use hl.fl in the query, the
: highlighting element in the result only shows the id information. I have

according to the wiki, a blank (or missing) hl.fl should result in the
fields you searched being used for highlighting ... so either the wiki is
out of date, or there is a bug in the highlighting ... i'm not sure which.

(hopefully someone who knows more about highlighting can chime in)

Demo using example schema and explicitly querying a field that is
stored...
http://localhost:8983/solr/select?q=features%3Asolr&hl=on


-Hoss


RE: question about highlight field

Xuesong Luo
In reply to this post by Xuesong Luo
Mike,

Thanks for the information. You are right: my problem is that my default
search field (searchall) is not stored. The searchall field is a
multivalued field (a combination of TITLE and a few other fields). I
have a separate field, TITLE, which is stored, so I thought that field
would be highlighted automatically if hl.fl is absent; I didn't know
until now that the default search field is used. After I made the
searchall field stored, the TITLE information in the searchall field is
displayed in the highlighting element when hl.fl is absent (see the
example below).

  <lst name="highlighting">
    <lst name="id1">
      <arr name="searchall">
        <str><em>Senior</em> Event Manager</str>
      </arr>
    </lst>
  </lst>

So if I need to search for a string in fields f1, f2, and f3 and
highlight them in the response, I have to append hl.fl=f1,f2,f3 to my
query. Is this the only solution?  I thought of using the searchall
field, but the problem is that the highlighting element doesn't tell
which value belongs to which field. As you can see in the example
above, I can't tell whether "Senior Event Manager" is from TITLE or
from another field.

 

Thanks for all the help

Xuesong
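
For reference, a request that lists several highlight fields
explicitly could be assembled as in the sketch below. The host and
path are the ones from the first post; the helper name is hypothetical,
and f1, f2, f3 are the placeholder field names used in the message
above.

```python
from urllib.parse import urlencode

# Host and path copied from the first post in the thread.
BASE_URL = "http://localhost:8080/search/select/"


def highlight_query_url(query: str, fields: list) -> str:
    """Build a select URL that asks for highlighting on the given fields."""
    params = {
        "q": query,
        "hl": "true",
        "hl.fl": ",".join(fields),  # comma-separated, as in hl.fl=f1,f2,f3
    }
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return BASE_URL + "?" + urlencode(params, safe=",")


url = highlight_query_url("Consultant", ["f1", "f2", "f3"])
print(url)
```

With one entry per field in hl.fl, each highlighted snippet is
reported under its own field name, which addresses the "which field
did this come from" problem of highlighting a combined field.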

 

 


RE: question about highlight field

Chris Hostetter-3
:
: So if I need to search a string in field f1, f2, f3 and highlight them
: in the response, I have to append hl.fl=f1,f2,f3 to my query. Is this
: the only solution?  I thought of using searchall field, but the problem
: is the highlight element doesn't tell which value belongs to which

that's because you are highlighting your "searchall" field ... you can
search one field and highlight different fields -- but yes, you have to
list the fields you want to highlight (Solr can only do so much to
"guess" which fields to highlight, and in your case it's not guessing what
you want it to)




-Hoss


Re: question about highlight field

Mike Klaas
In reply to this post by Xuesong Luo
On 4-Jun-07, at 9:56 AM, Xuesong Luo wrote:

>
> So if I need to search a string in field f1, f2, f3 and highlight them
> in the response, I have to append hl.fl=f1,f2,f3 to my query. Is this
> the only solution?  I thought of using searchall field, but the  
> problem
> is the highlight element doesn't tell which value belongs to which
> field, as you can see in the example above, I can't tell the Senior
> Event Manager is from TITLE or other fields.

As Chris mentioned, I'm not sure how Solr could "know" that you want  
to highlight those fields, given that you aren't even searching them.

One option is to search those fields directly, using dismax.  In that  
case, the highlight fields will be picked up automatically.

-Mike
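
The dismax option suggested above might look like the following
sketch. It assumes the host and path from the demo URL earlier in the
thread; the helper name and the field names are placeholders. Note
that hl.fl is deliberately left out, so the highlight fields would be
taken from qf as described.

```python
from urllib.parse import urlencode

# Host, port and path are copied from the demo URL earlier in the thread.
BASE_URL = "http://localhost:8983/solr/select"


def dismax_highlight_url(query: str, query_fields: list) -> str:
    """Build a dismax request that searches query_fields and omits hl.fl."""
    params = {
        "q": query,
        "qt": "dismax",
        "qf": " ".join(query_fields),  # qf takes a space-separated field list
        "hl": "on",
    }
    return BASE_URL + "?" + urlencode(params)


url = dismax_highlight_url("consultant", ["f1", "f2", "f3"])
print(url)
```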

RE: question about highlight field

Xuesong Luo
In reply to this post by Xuesong Luo
Chris,
Thanks for the reply. I'm curious why we would want to search one field
but highlight different fields. Doesn't it make more sense to only
highlight the query fields? In my example, if I search f1, f2, f3, most
likely I only want the search terms in those fields to be highlighted.
Of course I can use hl.fl, but I think it makes more sense for Solr to
automatically highlight those fields (rather than the default search
field) for us.

Thanks
Xuesong



RE: question about highlight field

Xuesong Luo
In reply to this post by Xuesong Luo
Thanks Mike, I tried using dismax and it seems to work. The only problem
is that I cannot use a wildcard in the query string if I specify
qt=dismax.

I have a default search field called TITLE (a TextField).
This one returns all engineers whose TITLE starts with engin:  /?q=engin*
This one does not return anything:   /?q=engin*&qt=dismax

Do you know what the problem is?

Thanks
Xuesong


RE: question about highlight field

Chris Hostetter-3
In reply to this post by Xuesong Luo

: Thanks for the reply. I'm curious why we want to search one field but
: highlight different fields? Doesn't it make more sense to only highlight

consider a typical use case: you have an index of articles with
fields for the title, description, and body of the article.  you search
all of them, but on the search results page you only highlight matches in
the title and description (maybe you have a "cached" view of each article
where you display the stored contents of the article body with
highlighting)

: the query fields? In my example, if I search f1, f2, f3, most likely I
: course I can use hl.fl, but I think it make more sense for solr to
: automatically highlight those fields(rather than the default search
: field) for us.

that wasn't actually your example .. you weren't searching across fields
f1, f2 and f3; you were searching for words in the default field
("searchall") that happened to be made by combining the text from f1, f2,
and f3 using copyField.  As Mike pointed out, if you use dismax to
*really* search for your input in f1, f2, and f3 (using the qf param) then
Solr will highlight those fields for you.

you may be wondering about what happens when you do a search like...
   http://localhost:8983/solr/select?q=features%3Asolr&hl=on
...ie: your query string explicitly looks for solr in the field features.

Solr doesn't "guess" that you want to highlight the features field in this
case, it could -- but it would be a bad idea.  The hl.fl default
"guessing" logic is independent of the fields that appear in the query
string ... if you were relying on Solr to highlight your default search
field for you so you could display it in your application, you wouldn't be
very happy if your application broke because someone happened to do a
field-specific query.

-Hoss


RE: question about highlight field

Chris Hostetter-3
In reply to this post by Xuesong Luo

: is I could not use wildcard in the query string if I specify qt=dismax.

the dismax handler uses a much simpler query syntax than the
standard request handler.  Only +, -, and " are special characters, so
wildcards are not supported.



-Hoss


RE: question about highlight field

Xuesong Luo
In reply to this post by Xuesong Luo
Chris,
Thanks for your reply, I understand what you said. In my case, I have
the following requirements:
1. search on different fields
2. highlight the query string in the searched fields, not the default
   search field
3. use wildcards

I think the only option I have is to use the standard request handler,
specify which field I want to search, and add the same field to hl.fl,
something similar to ?q=TITLE:consult*&hl=on&hl.fl=TITLE, right?

Thanks
Xuesong


RE: question about highlight field

Chris Hostetter-3

: 1. search on different fields
: 2. highlight the query string in searching fields, not default search

: I think the only option I have is to use standard request handler and
: specify which field I want to search and add the same field to hl.fl,
: something similar to ?q=TITLE:consult*&hl=on&hl.fl=TITLE, right?

pretty much. if you want to ensure that your highlighting only applies to
things that actually result in query matches, set hl.requireFieldMatch=true;
that way in queries like this...

   ?q=DEK:albino+TITLE:elephant&hl=on&hl.fl=TITLE,DEK

...the word elephant won't be highlighted in the DEK field, and the word
albino won't be highlighted in the TITLE field.




-Hoss
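
The effect of hl.requireFieldMatch on the DEK/TITLE query above can be
illustrated with a toy model. This is not how Solr implements
highlighting; the function and its (field, term) input format are
hypothetical and only mirror the behavior described in the message
above.

```python
def terms_to_highlight(query_clauses, hl_fields, require_field_match=False):
    """Map each highlight field to the query terms that would be marked up in it.

    query_clauses is a list of (field, term) pairs, e.g. DEK:albino.
    """
    all_terms = [term for _, term in query_clauses]
    result = {}
    for field in hl_fields:
        if require_field_match:
            # Only terms that were queried against this very field.
            result[field] = [t for f, t in query_clauses if f == field]
        else:
            # Default: every query term is highlighted in every highlight field.
            result[field] = list(all_terms)
    return result


clauses = [("DEK", "albino"), ("TITLE", "elephant")]
loose = terms_to_highlight(clauses, ["TITLE", "DEK"])
strict = terms_to_highlight(clauses, ["TITLE", "DEK"], require_field_match=True)
print(loose)   # both terms under both fields
print(strict)  # elephant only under TITLE, albino only under DEK
```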


RE: question about highlight field

Xuesong Luo
In reply to this post by Xuesong Luo
Good point, I hadn't thought about it. It makes sense to use
requireFieldMatch in my case.

One more question about using wildcards. I found that if a wildcard is
used in the query, the highlighting element only shows the unique id; it
won't display the field information (see below, the arr section in blue
is returned). Is this the designed behavior?

?q=TITLE:consult*&hl=on&hl.fl=TITLE

<lst name="highlighting">
 <lst name="id1">
  <arr name="TITLE">
    <str><em>Consult</em>ant</str>
  </arr>
 </lst>
</lst>

 

 

Thanks

Xuesong

 


RE: question about highlight field

Chris Hostetter-3

: One more question about using wildcard. I found if wildcard is used in
: the query, the highlight elements only shows unique id, it won't display

: <lst name="highlighting">
:  <lst name="id1">
:   <arr name="TITLE">
:     <str><em>Consult</em>ant</str>

your description of the problem doesn't seem to match what you've pasted
... it looks like it's highlighting just the prefix from the query.

You're using Solr 1.1 right?

Unfortunately, i think you are damned if you do, damned if you don't ...
in Solr 1.1, highlighting used the info from the raw query to do
highlighting, hence in your query for consult* it would highlight
the Consult part of Consultant even though the prefix query was matching
the whole word.  In the trunk (soon to be Solr 1.2) Mike fixed that so the
query is "rewritten" to its expanded form before highlighting is done ...
this works great for true wildcard queries (ie: cons*t* or cons?lt*), but
Solr has an optimization for prefix queries (ie: consult*) to reduce the
likelihood of Solr crashing if the prefix matches a lot of terms ...
unfortunately this breaks highlighting of prefix queries, and no one has
implemented a solution yet...

https://issues.apache.org/jira/browse/SOLR-195




-Hoss
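
As a rough illustration of why rewriting a wildcard query to its
expanded form matters for highlighting, here is a toy model in Python.
It is not Solr or Lucene code: the term set, both helper functions, and
the whole-word matching are assumptions made purely for this sketch.

```python
from fnmatch import fnmatchcase


def expand_wildcard(pattern, index_terms):
    """Expand a wildcard pattern into the concrete index terms it matches."""
    return sorted(t for t in index_terms if fnmatchcase(t, pattern))


def highlight(text, terms):
    """Wrap every word whose lowercase form is a known term in <em> tags."""
    return " ".join(
        f"<em>{w}</em>" if w.lower() in terms else w for w in text.split()
    )


# A made-up set of indexed terms for the TITLE field.
index_terms = {"consult", "consultant", "consulting", "construct", "senior"}

expanded = expand_wildcard("cons?lt*", index_terms)
print(expanded)
print(highlight("Senior Consultant", expanded))

# A term-based highlighter given only the literal pattern finds nothing to mark.
print(highlight("Senior Consultant", ["cons?lt*"]))
```

In this model the highlighter only sees concrete terms, so a pattern
that is never expanded into terms leaves the text unmarked.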


RE: question about highlight field

Xuesong Luo
In reply to this post by Xuesong Luo
Yes, I'm using 1.1. The example in my last email is the expected result,
not the real result. Indeed, I didn't see the arr element in the
highlighting element when either a prefix wildcard or a true wildcard
query is used.
I just tried the nightly build; as you said, it works great except for
prefix wildcards.

Thanks for your help!
Xuesong



RE: question about highlight field

Chris Hostetter-3

: Yes, I'm using 1.1. The example in my last email is an expected result,
: not the real result. Indeed I didn't see the arr element in the
: highlighting element when either prefix wildcard or true wildcard query

Hmmm... yes, i'm sorry i wasn't thinking clearly -- that makes sense since
in 1.1 the queries weren't being rewritten at all and so extractTerms
wouldn't work.



-Hoss