Highlighting question

Highlighting question

Stephen Green-2
Hi, folks.  I'm trying to get a very simple example working with Solr
highlighting.  I have a default search field (called, unsurprisingly,
"default-search-field") with text in it and I want query terms to be
highlighted in that field when I do a search.

I'm using an up-to-date (as of this evening) checkout of 1.4.  My
solrconfig.xml contains the following highlighting element:

    <highlighting>
        <!-- Configure the standard fragmenter -->
        <!-- This could most likely be commented out in the "default" case -->
        <fragmenter name="gap"
                    class="org.apache.solr.highlight.GapFragmenter" default="true">
            <lst name="defaults">
                <int name="hl.fragsize">200</int>
            </lst>
        </fragmenter>

        <!-- A regular-expression-based fragmenter (f.i., for sentence extraction) -->
        <fragmenter name="regex"
                    class="org.apache.solr.highlight.RegexFragmenter">
            <lst name="defaults">
                <!-- slightly smaller fragsizes work better because of slop -->
                <int name="hl.fragsize">170</int>
                <!-- allow 50% slop on fragment sizes -->
                <float name="hl.regex.slop">0.5</float>
                <!-- a basic sentence pattern -->
                <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str>
            </lst>
        </fragmenter>

        <!-- Configure the standard formatter -->
        <formatter name="html"
                   class="org.apache.solr.highlight.HtmlFormatter" default="true">
            <lst name="defaults">
                <str name="hl.simple.pre"><![CDATA[<span class="ht">]]></str>
                <str name="hl.simple.post"><![CDATA[</span>]]></str>
            </lst>
        </formatter>
    </highlighting>

I'm using SolrJ to talk to the Solr server.  Here's the code to do a
query, where qs is the query string.

            SolrQuery q = new SolrQuery(qs);
            q.setQueryType("dismax");
            q.setHighlight(true);
            q.setHighlightFragsize(250);
            q.set("hl.formatter", "html");
            q.set("hl.fragmenter", "regex");
            q.setFields("default-search-field", "key");
            QueryResponse resp = solr.query(q);

I've set up the dismax handler in solrconfig.xml to search the
default-search-field.

The Solr server logs the following request for this:

INFO: [] webapp=/solr path=/select
params={hl.fragsize=250&fl=default-search-field,key&hl.fragmenter=regex&q=garbage&hl.formatter=html&qt=dismax&wt=javabin&hl=true&version=1}
hits=953 status=0 QTime=39

which looks about right to me, but I don't see any highlighting in the results.

I'm clearly missing something pretty fundamental here, and any help
would be appreciated.

Steve Green

Re: Highlighting question

Erik Hatcher
Is default-search-field stored (as specified in schema.xml)?

        Erik


On Aug 3, 2009, at 8:05 PM, Stephen Green wrote:

> Hi, folks.  I'm trying to get a very simple example working with Solr
> highlighting.  I have a default search field (called, unsurprisingly
> "default-search-field") with text in it and I want query terms to be
> highlighted in that field when I do a search.
>
> I'm using an up to date (as of this evening) checkout of 1.4.  My
> solrconfig.xml contains the following highlighting element:
>
>    <highlighting>
>        <!-- Configure the standard fragmenter -->
>        <!-- This could most likely be commented out in the "default" case -->
>        <fragmenter name="gap"
>                    class="org.apache.solr.highlight.GapFragmenter" default="true">
>            <lst name="defaults">
>                <int name="hl.fragsize">200</int>
>            </lst>
>        </fragmenter>
>
>        <!-- A regular-expression-based fragmenter (f.i., for sentence extraction) -->
>        <fragmenter name="regex"
>                    class="org.apache.solr.highlight.RegexFragmenter">
>            <lst name="defaults">
>                <!-- slightly smaller fragsizes work better because of slop -->
>                <int name="hl.fragsize">170</int>
>                <!-- allow 50% slop on fragment sizes -->
>                <float name="hl.regex.slop">0.5</float>
>                <!-- a basic sentence pattern -->
>                <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str>
>            </lst>
>        </fragmenter>
>
>        <!-- Configure the standard formatter -->
>        <formatter name="html"
>                   class="org.apache.solr.highlight.HtmlFormatter" default="true">
>            <lst name="defaults">
>                <str name="hl.simple.pre"><![CDATA[<span class="ht">]]></str>
>                <str name="hl.simple.post"><![CDATA[</span>]]></str>
>            </lst>
>        </formatter>
>    </highlighting>
>
> I'm using SolrJ to talk to the Solr server.  Here's the code to do a
> query, where qs is the query string.
>
>            SolrQuery q = new SolrQuery(qs);
>            q.setQueryType("dismax");
>            q.setHighlight(true);
>            q.setHighlightFragsize(250);
>            q.set("hl.formatter", "html");
>            q.set("hl.fragmenter", "regex");
>            q.setFields("default-search-field", "key");
>            QueryResponse resp = solr.query(q);
>
> I've set up the dismax handler in solrconfig.xml to search the
> default-search-field.
>
> The Solr server logs the following request for this:
>
> INFO: [] webapp=/solr path=/select
> params={hl.fragsize=250&fl=default-search-field,key&hl.fragmenter=regex&q=garbage&hl.formatter=html&qt=dismax&wt=javabin&hl=true&version=1}
> hits=953 status=0 QTime=39
>
> which looks about right to me, but I don't see any highlighting in
> the results.
>
> I'm clearly missing something pretty fundamental here, and any help
> would be appreciated.
>
> Steve Green


Re: Highlighting question

Stephen Green-2
On Mon, Aug 3, 2009 at 8:34 PM, Erik Hatcher<[hidden email]> wrote:
> Is default-search-field stored (as specified in schema.xml)?

Yep:

    <field name="default-search-field"
        type="html" indexed="true" stored="true"
        termVectors="true" multiValued="true"/>

While trying to figure this out, I went and did ant run-examples to
bring up the example in Jetty (I'm using Tomcat), and tried a couple
of queries in the resulting /solr/admin, and they don't appear to be
highlighted either.

Steve Green

Re: Highlighting question

Stephen Green-2
On Mon, Aug 3, 2009 at 8:38 PM, Stephen Green<[hidden email]> wrote:

> On Mon, Aug 3, 2009 at 8:34 PM, Erik Hatcher<[hidden email]> wrote:
>> Is default-search-field stored (as specified in schema.xml)?
>
> Yep:
>
>    <field name="default-search-field"
>        type="html" indexed="true" stored="true"
>        termVectors="true" multiValued="true"/>
>
> While trying to figure this out, I went and did ant run-examples to
> bring up the example in Jetty (I'm using Tomcat), and tried a couple
> of queries in the resulting /solr/admin, and they don't appear to be
> highlighted either.

Actually, if I check the highlighting box in the "full interface"
query option in the Solr admin panel, I notice that an element like:

<lst name="highlighting">
<lst name="SOLR1000"/>
</lst>

is added to the end of the results that are returned.

Oh, and thanks for the fast response, Erik :-)

Steve Green

Re: Highlighting question

Stephen Green-2
On Mon, Aug 3, 2009 at 8:45 PM, Stephen Green<[hidden email]> wrote:

> On Mon, Aug 3, 2009 at 8:38 PM, Stephen Green<[hidden email]> wrote:
>> On Mon, Aug 3, 2009 at 8:34 PM, Erik Hatcher<[hidden email]> wrote:
>>> Is default-search-field stored (as specified in schema.xml)?
>>
>> Yep:
>>
>>    <field name="default-search-field"
>>        type="html" indexed="true" stored="true"
>>        termVectors="true" multiValued="true"/>
>>
>> While trying to figure this out, I went and did ant run-examples to
>> bring up the example in Jetty (I'm using Tomcat), and tried a couple
>> of queries in the resulting /solr/admin, and they don't appear to be
>> highlighted either.
>
> Actually, if I check the highlighting box in the "full interface"
> query option in the Solr admin panel, I notice that an element like:
>
> <lst name="highlighting">
> <lst name="SOLR1000"/>
> </lst>
>
> Is added to the end of the results that are returned.
>
> Oh, and thanks for the fast response, Erik :-)

OK, I think I might just be dumb.  The query response has a separate
highlighting section holding the highlighted snippets, keyed by the
documents they belong to.  There's enough information there to create
the highlighted representation that I want.

Duh.

Steve Green
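
For anyone landing on this thread later, here is a rough sketch of the
step Steve describes: matching the highlighted snippets back up with
the documents.  In SolrJ the highlighting section should be available
from QueryResponse.getHighlighting(), which returns a map from document
key to field name to a list of snippets.  Everything else below is an
assumption for illustration: the HighlightMerger class, the snippetsFor
helper, the document keys "doc1" and "doc2", and the sample text are
hypothetical, and the sketch uses plain java.util maps in place of a
live Solr response so it can run on its own.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HighlightMerger {

    /**
     * Returns the highlighted snippets for one document and field, or the
     * stored value as a single-element list when no snippets were produced.
     */
    public static List<String> snippetsFor(
            Map<String, Map<String, List<String>>> highlighting,
            String docKey, String field, String storedValue) {
        Map<String, List<String>> perField = highlighting.get(docKey);
        if (perField != null) {
            List<String> snippets = perField.get(field);
            if (snippets != null && !snippets.isEmpty()) {
                return snippets;
            }
        }
        // An empty entry, like <lst name="SOLR1000"/> above, means no
        // snippets were generated for this document: fall back to the
        // stored field value.
        return Collections.singletonList(storedValue);
    }

    public static void main(String[] args) {
        // Hypothetical data standing in for resp.getHighlighting().
        Map<String, Map<String, List<String>>> highlighting =
                new LinkedHashMap<String, Map<String, List<String>>>();

        Map<String, List<String>> doc1 = new LinkedHashMap<String, List<String>>();
        doc1.put("default-search-field", Collections.singletonList(
                "take out the <span class=\"ht\">garbage</span>"));
        highlighting.put("doc1", doc1);

        // doc2 matched the query but has no highlighted snippets.
        highlighting.put("doc2", new LinkedHashMap<String, List<String>>());

        System.out.println(snippetsFor(highlighting, "doc1",
                "default-search-field", "take out the garbage"));
        System.out.println(snippetsFor(highlighting, "doc2",
                "default-search-field", "unrelated text"));
    }
}
```

With a real response, the outer map key would be the value of the
schema's unique key field for each returned document (presumably "key"
in Steve's setup, though the thread does not confirm that), so the
lookup can be done while iterating over the result documents.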