Problem Highlighting

khirb7
Hello everybody,

Here is my problem:

When I use highlighting, Solr returns only the best fragment (the most relevant section of the document), like this:
"Nicolas "Sarkozy" naît le 28 janvier 1955 dans le 17e"
But I want Solr to return not just the single best section but the N best sections, where I specify N myself.
At first I thought that hl.snippets=<number> would generate the best sections of text, but I noticed that this parameter has no effect on the highlighting result, even when I set it per field like this:
http://localhost:8983/solr/select?indent=on&version=2.2&q=arcDoc%3Asarkozy&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explainOther=&hl=on&hl.fl=arcDoc&f.arcDoc.hl.snippets=3&hl.fragsize=300

The result I want would contain, for example, the best 3 sections, like this:
"Nicolas Sarkozy naît le 28 janvier 1955 dans le 17e ... Lorsque Paul
 Sarkozy quitte le domicile conjugal en 1959 et ... Paul Sarkozy se ..."

I found the following in the source code of HighlightingUtils.class and GapFragmenter.class:

    // get highlighter, and number of fragments for this field
    Highlighter highlighter = getHighlighter(query, fieldName, req);
    int numFragments = getMaxSnippets(fieldName, req);

    ......................
    ......................
    frag = highlighter.getBestTextFragments(tstream, docTexts[0], false, numFragments);

But why is numFragments always 1? Is this a known bug, or have I forgotten something in my request or a config parameter?

My other question: why are there similar classes with different names (HighlightingUtils.class and GapFragmenter.class), and which one is actually used?

Thank you in advance.

Re: Problem Highlighting

hossman
: But I want Solr to return not just the single best section but the N best
: sections, where I specify N myself.
: At first I thought that hl.snippets=<number> would generate the best
: sections of text, but I noticed that this parameter has no effect on the
: highlighting result, even when I set it per field like this:

note the documentation for hl.snippets ...

http://wiki.apache.org/solr/HighlightingParameters#head-23ecd5061bc2c86a561f85dc1303979fe614b956

        The maximum number of highlighted snippets to generate per field.
        Note: it is possible for any number of snippets from zero to this
        value to be generated.
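
To illustrate the "zero to this value" wording, here is a toy sketch in plain Java. It is not Lucene's or Solr's fragmenter: the class SnippetCapDemo, the method bestFragments, and the fixed-size chunking are all assumptions made for illustration. It only shows that the requested number is an upper bound, so a text with a single matching fragment yields a single snippet even when 3 are requested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class SnippetCapDemo {

    // Hypothetical helper (not a Solr or Lucene API): cut the text into
    // fixed-size chunks and return at most maxSnippets chunks that contain
    // the term, in document order. A term that straddles a chunk boundary
    // is missed; that is acceptable for this toy model.
    public static List<String> bestFragments(String text, String term,
                                             int fragSize, int maxSnippets) {
        List<String> hits = new ArrayList<>();
        String needle = term.toLowerCase(Locale.ROOT);
        for (int start = 0;
             start < text.length() && hits.size() < maxSnippets;
             start += fragSize) {
            int end = Math.min(start + fragSize, text.length());
            String fragment = text.substring(start, end);
            if (fragment.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(fragment);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String unit = "xx sarkozy xxxxxxxx "; // exactly 20 characters
        String fourMatches = unit + unit + unit + unit;
        String oneMatch = unit + "no match in this part of the text";

        // Asking for 3 snippets caps the result when more chunks match ...
        System.out.println(bestFragments(fourMatches, "sarkozy", 20, 3).size());
        // ... but cannot create snippets where only one chunk matches.
        System.out.println(bestFragments(oneMatch, "sarkozy", 20, 3).size());
    }
}
```

The real highlighter scores and selects fragments differently, but the same bound applies: the parameter limits how many snippets come back, it does not guarantee that many.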

I'm not positive, but I believe setting hl.mergeContiguous=false (a new
param in trunk) may help ... but it doesn't change the fact that if you
only have one match, it can't generate 3 snippets.
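
A request along those lines could be assembled as in the sketch below. The host, path, and field name are taken from the URL in the original question; the class and method names are hypothetical, and whether hl.mergeContiguous is honored depends on the Solr build in use, since it is described above as new in trunk.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class HighlightRequestSketch {

    // URL-encode one query-string component as UTF-8.
    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: build a select URL that asks for up to
    // `snippets` highlighted fragments of `fragSize` characters on one
    // field, with merging of contiguous fragments switched off.
    public static String buildUrl(String field, String term,
                                  int snippets, int fragSize) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", field + ":" + term);
        params.put("hl", "on");
        params.put("hl.fl", field);
        params.put("f." + field + ".hl.snippets", String.valueOf(snippets));
        params.put("hl.fragsize", String.valueOf(fragSize));
        params.put("hl.mergeContiguous", "false");

        StringBuilder url = new StringBuilder("http://localhost:8983/solr/select?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                url.append('&');
            }
            url.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
            first = false;
        }
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("arcDoc", "sarkozy", 3, 300));
    }
}
```

Even with this request, the number of snippets returned is still bounded by how many distinct matching fragments the field actually contains.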

: My other question: why are there similar classes with different names
: (HighlightingUtils.class and GapFragmenter.class), and which one is
: actually used?

note that HighlightingUtils is deprecated.


-Hoss