Highlight in a response writer, bad practice ?


Frédéric Glorieux
Hi all,

Tests are going very well with Solr. I'm working on an academic
project with few users but high demands, which explains the
background of my question. For linguistic work, searching is a goal
in itself; retrieving a document may be secondary. That's why it's
common to serve thousands of "highlighted snippets" (in Solr terms)
from the results, like a "concordancer".

In such cases, it seems memory-expensive to prepare really big snippet
lists in StandardRequestHandler. I'm beginning to make it work from a
ResponseWriter (thanks for all the needed code already in
HighlightingUtils), so that snippets are written directly to the
response without being stored first.

Before working too much on this code, is it good practice? Did I miss
an important reason for the current design? I understand the choice of
StandardRequestHandler for normal search engine usage (paged results):
it avoids replicating code in each ResponseWriter (XML, JSON...).
Am I wrong?

If the Solr/Lucene gurus have time to listen, I will also need some
information about the highlighter, for another post.

--
Frédéric Glorieux
École nationale des chartes
direction des nouvelles technologies et de l'informatique

Re: Highlight in a response writer, bad practice ?

Yonik Seeley-2
On 6/6/07, Frédéric Glorieux <[hidden email]> wrote:

> Tests are going very well with Solr. I'm working for an academic
> project, with not a lot of users, but with high demands, this will
> explain the background of my question. For linguistic activities,
> searching is a goal by itself, retrieving a document may be second.
> That's why it's common to serve thousands of "highlighted snippets" from
> results (in solr terms), like a "concordancer".
>
> In such cases, it seems memory expensive to prepare really big snippets
> lists from StandardRequestHandler. I'm beginning to make it work from a
> ResponseWriter (thanks for all needed code already in
> HighlightingUtils), so that snippets are directly written to the
> response, without storing.
> Before working too much on this code, is it good practice ? Did I miss
> an important reason ?

Simplicity.  The memory usage for highlight fields in normal responses
is not an issue.
If it becomes an issue for you, then you're roughly taking the right approach.

However, rather than writing your own response writer to solve your
issue, you might consider just writing your own request handler and
inserting an Iterable into the response (it will be written as an
array by the response writer). This way, all response writers (XML,
JSON, etc.) will work.

-Yonik
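
An illustrative sketch of the Iterable idea, in plain Java rather than
against Solr's actual API: each snippet is computed only when the
consumer asks for it, so the full list is never held in memory. The
class name LazySnippets and the highlighter function are hypothetical,
standing in for whatever the handler uses to build one snippet per
document id.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.IntFunction;

/** Produces one snippet per document id on demand; nothing is buffered. */
final class LazySnippets implements Iterable<String> {
    private final int[] docIds;
    // Hypothetical stand-in for the real highlighting call.
    private final IntFunction<String> highlighter;

    LazySnippets(int[] docIds, IntFunction<String> highlighter) {
        this.docIds = docIds.clone();
        this.highlighter = highlighter;
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < docIds.length;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                // The snippet is built here, at the moment it is consumed.
                return highlighter.apply(docIds[next++]);
            }
        };
    }
}
```

A handler would place such an object in the response in place of a
prebuilt list; whichever writer serializes it then pulls snippets one
at a time.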

Re: Highlight in a response writer, bad practice ?

Frédéric Glorieux

> Simplicity.

The best answer :o)

> The memory usage for highlight fields in normal responses
> is not an issue.
> If it becomes an issue for you, then you're roughly taking the right
> approach.
>
> However, rather than write your own response writer to solve your
> issue, you might consider
> just your own response handler,

I should, but perhaps not for the same reasons as below.

> and insert an Iterable (which will be
> written as an array in the response writer).  This way, all response
> writers (xml, json, etc) will work.

In my opinion, a KWIC view <http://en.wikipedia.org/wiki/KWIC> could
be just a response writer, with its own configuration parameters
(like the size of lines), open to multiple types of queries. The only
input needed is an implementation of a hits object.
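
To make the KWIC idea concrete, here is a minimal sketch in plain Java
of formatting a single concordance line. It assumes the match offsets
are already known; the class name Kwic, the method line, and the width
parameter (the "size of lines" setting) are hypothetical and not part
of Solr.

```java
/** Formats keyword-in-context lines so that keywords align in a column. */
final class Kwic {
    private Kwic() {}

    /**
     * Returns the left context, the bracketed keyword and the right context.
     * Each side holds at most width characters; the left side is padded
     * with spaces so every keyword starts in the same column.
     */
    static String line(String text, int start, int end, int width) {
        if (start < 0 || end > text.length() || start > end || width < 0) {
            throw new IllegalArgumentException("invalid offsets or width");
        }
        String left = text.substring(Math.max(0, start - width), start);
        String right = text.substring(end, Math.min(text.length(), end + width));

        StringBuilder sb = new StringBuilder();
        // Pad when the match sits near the beginning of the text.
        for (int i = left.length(); i < width; i++) {
            sb.append(' ');
        }
        sb.append(left)
          .append('[').append(text, start, end).append(']')
          .append(right);
        return sb.toString();
    }
}
```

For example, Kwic.line("the quick brown fox", 4, 9, 6) returns
"  the [quick] brown".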

I will try to design it as generically as I can, in case someone
else could find a use for it...


--
Frédéric Glorieux
École nationale des chartes
direction des nouvelles technologies et de l'informatique