Get TermVectors for query hits only

Get TermVectors for query hits only

Walter Ravenek
Hi all,

When I use the TermVectorComponent, I receive term vectors containing all
tokens in the documents that meet my search criteria. I would like to get
the offsets for only those terms that actually match the search criteria.
My documents are about 200 K and are in XML. If I have just the offsets
for the hits, I can easily implement my own highlighting on the client
side.

Does anyone know how to go about doing this?
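
To make the client-side step concrete, here is a minimal sketch of highlighting from offsets alone. It assumes the offsets are half-open character ranges into the same text the client holds. OffsetHighlighter, Span, and the marker strings are hypothetical names for illustration, not part of Solr or Lucene.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration; not part of Solr or Lucene.
class OffsetHighlighter {

    /** A half-open character range [start, end) in the original text. */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Wraps each span in the given markers; invalid or overlapping spans are skipped. */
    static String highlight(String text, List<Span> spans, String pre, String post) {
        // Sort a copy by start offset so the text can be emitted left to right.
        List<Span> sorted = new ArrayList<>(spans);
        sorted.sort(Comparator.comparingInt((Span s) -> s.start));

        StringBuilder out = new StringBuilder();
        int cursor = 0; // first character not yet copied to the output
        for (Span s : sorted) {
            boolean invalid = s.start < cursor || s.end < s.start || s.end > text.length();
            if (invalid) {
                continue; // overlaps an earlier span or falls outside the text
            }
            out.append(text, cursor, s.start);                         // unhighlighted gap
            out.append(pre).append(text, s.start, s.end).append(post); // the hit itself
            cursor = s.end;
        }
        out.append(text, cursor, text.length()); // remainder after the last hit
        return out.toString();
    }
}
```

This only works if the client text is character-for-character the text that was analyzed at index time. With XML documents, any markup stripped before analysis would shift the offsets.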

Re: Get TermVectors for query hits only

Grant Ingersoll-2
I seem to recall that the Highlighter in Solr is pluggable, so you may
want to work at that level instead of on the client side. Otherwise, you
would likely have to implement your own TermVectorMapper and add that
capability to the TermVectorComponent, which then feeds your client.

For an example of using TermVectorMapper that is close to your problem,
though it does not solve it exactly, see
http://www.lucidimagination.com/blog/2009/05/26/accessing-words-around-a-positional-match-in-lucene/
but note that it works at the Lucene level.
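
As a rough illustration of the mapper idea, the sketch below keeps offsets only for terms that appear in the query and drops everything else. It is a self-contained stand-in, not Lucene code. QueryTermOffsetCollector and its simplified map(...) signature are hypothetical. A real implementation would extend Lucene's TermVectorMapper and apply the same filter inside its per-term callback.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a custom mapper; it does not depend on Lucene.
class QueryTermOffsetCollector {

    private final Set<String> queryTerms;
    // term -> list of {startOffset, endOffset} pairs, in the order received
    private final Map<String, List<int[]>> hits = new LinkedHashMap<>();

    QueryTermOffsetCollector(Set<String> queryTerms) {
        this.queryTerms = new HashSet<>(queryTerms);
    }

    /**
     * Called once per term in a document's term vector.
     * startOffsets[i] and endOffsets[i] describe the i-th occurrence.
     */
    void map(String term, int[] startOffsets, int[] endOffsets) {
        if (!queryTerms.contains(term)) {
            return; // not a query hit, so its offsets are never collected
        }
        List<int[]> spans = hits.computeIfAbsent(term, k -> new ArrayList<>());
        for (int i = 0; i < startOffsets.length; i++) {
            spans.add(new int[] { startOffsets[i], endOffsets[i] });
        }
    }

    /** Offsets collected so far, keyed by query term. */
    Map<String, List<int[]>> getHits() {
        return hits;
    }
}
```

The query terms passed in need to be in their analyzed form (for example lowercased or stemmed, depending on the field's analyzer), since that is how terms are stored in the term vector.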


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search

Re: Get TermVectors for query hits only

Walter Ravenek
Thanks Grant,

I think I get the idea.

