SOLR/LUCENE 5.2.1: Solution of CharTermAtt, StartOffset, EndOffset, Position


vankimchau
Hi,

I'm looking for a solution to the following problem in Solr/Lucene 5.2.1.
Example text: "fast wi fi network is down". With solr.StandardTokenizerFactory, the tokens get the following positions: fast (1) -> wi (2) -> fi (3) -> network (4) -> is (5) -> down (6). I need to write a custom class that analyzes the same text, "fast wi fi network is down", so that the positions come out as fast (1) -> fi (2) -> is (3), or wi (1) -> network (2) -> down (3). I know it involves startOffset, endOffset, and so on, but I can't figure out how to solve it.
Thanks in advance!






---------------------------
VĂN KIM CHÂU
[P]: +84.933.233.047
Re: SOLR/LUCENE 5.2.1: Solution of CharTermAtt, StartOffset, EndOffset, Position

Shai Erera
I think you can just write a TokenFilter which sets the
PositionIncrementAttribute of every other token to 0. Then you can use
StandardTokenizer and wrap it with that filter.
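
To illustrate the effect, here is a minimal plain-Java sketch that only models the position arithmetic; it is not Lucene code, and the class and method names (AlternatingPositions, alternatingIncrements, positions) are made up for this example. In a real filter the same logic would sit in incrementToken() of a TokenFilter subclass and call PositionIncrementAttribute.setPositionIncrement(0) on the chosen tokens. The sketch assumes the zero increment goes on the 2nd, 4th, 6th, ... token, because as far as I know indexing rejects a first token whose increment is 0.

```java
public class AlternatingPositions {

    /** Increments the tokens would carry after the filter: 1, 0, 1, 0, ... */
    public static int[] alternatingIncrements(int tokenCount) {
        int[] increments = new int[tokenCount];
        for (int i = 0; i < tokenCount; i++) {
            // Even-indexed tokens (1st, 3rd, ...) advance the position;
            // odd-indexed tokens (2nd, 4th, ...) are stacked on the previous one.
            increments[i] = (i % 2 == 0) ? 1 : 0;
        }
        return increments;
    }

    /** A token's position is the running sum of the increments (1-based here). */
    public static int[] positions(int[] increments) {
        int[] positions = new int[increments.length];
        int current = 0;
        for (int i = 0; i < increments.length; i++) {
            current += increments[i];
            positions[i] = current;
        }
        return positions;
    }

    public static void main(String[] args) {
        String[] tokens = "fast wi fi network is down".split(" ");
        int[] pos = positions(alternatingIncrements(tokens.length));
        for (int i = 0; i < tokens.length; i++) {
            System.out.println(tokens[i] + " (" + pos[i] + ")");
        }
    }
}
```

With these increments, fast, fi and is land on positions 1, 2 and 3, and wi, network and down share those same three positions, which matches both sequences asked for in the question.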

Shai
On Aug 8, 2015 6:33 AM, "Văn Châu" <[hidden email]> wrote:

> Hi,
>
> I'm looking for a solution to the following problem in Solr/Lucene 5.2.1.
> Example text: "fast wi fi network is down". With
> solr.StandardTokenizerFactory, the tokens get the following positions:
> fast (1) -> wi (2) -> fi (3) -> network (4) -> is (5) -> down (6). I need
> to write a custom class that analyzes the same text, "fast wi fi network
> is down", so that the positions come out as fast (1) -> fi (2) -> is (3),
> or wi (1) -> network (2) -> down (3). I know it involves startOffset,
> endOffset, and so on, but I can't figure out how to solve it.
> Thanks in advance!
>
>
> [image: Inline image 1]
>
>
>
> ---------------------------
> VĂN KIM CHÂU
> [P]: +84.933.233.047
>