hi all


Bin Shi
Hi all,
    I have just followed Mr. Jack Tang's solution for adopting the CJK
analyzer in Nutch 0.6. I know that the solution is not perfect. In fact,
I cannot get any results returned. Can anyone help me adopt the CJK
analyzer in Nutch?
   Any response is greatly appreciated!

Best Regards

Re: hi all

Jack.Tang
Hi Bin

The simplest way is to write cjk-index-basic and cjk-query-basic plugins
and replace index-basic and query-basic with them. Writing them
is quite simple: you can use CJKTokenizer and CJKAnalyzer from the Lucene
project. Also take care with the query syntax characters in Nutch:

  // query syntax characters
| <PLUS: "+" >
| <MINUS: "-" >
| <QUOTE: "\"" >
| <COLON: ":" >
| <SLASH: "/" >
| <DOT: "." >
| <ATSIGN: "@" >
| <APOSTROPHE: "'" >
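
To give a rough idea of what bigram-style CJK segmentation looks like,
here is a minimal sketch in plain Java. It is not the Lucene
CJKTokenizer code and does not use any Nutch plugin API; the class and
method names (CjkBigramSketch, bigrams, isQuerySyntax) are made up for
illustration. It assumes that only CJK characters matter: each run of
CJK characters becomes overlapping two-character tokens, and everything
else, including the query syntax characters listed above, just ends a
run.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Lucene's CJKTokenizer.
public class CjkBigramSketch {

    // The query syntax characters from the list above.
    static final String QUERY_SYNTAX = "+-\":/.@'";

    static boolean isQuerySyntax(char c) {
        return QUERY_SYNTAX.indexOf(c) >= 0;
    }

    // Assumption: these four Unicode blocks are enough for a demo.
    static boolean isCjk(char c) {
        Character.UnicodeBlock b = Character.UnicodeBlock.of(c);
        return b == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
            || b == Character.UnicodeBlock.HIRAGANA
            || b == Character.UnicodeBlock.KATAKANA
            || b == Character.UnicodeBlock.HANGUL_SYLLABLES;
    }

    // Split text into overlapping bigrams, one CJK run at a time.
    // Non-CJK characters are skipped and only act as run boundaries.
    static List<String> bigrams(String text) {
        List<String> tokens = new ArrayList<>();
        int runStart = -1;
        for (int i = 0; i <= text.length(); i++) {
            boolean cjk = i < text.length() && isCjk(text.charAt(i));
            if (cjk && runStart < 0) {
                runStart = i;                      // a CJK run begins
            } else if (!cjk && runStart >= 0) {
                emitRun(text, runStart, i, tokens); // the run just ended
                runStart = -1;
            }
        }
        return tokens;
    }

    // A run of length 1 is emitted as a single-character token.
    static void emitRun(String text, int start, int end, List<String> out) {
        if (end - start == 1) {
            out.add(text.substring(start, end));
            return;
        }
        for (int i = start; i + 1 < end; i++) {
            out.add(text.substring(i, i + 2));
        }
    }

    public static void main(String[] args) {
        System.out.println(bigrams("中文分词"));
    }
}
```

For a four-character run the sketch produces three overlapping tokens.
The same segmentation has to be applied on both the index side and the
query side, otherwise the query terms will not match the indexed terms.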

I will share my failures and successes with CJK segmentation later.

/Jack

On 7/12/05, Bin Shi <[hidden email]> wrote:
> Hi all;
>    I just have followed Mr. Jack Tang's solution to adopt CJK
> analyzer into Nutch 0.6. I know that solution is not perfect. In fact,
> I can not get result returned. Can anyone help to adopt CJK analyzer
> into Nutch?
>   Any response is greatly appreciated!
>
> Best Regards
>


--
Keep Discovering ... ...
http://www.jroller.com/page/jmars