Underlined Phrases

ocramp
Hi,

 If you have a page named chocolate_cake.html and you search for
"chocolate", the page will not be found.
 Do you know a quick way to make Nutch retrieve chocolate_cake.html for
a "chocolate" or "cake" search?

 I'm not sure, but I think it is related to the text analysis step of
the program. In Lucene there are several analyzers from which you can
choose before indexing pages.
 I hope that makes sense.

Thanks,
Marco

Re: Underlined Phrases

ocramp
Solved!

 For my purposes, I did a Java string replace of "_" with " " and it is
fine now.
 I think this is related to tokenizers.
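
 A minimal sketch of that workaround, for anyone hitting the same thing.
It is not the actual Nutch code: the class name, both method names, and the
toy tokenizer are hypothetical, and the tokenizer simply assumes that "_"
is treated as part of a word, which would explain why "chocolate_cake" ends
up as a single term. Where exactly to apply the replace (for example, on
the text before it is handed to the indexer) depends on your setup.

```java
import java.util.ArrayList;
import java.util.List;

class UnderscoreNormalizer {

    // Replace every underscore with a space so that a word-based
    // tokenizer sees "chocolate" and "cake" as separate terms.
    public static String normalize(String text) {
        return text.replace('_', ' ');
    }

    // Hypothetical stand-in for an analyzer (NOT the Nutch/Lucene one):
    // lowercases the input and splits on anything that is not a letter,
    // a digit, or an underscore, so "_" stays inside a token.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9_]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String name = "chocolate_cake.html";
        // Without the replace, the underscore glues the two words together.
        System.out.println(tokenize(name));            // [chocolate_cake, html]
        // With the replace, each word becomes its own searchable term.
        System.out.println(tokenize(normalize(name))); // [chocolate, cake, html]
    }
}
```

 The replace is blunt: it also splits identifiers where the underscore is
meaningful, so it is probably best limited to fields like URLs and titles.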

Thanks anyway,
Marco

On 8/16/06, Marco Vanossi <[hidden email]> wrote:

>
> Hi,
>
>  If you have a page named: chocolate_cake.html and if you search for
> "chocolate", the page will not be found.
>  Do you know a quick solution to make Nutch retrieve chocolate_cake.html
> for a chocolate or cake search?
>
>  I'm not very sure but I think it is related to the words analyzing
> section of the program.  In Lucene there are  several  analyzers from which
> you can choose before indexing pages.
>  Hope you get me.
>
> Thanks,
> Marco
>