Accented chars (Portuguese)

Lucas F. A. Teixeira-2
Hello all,

I'm using the solr.ISOLatin1AccentFilterFactory TokenFilter in my schema.xml inside both the <index> and <query> tags, but I keep having problems with accented characters in Portuguese (áéíóúàèìòùãĩõũäëïöü.....), and this makes my search engine handle this type of query abnormally.

I think the ISOLatin1 filter is working, since I get the same results whether or not I search with the accented characters. My problem is that the filter seems to just ignore these characters rather than replace them with their unaccented equivalents (as its documentation says). For example, I've indexed one document with the title:

Barraca Cocoricó - Multibrink

When I query for the word cocoricó, I can't get the document. When I search for cocorico, I still can't get it. But when I search for cocoric, my document is returned.
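For reference, the folding that the filter's documentation describes can be sketched outside Solr. This is only a minimal Python illustration using Unicode decomposition, not the filter's actual implementation, and fold_accents is a hypothetical helper name:

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Replace accented letters with their unaccented base letters."""
    # NFD splits a letter such as "ó" into "o" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep only the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_accents("Barraca Cocoricó - Multibrink"))
# prints: Barraca Cocorico - Multibrink
```

If folding like this were applied on both the index and query side, cocoricó and cocorico should both reduce to the same term.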

This is my indexing schema

Has anybody had this same problem before?

Thank you all,

[]s,

Lucas

RE: Accented chars (Portuguese)

steve_rowe
Hi Lucas,

Are you using any stemming?
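
To illustrate why stemming is worth checking, here is a toy Python sketch of one way the symptoms could arise. It assumes, purely for illustration, that a stemmer runs on the index side only; toy_stem is a hypothetical stand-in that strips one trailing vowel and is not Solr's Portuguese stemmer:

```python
def toy_stem(token: str) -> str:
    """Hypothetical stemmer: strip one trailing vowel from longer tokens."""
    return token[:-1] if len(token) > 3 and token[-1] in "aeiou" else token


def index_terms(text: str) -> set:
    # Assumed index-side chain: lowercase, split on whitespace, then stem.
    return {toy_stem(t) for t in text.lower().split()}


def query_terms(text: str) -> set:
    # Assumed query-side chain: lowercase and split only, no stemming.
    return set(text.lower().split())


# Title with the accent already folded, so only stemming is in play here.
indexed = index_terms("Barraca Cocorico Multibrink")

for q in ("cocorico", "cocoric"):
    print(q, bool(query_terms(q) & indexed))
# prints: cocorico False
# prints: cocoric True
```

Under that assumption the index holds "cocoric", so only the truncated query matches.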

Steve

On 02/28/2008 at 6:50 AM, Lucas Teixeira wrote:

> Hello all,
>
> I'm using the solr.ISOLatin1AccentFilterFactory TokenFilter in my
> schema.xml inside both the <index> and <query> tags, but I keep having
> problems with accented characters in Portuguese
> (áéíóúàèìòùãĩõũäëïöü.....), and this makes my search engine handle
> this type of query abnormally.
>
> I think the ISOLatin1 filter is working, since I get the same results
> whether or not I search with the accented characters. My problem is
> that the filter seems to just ignore these characters rather than
> replace them with their unaccented equivalents (as its documentation
> says). For example, I've indexed one document with the title:
>
> Barraca Cocoricó - Multibrink
>
> When I query for the word cocoricó, I can't get the document. When I
> search for cocorico, I still can't get it. But when I search for
> cocoric, my document is returned.
>
> This is my indexing schema
>
> Has anybody had this same problem before?
>
> Thank you all,
>
> []s,
>
> Lucas