How to search for European words with and without special characters

Supriya Kumar Shyamal
Hi All,

I have a question about indexing and searching German characters.
For example, when I search for the word "müller", I also want to
match the word "mueller". How can I achieve this in Lucene?

Thanks,
supriya

--
Mit freundlichen Grüßen / Regards
 
Supriya Kumar Shyamal

Software Developer
tel +49 (30) 443 50 99 -22
fax +49 (30) 443 50 99 -99
email [hidden email]
___________________________
artnology GmbH
Milastr. 4
10437 Berlin
___________________________

http://www.artnology.com
__________________________________________________________________________

 News / Current projects:
 * artnology wins a tender from the Bundesministerium des Innern
   (Federal Ministry of the Interior): a software solution for managing
   the collection of contemporary artworks that represent the Federal
   Government culturally.

 Project references:
 * Global eShop and corporate site for Springer: www.springeronline.com
 * E-detailing portal for Novartis: www.interaktiv.novartis.de
 * Service center platform for Biogen: www.ms-life.de
 * eCRM system for Grünenthal: www.gruenenthal.com

___________________________________________________________________________


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

RE: How to search for European words with and without special characters

Mile Rosu
Hello Supriya,

One possibility is to search for both "müller" and "mueller" from the search interface. That means normalizing the search query in some way, for example by expanding it into both spellings, before running it. This solution does not affect the content of the existing index, so no reindexing is needed.
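
A minimal sketch of this query-side approach, in plain Java without any Lucene dependency. The class `UmlautQueryExpander` and its methods are hypothetical names, not part of Lucene. The sketch assumes the index stores lower-cased terms and that the resulting string is handed to a query parser that understands `OR`. Only the umlaut-to-digraph direction is handled, because the reverse ("ue" to "ü") is ambiguous.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper (not a Lucene class): expands a term into both spellings. */
final class UmlautQueryExpander {

    /** Applies the usual German transliteration rules to a lower-cased term. */
    static String transliterate(String term) {
        return term
                .replace("\u00e4", "ae")   // a-umlaut
                .replace("\u00f6", "oe")   // o-umlaut
                .replace("\u00fc", "ue")   // u-umlaut
                .replace("\u00df", "ss");  // sharp s
    }

    /** Returns the lower-cased term followed by its transliteration, without duplicates. */
    static Set<String> expand(String term) {
        String lower = term.toLowerCase(Locale.ROOT);
        Set<String> variants = new LinkedHashSet<>();
        variants.add(lower);
        variants.add(transliterate(lower));
        return variants;
    }

    /** Joins the variants with OR so that either spelling matches. */
    static String toQuery(String term) {
        return String.join(" OR ", expand(term));
    }

    public static void main(String[] args) {
        System.out.println(toQuery("M\u00fcller")); // two variants joined by OR
        System.out.println(toQuery("Berlin"));      // no umlaut, so a single term
    }
}
```

Because the expansion happens only at search time, it finds a document only if that document already contains one of the two spellings.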

Greets,
Mile

Re: How to search for European words with and without special characters

Otis Gospodnetic-2
I think you'll want to write your own Analyzer + Tokenizer that detects tokens with umlauts and then emits two tokens at the same position (think of them as synonyms): the original token with the umlaut, and a second one with the umlaut transformed according to the usual rules (e.g. ü -> ue). Hm, I wonder if GermanAnalyzer already does this... maybe; have a look.
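
The "two tokens at the same position" idea can be sketched in plain Java. This is only a model of the technique, not Lucene code: `UmlautSynonymSketch`, `Token`, and `analyze` are hypothetical names, and whitespace splitting stands in for a real Tokenizer. In Lucene itself, a token is placed at the same position as its predecessor by giving it a position increment of 0, which is what the second field models here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical model of a synonym-style analyzer; it does not use the Lucene API. */
final class UmlautSynonymSketch {

    /** A term plus its distance from the previous token; 0 means "same position". */
    static final class Token {
        final String text;
        final int positionIncrement;

        Token(String text, int positionIncrement) {
            this.text = text;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return text + "(+" + positionIncrement + ")";
        }
    }

    /** Applies the usual German transliteration rules to a lower-cased word. */
    static String transliterate(String word) {
        return word
                .replace("\u00e4", "ae")   // a-umlaut
                .replace("\u00f6", "oe")   // o-umlaut
                .replace("\u00fc", "ue")   // u-umlaut
                .replace("\u00df", "ss");  // sharp s
    }

    /** Emits each word once, plus a transliterated variant at the same position if it differs. */
    static List<Token> analyze(String text) {
        List<Token> tokens = new ArrayList<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (word.isEmpty()) {
                continue; // leading whitespace yields an empty first element
            }
            tokens.add(new Token(word, 1));        // original advances the position
            String variant = transliterate(word);
            if (!variant.equals(word)) {
                tokens.add(new Token(variant, 0)); // variant shares the position
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Herr M\u00fcller"));
    }
}
```

Since both spellings occupy one position, phrase queries that span the name keep working whichever spelling the user types.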

Otis

Re: How to search for European words with and without special characters

Chris Hostetter-3

Take a look at the ISOLatin1AccentFilter. It doesn't do exactly what
you want (it replaces "ü" with "u" rather than "ue"), but it should
give you an idea of what you can do.

There was also a discussion recently about how you can use a modified
version of this Filter at index time to get both versions of the word
indexed with the same position...

http://www.nabble.com/Question-about-special-characters-t1676416.html#a4545641
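
To make the difference between the two mappings concrete, here is a small sketch in plain Java. It does not use ISOLatin1AccentFilter; as a stand-in, `stripAccents` removes diacritics with the JDK's `java.text.Normalizer`, which gives the "ü" to "u" result described above. `AccentFoldingSketch` and both method names are hypothetical.

```java
import java.text.Normalizer;

/** Hypothetical comparison of accent stripping and German transliteration. */
final class AccentFoldingSketch {

    /** Decomposes characters (NFD) and drops the combining marks: u-umlaut becomes u. */
    static String stripAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    /** Applies the German digraph rules instead: u-umlaut becomes ue. */
    static String germanTransliterate(String s) {
        return s
                .replace("\u00e4", "ae")   // a-umlaut
                .replace("\u00f6", "oe")   // o-umlaut
                .replace("\u00fc", "ue")   // u-umlaut
                .replace("\u00df", "ss");  // sharp s
    }

    public static void main(String[] args) {
        String name = "m\u00fcller";
        System.out.println(stripAccents(name));        // accent removed
        System.out.println(germanTransliterate(name)); // digraph spelling
    }
}
```

A filter modified along these lines would apply the second mapping, so that the indexed variant matches what German users actually type.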



-Hoss

