Have a look at Lucene's contrib/:
$ ff \*ISO\*java
./src/test/org/apache/lucene/analysis/TestISOLatin1AccentFilter.java
./src/java/org/apache/lucene/analysis/ISOLatin1AccentFilter.java
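To illustrate the kind of folding such a filter performs, here is a minimal plain-Java sketch. It is not Lucene's implementation: the class `AccentFolding` and the method `fold` are hypothetical names, and the approach (Unicode decomposition followed by removal of combining marks) is only assumed to approximate what ISOLatin1AccentFilter does for common accented letters.

```java
import java.text.Normalizer;

public class AccentFolding {

    // Hypothetical helper, not part of Lucene: strips accents from a string.
    static String fold(String input) {
        // NFD splits an accented letter into its base letter plus combining marks.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // \p{M} matches the combining marks, so removing them leaves the base letters.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "café"; the escape avoids source-encoding issues.
        System.out.println(fold("caf\u00e9")); // prints "cafe"
    }
}
```

If the same folding is applied both at index time and at query time, a search for "cafe" matches documents containing either spelling. Characters with no canonical decomposition (ß, for example) pass through this sketch unchanged, so it should not be read as a drop-in replacement for the filter.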
Otis
----- Original Message ----
From: Stefan Neufeind <[hidden email]>
To: [hidden email]
Sent: Wednesday, July 12, 2006 6:23:36 PM
Subject: Basic character-cleanups easily possible?
Hi,
I wonder whether there is an easy way to do basic character "cleanups".
E.g. most people would expect a search for "cafe" to find both "cafe"
and "café" (the latter with an accent).
Does this also fall into the category of "stemming", or would it rather
be a general normalisation of words, independent of actual
language-based stemming? And at which stage could it be done, and
through which plugin?
Has somebody "solved" this already?
Regards,
Stefan