Korean script conversion


Korean script conversion

Eyal  Naamati

Hi,

 

We are starting to index records in Korean. Korean text can be written in two scripts: Han characters (Chinese) and Hangul characters (Korean).

We are looking for a Solr filter or another built-in Solr component that converts between Han and Hangul characters (transliteration).

I know there is the ICUTransformFilterFactory, which can convert between Japanese or Chinese scripts, for example:

<filter class="solr.ICUTransformFilterFactory" id="Katakana-Hiragana"/> for Japanese script conversions

So far I couldn't find anything ready-made for Korean scripts, but perhaps someone knows of one?

 

Thanks!

Eyal Naamati
Alma Developer
Tel: +972-2-6499313
Mobile: +972-547915255
[hidden email]

www.exlibrisgroup.com

 


Re: Korean script conversion

Benson Margulies
Why do you think that this is a good idea? Hanja are used for special
purposes; they are not trivially convertible to Hangul due to ambiguity, and
it's not at all clear that a typical search user wants to treat them as
equivalent.

On Sun, Mar 29, 2015 at 1:52 AM, Eyal Naamati <[hidden email]> wrote:

> We are starting to index records in Korean. Korean text can be written in
> two scripts: Han characters (Chinese) and Hangul characters (Korean).
>
> We are looking for a Solr filter or another built-in Solr component
> that converts between Han and Hangul characters (transliteration).

RE: Korean script conversion

Eyal  Naamati
We only want the Hanja->Hangul conversion: for each Hanja character there is only one Hangul character that can replace it in Korean text.
The reverse direction is not convertible.
We want to allow searching in either script and to find matches in both scripts.
Thanks
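
The one-way normalisation described above can be sketched as follows. This is a plain Python illustration, not a Solr filter. The table and function names are hypothetical, the six entries are a hand-picked sample rather than a real dictionary, and the sketch assumes one Hangul reading per Hanja character as stated above.

```python
# Hypothetical sample table: Hanja character -> Hangul reading.
# A real deployment would need a complete, curated mapping.
HANJA_TO_HANGUL = {
    "韓": "한",
    "國": "국",
    "漢": "한",
    "字": "자",
    "大": "대",
    "學": "학",
}


def hanja_to_hangul(text: str) -> str:
    """Replace each mapped Hanja character; leave all other characters unchanged."""
    return "".join(HANJA_TO_HANGUL.get(ch, ch) for ch in text)


def matches(indexed: str, query: str) -> bool:
    """Normalise both the indexed text and the query to Hangul, then compare."""
    return hanja_to_hangul(query) in hanja_to_hangul(indexed)


if __name__ == "__main__":
    print(hanja_to_hangul("韓國"))         # Hanja input normalised to Hangul
    print(matches("韓國 大學", "한국"))     # Hangul query against Hanja text
    print(matches("한국 대학", "大學"))     # Hanja query against Hangul text
```

Applying the same mapping at index time and at query time is what lets a search in either script find text in both. In this sample both 韓 and 漢 map to 한, which is why the mapping cannot be reversed, and it also means distinct Hanja words can collapse to the same Hangul string.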

-----Original Message-----
From: Benson Margulies [mailto:[hidden email]]
Sent: Monday, March 30, 2015 1:58 PM
To: solr-user
Subject: Re: Korean script conversion

Why do you think that this is a good idea? Hanja are used for special purposes; they are not trivially convertible to Hangul due to ambiguity, and it's not at all clear that a typical search user wants to treat them as equivalent.


RE: Korean script conversion

Eyal  Naamati
In reply to this post by Eyal Naamati

Trying again since I don't have an answer yet.

Thanks!

 


 

From: Eyal Naamati
Sent: Sunday, March 29, 2015 7:52 AM
To: [hidden email]
Subject: Korean script conversion

 
