Nutch Character encoding converter

Nutch Character encoding converter

Saurabh Suman
Hi,
Nutch has an auto-detector for character encoding. Does it automatically convert the text to a standard encoding after detecting it?

Re: Nutch Character encoding converter

kkrugler
>Nutch has an auto-detector for character encoding. Does it automatically convert
>the text to a standard encoding after detecting it?

Yes - Nutch converts text to Unicode for all subsequent processing.
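
As a rough illustration of what "converting to Unicode" means in Java (the language Nutch is written in), the sketch below decodes raw bytes into a String using a detected charset name. The class and method names (EncodingSketch, toUnicode, toUtf8) are hypothetical and this is not Nutch's actual code; it assumes the detector has already produced a charset name such as "ISO-8859-1".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from the Nutch source.
final class EncodingSketch {

    // Decode raw page bytes into a Java String using the charset that was
    // detected for the page. A Java String holds Unicode text (UTF-16 code
    // units internally), whatever the source encoding was.
    static String toUnicode(byte[] rawContent, String detectedCharset) {
        return new String(rawContent, Charset.forName(detectedCharset));
    }

    // UTF-8 only comes into play when the Unicode text is written back out as bytes.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The word "cafe" with an acute accent on the e, encoded as ISO-8859-1:
        // the accented letter is the single byte 0xE9.
        byte[] latin1 = {'c', 'a', 'f', (byte) 0xE9};
        String text = toUnicode(latin1, "ISO-8859-1");
        byte[] utf8 = toUtf8(text);
        System.out.println(text.length()); // 4 characters
        System.out.println(utf8.length);   // 5 bytes: the accented letter takes two bytes in UTF-8
    }
}
```

Unicode and UTF-8 are not the same thing here: the String is Unicode text, and UTF-8 is one way of serializing that text to bytes.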

-- Ken
--
Ken Krugler
+1 530-210-6378

Re: Nutch Character encoding converter

Saurabh Suman
Hi,
As Ken said, Nutch converts text to Unicode. Does that mean the parsed text is always in UTF-8 format?