Help needed!!


Volkan Ebil
Hi everyone,

 

I am from Turkey. My language has a special character, "ğ". This character is used
only in Turkish, and I have to build a language identifier. Instead of using n-grams,
I thought I could simply check whether the HTML content includes "ğ". For this
I need an if check that does the following:

 

Fetch the URL.

if the content of the URL includes "ğ" or "Ğ"
    then parse and index the URL
else
    skip the URL.
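
The check above might look roughly like the following in Python. This is only a minimal sketch, not code from any particular crawler: the names `looks_turkish`, `handle_page`, and `parse_and_index` are hypothetical, and it assumes the fetched content arrives as raw bytes with a known encoding. It also resolves HTML entities (such as `&#287;`) and composes "g" followed by a combining breve, since the character may not appear literally in the page source.

```python
import html
import unicodedata

# U+011F (ğ) and U+011E (Ğ)
TURKISH_MARKERS = ("ğ", "Ğ")


def looks_turkish(raw: bytes, encoding: str = "utf-8") -> bool:
    """Return True if the page content contains "ğ" or "Ğ"."""
    # Decode leniently so a single bad byte does not abort the check.
    text = raw.decode(encoding, errors="replace")
    # Resolve entities like &#287; and compose "g" + combining breve into "ğ".
    text = unicodedata.normalize("NFC", html.unescape(text))
    return any(marker in text for marker in TURKISH_MARKERS)


def handle_page(raw: bytes, encoding: str, parse_and_index) -> bool:
    """Call parse_and_index (a hypothetical callback) only for matching pages."""
    if looks_turkish(raw, encoding):
        parse_and_index(raw)
        return True
    # Otherwise the URL is skipped.
    return False
```

Note that this test can miss short Turkish pages that happen not to contain "ğ", and it can accept pages in other languages that use the same letter, such as Azerbaijani.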

 

 

Where should I look in the source code? How can I add a restriction like
this?



Thanks in advance

 

Volkan..