Help needed!!

Volkan Ebil
Hi everyone,


I am from Turkey. My language has a special character, "ğ". This character is used only
in Turkish, and I have to build a language identifier. Instead of using n-grams,
I thought I could simply check whether the HTML content includes "ğ" or not.
For this, I need an if check that does the following:


Fetch the URL


if the content of the URL includes "ğ" or "Ğ"

            then parse and index the URL

            else skip the URL.
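
The check described above could be sketched roughly like this. It is only a minimal Python illustration, not code from any particular crawler: the names `contains_turkish_g` and `should_index` are hypothetical, and it assumes the page has already been fetched as raw bytes and that its character encoding is known (UTF-8 is used as the default here). It also decodes HTML character references, since a page may write the letter as `&#287;` instead of the literal character.

```python
import html

# The two code points to look for: lowercase and uppercase g with breve.
TURKISH_G = ("\u011f", "\u011e")  # "ğ", "Ğ"


def contains_turkish_g(page_bytes, encoding="utf-8"):
    """Return True if the decoded page contains "ğ" or "Ğ".

    page_bytes: raw bytes of the fetched page (assumed already downloaded).
    encoding:   assumed character encoding of the page.
    """
    # Undecodable bytes are replaced rather than raising an error.
    text = page_bytes.decode(encoding, errors="replace")
    # Turn character references such as "&#287;" into real characters.
    text = html.unescape(text)
    return any(ch in text for ch in TURKISH_G)


def should_index(page_bytes, encoding="utf-8"):
    """Decide whether to parse and index the page (True) or skip it (False)."""
    return contains_turkish_g(page_bytes, encoding)
```

A call such as `should_index(page_bytes)` would then gate the parse and index steps. The result depends on decoding the page with the right encoding: the same letter is stored as different bytes in UTF-8 and in ISO-8859-9, so a wrong guess can hide it.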



Where should I look in the source code? How can I add a restriction like
that?

Thanks in advance