Fwd: Arabic words search in solr


Fwd: Arabic words search in solr

mohanmca01
Hi,

In Solr, I want to search on product names using Arabic letters.
Arabic users may find it a little difficult to search for some product
names, because certain characters have to be typed exactly when searching.

Ex: إ أ آ


The characters above are typed using a Shift key combination.
Arabic users usually type the plain “ ا ” character instead and expect to
find words that contain the combined forms, such as the one below.

Ex: إبرا


In my Solr schema.xml I defined the Arabic product name field as follows:


<field name="productNameArabic" type="text_ar" indexed="true" stored="true"/>

<fieldType name="text_ar" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ar.txt"/>
    <filter class="solr.ArabicNormalizationFilterFactory"/>
    <filter class="solr.ArabicStemFilterFactory"/>
  </analyzer>
</fieldType>



What changes do I have to make in schema.xml? Please help me with this.



 --
Regards,
Mohan.N
096896429683

Re: Arabic words search in solr

sarowe
Hi Mohan,

I answered your question on the solr-user list.  Did you see my response?

I CC’d you on this email, but you should know that Apache mailing lists won’t automatically send you email unless you have subscribed to the list.  For more information, see <http://lucene.apache.org/solr/community.html#mailing-lists-irc>.

--
Steve
www.lucidworks.com

> On Jan 29, 2017, at 2:16 PM, mohan sundaram <[hidden email]> wrote:
> [...]


Re: Arabic words search in solr

mohanmca01
Hi Steve,

Thanks for sharing the information. I was looking for your email, but you
had replied on the solr-user list. I have now subscribed to the solr-user
list so that I receive email.

I went through the Solr reference documentation that you shared in the link.
That documentation is for Solr version 6.4.0.

The Solr version implemented in my project is 4.9.0.


As I mentioned earlier, in my Solr schema.xml I defined the Arabic product
name field as follows:

/*----------------------------------------------*/

<field name="productNameArabic" type="text_ar" indexed="true" stored="true"/>

<fieldType name="text_ar" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ar.txt"/>
    <filter class="solr.ArabicNormalizationFilterFactory"/>
    <filter class="solr.ArabicStemFilterFactory"/>
  </analyzer>
</fieldType>

/*----------------------------------------------*/



I am indexing the Arabic content using the “text_ar” field type.




Table 1. Characters

    ا
    أ
    إ
    آ

The Shift key is involved in typing the characters above.

These are examples of the characters that cause the search difficulty I am
facing.




Table 2. Example indexed words

    ابرا
    أبرا
    إبرا
    آبرا

These are examples of words indexed in Solr.



Table 3. Search word

    ابرا


My requirement is this: searching for the word in Table 3 should return
all of the indexed words in Table 2.
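
The requirement can be made concrete with a minimal Python sketch. This is
not Solr code, only an illustration of the desired matching behavior: the
helper name fold_alef and the mapping ALEF_VARIANTS are hypothetical names
introduced here, and the sketch assumes the words are stored with the
precomposed Unicode characters U+0622, U+0623 and U+0625. The words are
written as escape sequences to avoid right-to-left rendering problems.

```python
# Illustration only: fold the hamza/madda forms of alef to bare alef,
# then compare the folded query with the folded indexed words.
# fold_alef and ALEF_VARIANTS are hypothetical names, not part of Solr.

ALEF = "\u0627"  # bare alef

ALEF_VARIANTS = {
    "\u0622": ALEF,  # alef with madda above
    "\u0623": ALEF,  # alef with hamza above
    "\u0625": ALEF,  # alef with hamza below
}


def fold_alef(text: str) -> str:
    """Return text with every alef variant replaced by bare alef."""
    return text.translate(str.maketrans(ALEF_VARIANTS))


# The four indexed words from Table 2 (variant alef + beh + reh + alef).
indexed_words = [
    "\u0627\u0628\u0631\u0627",
    "\u0623\u0628\u0631\u0627",
    "\u0625\u0628\u0631\u0627",
    "\u0622\u0628\u0631\u0627",
]

# The search word from Table 3.
query = "\u0627\u0628\u0631\u0627"

# A word matches when its folded form equals the folded query.
matches = [w for w in indexed_words if fold_alef(w) == fold_alef(query)]
print(len(matches), "of", len(indexed_words), "indexed words match")
```

In Solr, folding of this kind would have to happen in the analyzer chain at
both index time and query time, so that the indexed terms and the query
terms end up in the same form.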



Does Solr version 4.9.0 support Arabic search, or do I need to upgrade
to a later version?


Kindly let me know if I need to give examples for all of the characters. So
far I have given an example for only one character, alef with hamza.


Thanks,

Mohan




On Mon, Jan 30, 2017 at 9:21 PM, Steve Rowe <[hidden email]> wrote:

> [...]


--
Regards,
Mohan.N
9865998919

Re: Arabic words search in solr

sarowe
Hi Mohan,

Please resend this email to the solr-user mailing list - I’ll reply there.

--
Steve
www.lucidworks.com

> On Jan 31, 2017, at 12:58 AM, mohan sundaram <[hidden email]> wrote:
> [...]