AnalyzingSuggester returning index value instead of field value?


Sebastian Saip
I'm looking into a way to implement autosuggest for a special case: I'm
doing a "startsWith" search that should retrieve the full name, which may
contain accents. For convenience, I want the search to work with or without
accents and in any upper/lowercase.

Here's part of my configuration: http://pastebin.com/20vSGJ1a

So I have a name="Têst Námè" and I query for "test", "tést", "TÈST", or
similar. This gives me back "test name" as a suggestion, which looks like
the indexed (analyzed) term rather than the actual field value.

Furthermore, when I fed the document without index analyzers, then added
the index analyzers, restarted without re-feeding and queried, it returned
the right value. So the suggester seems to return the indexed term rather
than the actual stored value?

Or maybe I just configured it the wrong way?
There's not really much documentation about this yet :(

BR Sebastian Saip

Re: AnalyzingSuggester returning index value instead of field value?

Michael McCandless-2
I'm not very familiar with how AnalyzingSuggester works inside Solr
... if you try this directly with the Lucene APIs does it still
happen?

Hmm maybe one idea: if you remove whitespace from your suggestion does
it work?  I wonder if there's a whitespace / multi-token issue ... if
so then maybe see how TestPhraseSuggestions.java (in Solr) does this?

Mike McCandless

http://blog.mikemccandless.com

(In reply to Sebastian Saip's message of Thu, Feb 7, 2013 at 9:48 AM.)

Re: AnalyzingSuggester returning index value instead of field value?

Sebastian Saip
Unfortunately it's the same with the whitespace removed: I still get back
"testname".
I'm not quite sure how to test this via the Lucene API, in particular how
to define a KeywordTokenizer with ASCII folding and lowercasing, so I can't
test this at the moment :/

BR Sebastian Saip


(In reply to Michael McCandless's message of 7 February 2013, 16:19.)

Re: AnalyzingSuggester returning index value instead of field value?

Sebastian Saip
The solution, as pointed out on Stack Overflow
(http://stackoverflow.com/questions/14732713/solr-autosuggest-with-diacritics/14743278),
is not to use a copyField but instead to use the AnalyzingSuggester on the
StrField directly.
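
To illustrate why this helps, here is a minimal Python sketch (it is not
Solr or Lucene code). It assumes that the suggester takes its suggestion
texts from the indexed terms of the source field and applies its own
analyzer only for matching, which is consistent with the behaviour seen in
this thread but is not confirmed against the Solr source. The names
`analyze` and `ToySuggester` are hypothetical, and the Unicode-based folding
only approximates a keyword tokenizer followed by ASCII folding and
lowercasing.

```python
import unicodedata


def analyze(text: str) -> str:
    """Approximate 'keyword token + ASCII folding + lowercase'.

    Decomposes accented characters, drops the combining marks and
    lowercases the result. This is an approximation, not Lucene's
    ASCIIFoldingFilter.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.lower()


class ToySuggester:
    """Toy prefix suggester: matches on the analyzed key, returns the surface form."""

    def __init__(self) -> None:
        # Each entry is (analyzed key used for matching, text returned to the user).
        self._entries: list[tuple[str, str]] = []

    def add(self, surface: str) -> None:
        self._entries.append((analyze(surface), surface))

    def suggest(self, prefix: str) -> list[str]:
        key = analyze(prefix)
        return [surface for analyzed, surface in self._entries if analyzed.startswith(key)]


# Case 1 (assumed copyField setup): the field's index analyzer has already
# folded the value, so the only text the suggester ever sees is "test name".
via_copy_field = ToySuggester()
via_copy_field.add(analyze("Têst Námè"))

# Case 2 (StrField): the indexed term is the raw value; folding happens
# only inside the suggester, for matching.
on_str_field = ToySuggester()
on_str_field.add("Têst Námè")

for query in ("test", "tést", "TÈST"):
    print(query, via_copy_field.suggest(query), on_str_field.suggest(query))
```

In the first case every query variant returns the folded text, in the second
the original "Têst Námè", because the original text is only available to the
suggester when the source field keeps it unanalyzed.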

Cheers!


(In reply to Sebastian Saip's message of 7 February 2013, 17:30.)

Re: AnalyzingSuggester returning index value instead of field value?

Alexandre Rafalovitch
Glad it helped. :-)

Now, if you could write this up as a full example and explanation, I am
sure the Solr community would benefit from it as well. If you don't have
your own blog, I would be happy to guest host it, as would, I am sure, at
least a couple of other people or organizations.

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


(In reply to Sebastian Saip's message of Thu, Feb 7, 2013 at 12:44 PM.)