Identify exact search in edismax


rhl4tr
I am using edismax to guess a category from a user query.

Suppose the user says "I want to buy BMW and Audi car". This query is fed to edismax, which returns results ranked by phrase match.

The field contains the following values:
- BMW => Cars category
- Audi => Cars
- 2 BHK => Real Estate
- need job => jobs category
- Buy 1Bhk => Apartments

I get results with phrase matches on top.

Generally the top result will be a phrase match (if there is one). How can I tell that all of a field's terms have matched the user query?

For example:
mm => the percentage of user query terms that must match the field's terms

I want the opposite => the percentage of the field's terms that match the user query. In my case that should be 100%, i.e. a full phrase match.
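
One way to pin down the "opposite of mm" measure described above is to compute it outside Solr, on the stored field value. This is only a minimal sketch: the function name field_coverage, the whitespace tokenization, and the lowercasing are assumptions for illustration, not what edismax or the field's analyzer actually does.

```python
def field_coverage(field_value: str, user_query: str) -> int:
    """Return the percentage of the field's distinct terms found in the query."""
    # Assumption: lowercase + whitespace split stands in for the real analyzer.
    field_terms = set(field_value.lower().split())
    query_terms = set(user_query.lower().split())
    if not field_terms:
        return 0
    matched = field_terms & query_terms
    return round(100 * len(matched) / len(field_terms))


query = "I want to buy BMW and Audi car"
for value in ["BMW", "Audi", "2 BHK", "need job", "Buy 1Bhk"]:
    # Print each field value next to its coverage percentage.
    print(value, field_coverage(value, query))
```

With these values, "BMW" and "Audi" are fully covered by the query, while "Buy 1Bhk" is only half covered because "1bhk" does not occur in the query.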

 

Re: Identify exact search in edismax

Mikhail Khludnev
The overall task is not clear to me, but if you want to know that "all of a
field's terms have matched the user query", I'd suggest introducing your own
Similarity:
 - write the number of terms as the norm value (which by default is one byte
per doc per field), then
 - retrieve this number at search time and use it to evaluate your own
"mm" criterion.
WDYT?
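
The idea above can be illustrated without Lucene. The sketch below is a plain-Python stand-in, not the Lucene Similarity API: the class TinyIndex, its method names, and the whitespace tokenization are all hypothetical. It stores each document's term count at index time (the role the norm value would play) and compares it with the number of matched terms at search time.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index that remembers how many terms each field value has."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of docs containing it
        self.term_count = {}              # doc id -> distinct terms (stand-in for the norm)

    def add(self, doc_id, text):
        # Index time: record the term count alongside the postings.
        terms = set(text.lower().split())
        self.term_count[doc_id] = len(terms)
        for term in terms:
            self.postings[term].add(doc_id)

    def fully_matched(self, query):
        # Search time: count matched terms per doc, then compare with the stored count.
        hits = defaultdict(int)
        for term in set(query.lower().split()):
            for doc_id in self.postings.get(term, ()):
                hits[doc_id] += 1
        return sorted(d for d, n in hits.items() if n == self.term_count[d])


index = TinyIndex()
for value in ["BMW", "Audi", "2 BHK", "need job", "Buy 1Bhk"]:
    index.add(value, value)  # the field value doubles as the doc id here
print(index.fully_matched("I want to buy BMW and Audi car"))
```

Only documents whose every term occurs in the query are returned, so a partial match such as "Buy 1Bhk" is filtered out.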

In reply to rhl4tr's message of Thu, Oct 4, 2012 at 9:28 PM.



--
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

<http://www.griddynamics.com>
 <[hidden email]>

Re: Identify exact search in edismax

rhl4tr
But the user query can contain any number of terms, so I cannot know in advance how many of the field's terms it has to match.

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "mm":"0",
      "sort":"score desc",
      "indent":"true",
      "qf":"exact_keywords",
      "wt":"json",
      "rows":"1",
      "defType":"dismax",
      "pf":"exact_keywords",
      "debugQuery":"false",
      "fl":"data_id,data_name,exact_keywords",
      "start":"0",
      "q":"i want to by honda suzuki",
      "fq":"+data_type:pwords"}},
  "response":{"numFound":2,"start":0,"docs":[
      {
        "data_name":"Cars ",
        "data_id":"71",
        "exact_keywords":"honda suzuki",
        "term_mm":"100%"},
      {
        "data_name":"bikes ",
        "data_id":"72",
        "exact_keywords":"suzuki",
        "term_mm":"50%"}
]
  }}

A hypothetical solution would look like the JSON response above.
The term_mm value would tell what percentage of terms matched the user query.

Re: Identify exact search in edismax

Mikhail Khludnev
Absolutely, that's what I didn't get from your initial question. Okay, it
seems you are talking about a typical eCommerce search problem. I will speak
about it at http://www.apachecon.eu/schedule/presentation/18/, see you there.

In reply to rhl4tr's message of Fri, Oct 5, 2012 at 9:47 AM.




Re: Identify exact search in edismax

rhl4tr
Can you please get me started? I cannot wait until the presentation.

Re: Identify exact search in edismax

Mikhail Khludnev
I only have pencil sketches so far, so I can't share them. I can say that I've
found it quite close to the approach described at
http://www.ulakha.com/publications.html, where it is called "Concept Search",
but as far as I understand my implementation approach is rather different.

In reply to rhl4tr's message of Fri, Oct 5, 2012 at 2:31 PM.




Re: Identify exact search in edismax

rhl4tr
I overrode the DefaultSimilarity class to always return idf=1.

Now the score depends entirely on term matching.

If a single term matches, all matching docs have the same score.
If the phrase matches, the doc has the maximum score.
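
The effect described above can be seen in a toy model. The sketch below is not Lucene code and deliberately ignores term frequency, length norms, coord, query normalization, and the pf phrase boost. The function names are hypothetical, and docs 73 and 74 are made up so that the two terms have different document frequencies. It scores each doc as the sum of idf squared over its matched terms, once with a classic-style idf and once with idf fixed at 1.

```python
import math
from collections import Counter


def classic_idf(doc_freq: int, num_docs: int) -> float:
    # Classic-style idf: 1 + ln(numDocs / (docFreq + 1)).
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def constant_idf(doc_freq: int, num_docs: int) -> float:
    # Mirrors the override described in the post: idf is always 1.
    return 1.0


def score_docs(docs: dict, query: str, idf) -> dict:
    """Toy score: sum of idf^2 over the query terms present in each doc."""
    tokenized = {doc_id: set(text.lower().split()) for doc_id, text in docs.items()}
    doc_freq = Counter(t for terms in tokenized.values() for t in terms)
    query_terms = set(query.lower().split())
    scores = {}
    for doc_id, terms in tokenized.items():
        matched = terms & query_terms
        scores[doc_id] = sum(idf(doc_freq[t], len(docs)) ** 2 for t in matched)
    return scores


# Docs 71 and 72 come from the thread; 73 and 74 are hypothetical additions.
docs = {"71": "honda suzuki", "72": "suzuki", "73": "honda", "74": "suzuki"}
query = "i want to by honda suzuki"
print(score_docs(docs, query, classic_idf))
print(score_docs(docs, query, constant_idf))
```

In this model, with idf fixed at 1 every single-term match gets the same score and the doc matching both terms scores highest, whereas the classic-style idf ranks the rarer term "honda" above "suzuki".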