Adaptive search?


Adaptive search?

Siddhant Goel
Hi,

Does Solr provide adaptive searching? Can it adapt to user clicks within the
search results it provides, or does that have to be done externally?

I couldn't find anything by googling for it.

Thanks,

--
- Siddhant

Re: Adaptive search?

Paul Libbrecht
What do you mean by "adapt to user clicks"? It could mean quite a few
things. Do you have a reference that inspired the question?

paul


On 17 Dec 2009, at 13:52, Siddhant Goel wrote:

> Does Solr provide adaptive searching? Can it adapt to user clicks  
> within the
> search results it provides? Or that has to be done externally?


Re: Adaptive search?

Siddhant Goel
Let's say we have a search engine based on Solr, with a simple web app as
the front end, responsible for querying Solr and displaying the results in
human-readable form. If a user searches for something, gets quite a few
results, and then clicks on one of them, is there any mechanism by which we
can notify Solr to boost the score/relevance of that particular result in
future searches? If not, any pointers on how to go about doing that would
be very helpful.

Thanks,

--
- Siddhant

Re: Adaptive search?

Alexey Serba
You can add click counts to your index as an additional field and boost
results based on that value.

http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_change_the_score_of_a_document_based_on_the_.2Avalue.2A_of_a_field_.28say.2C_.22popularity.22.29

You can keep some kind of buffer for clicks and update click count
field for documents in the index periodically.

If you don't want to update whole documents in the index, then you
should probably look at ExternalFileField, or at Lucene's ParallelReader
as a custom Solr IndexReader, but this is complex, low-level Lucene work
and requires some hacking.
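
The "buffer for clicks" idea can be sketched roughly as follows. This is a minimal Python illustration, not part of Solr: the ClickBuffer class and the stored-counts dictionary are hypothetical names, and the step that actually re-indexes the changed documents is left out.

```python
from collections import Counter
from threading import Lock


class ClickBuffer:
    """Hypothetical helper: collects clicks in memory between index updates."""

    def __init__(self):
        self._pending = Counter()
        self._lock = Lock()

    def record(self, doc_id):
        """Note one click on doc_id; cheap enough to call on every click."""
        with self._lock:
            self._pending[doc_id] += 1

    def flush(self, stored_counts):
        """Merge pending clicks into stored_counts and return the changed ids."""
        with self._lock:
            pending, self._pending = self._pending, Counter()
        for doc_id, clicks in pending.items():
            stored_counts[doc_id] = stored_counts.get(doc_id, 0) + clicks
        return sorted(pending)


buffer = ClickBuffer()
for doc_id in ["doc7", "doc2", "doc7"]:
    buffer.record(doc_id)

stored = {"doc2": 10}           # click counts already in the index (assumed)
changed = buffer.flush(stored)  # ids whose click count field needs updating
```

Running flush on a timer keeps the number of index updates independent of the click rate.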

Alex


Re: Adaptive search?

Ian Holsman (Lists)
In reply to this post by Siddhant Goel
On 12/18/09 2:46 AM, Siddhant Goel wrote:
> Let say we have a search engine (a simple front end - web app kind of a
> thing - responsible for querying Solr and then displaying the results in a
> human readable form) based on Solr. If a user searches for something, gets
> quite a few search results, and then clicks on one such result - is there
> any mechanism by which we can notify Solr to boost the score/relevance of
> that particular result in future searches? If not, then any pointers on how
> to go about doing that would be very helpful.
>    

Hi Siddhant.
Solr can't do this out of the box.
You would need to use an external field and a custom scoring function to
do something like this.

regards
Ian



Re: Adaptive search?

Lance Norskog-2
Solr does have the ExternalFileField available. You could track
existing clicks from the container search log and generate a file to
be used with ExternalFileField.

http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html

In the solr source, trunk/src/test/test-files/solr/conf/schema11.xml
and schema-trie.xml show how to use it.
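
Generating such a file from a click log could look roughly like the Python sketch below. It assumes the layout described in the ExternalFileField documentation as I understand it (a file named external_<fieldname> in the index data directory, one key=value line per document). The "CLICK <doc_id>" log format, the field name "clicks" and the helper names are made up for illustration.

```python
import os
import tempfile
from collections import Counter


def count_clicks(log_lines):
    """Count clicks per document id from lines like 'CLICK doc2' (assumed format)."""
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) == 2 and parts[0] == "CLICK":
            counts[parts[1]] += 1
    return counts


def write_external_file(counts, data_dir, field_name):
    """Write 'key=value' lines to external_<field_name> and return its path."""
    path = os.path.join(data_dir, "external_" + field_name)
    # Write to a differently named temp file first, then swap it into place,
    # so a reader never sees a half-written file.
    tmp_path = os.path.join(data_dir, "tmp_" + field_name)
    with open(tmp_path, "w", encoding="utf-8") as out:
        for doc_id in sorted(counts):
            out.write(f"{doc_id}={float(counts[doc_id])}\n")
    os.replace(tmp_path, path)
    return path


log = ["CLICK doc2", "QUERY solr", "CLICK doc7", "CLICK doc2"]
data_dir = tempfile.mkdtemp()  # stand-in for the real index data directory
path = write_external_file(count_clicks(log), data_dir, "clicks")
```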

--
Lance Norskog
[hidden email]

Re: Adaptive search?

Ryan Kennedy-3
On Mon, Dec 21, 2009 at 3:36 PM, Lance Norskog <[hidden email]> wrote:
> Solr does have the ExternalFileField available. You could track
> existing clicks from the container search log and generate a file to
> be used with ExternalFileField.
>
> http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html
>
> In the solr source, trunk/src/test/test-files/solr/conf/schema11.xml
> and schema-trie.xml show how to use it.

This approach will be limited to applying a "global" rank to all the
documents, which may have some unintended consequences. The most
popular document in your index will be the most popular, even for
queries for which it was never clicked on. We've been working on this
problem ourselves and implemented it using a FunctionQuery
(http://wiki.apache.org/solr/FunctionQuery). We create a
ValueSourceParser and hook it into our Solr config:

    <valueSourceParser name="qpop" class="QueryPopularity">
        <str name="popfile">/path/to/popularity_file.xml</str>
    </valueSourceParser>

Then we use the new function in our request handler(s):

    <requestHandler name="..." class="...">
        ...
        <str name="bf">
            qpop(id)
        </str>
    </requestHandler>

The QueryPopularity class takes the current (normalized) query and
looks it up in popularity_file.xml to find out which document IDs are
popular for that query. (It uses the "id" field because that's what we
specified in the arguments to "qpop"; you could use any field you
want.) Documents which are popular get a score greater than zero,
proportional to their popularity. We do offline processing every
night to build the mappings of query -> popular ID and push that file
to our machines. QueryPopularity has a background thread which
periodically refreshes the in-memory copy of the XML file's contents.

The main difference is that this is a two-level hash (query -> id ->
score), whereas the ExternalFileField appears to be a one-level hash
(id -> score).
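
The two-level lookup can be illustrated with a small Python sketch. This is not the QueryPopularity class itself (which is a Java ValueSourceParser whose source isn't shown here); the class name, the normalization rule and the sample data are all assumptions made for the example.

```python
def normalize(query):
    """Lower-case and collapse whitespace (an assumed normalization rule)."""
    return " ".join(query.lower().split())


class QueryPopularityTable:
    """Two-level map: normalized query -> document id -> popularity score."""

    def __init__(self, table):
        self._table = {normalize(q): dict(scores) for q, scores in table.items()}

    def score(self, query, doc_id):
        """Return the popularity of doc_id for this query, or 0.0 if unknown."""
        return self._table.get(normalize(query), {}).get(doc_id, 0.0)


table = QueryPopularityTable({
    "solr faceting": {"doc2": 0.8, "doc9": 0.3},
    "lucene scoring": {"doc2": 0.1},
})

boost = table.score("  Solr   Faceting ", "doc2")   # popular for this query
other = table.score("lucene scoring", "doc9")       # never clicked for this one
```

A one-level map (id -> score) would give doc9 the same boost for every query, which is the "global rank" effect described above.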

Ryan

Re: Adaptive search?

Siddhant Goel
On Tue, Dec 22, 2009 at 12:01 PM, Ryan Kennedy <[hidden email]> wrote:

> This approach will be limited to applying a "global" rank to all the
> documents, which may have some unintended consequences. The most
> popular document in your index will be the most popular, even for
> queries for which it was never clicked on.


Right. Makes so much sense. Thanks for sharing.

--
- Siddhant

Re: Adaptive search?

Lance Norskog-2
Nice!

Siddhant: Another problem to watch out for is the feedback problem:
someone clicks on a link and it automatically becomes more
interesting, so someone else clicks, and it gets even more
interesting... So you need some kind of suppression. For example, as
individual clicks get older, you can push them down. Or you can put a
cap on the number of clicks used to rank the query.
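
Both kinds of suppression fit in a few lines. The Python sketch below uses an exponential half-life for ageing and a hard cap on the total. The function name, the 30-day half-life and the cap of 50 are arbitrary assumptions for illustration, not recommended values.

```python
def decayed_click_score(click_ages_days, half_life_days=30.0, cap=50.0):
    """Sum clicks so that each one counts half as much per half-life, then cap.

    click_ages_days: age of every recorded click, in days.
    """
    total = sum(0.5 ** (age / half_life_days) for age in click_ages_days)
    return min(total, cap)


fresh = decayed_click_score([0])           # a click from today counts fully
month_old = decayed_click_score([30])      # one half-life old: counts half
runaway = decayed_click_score([0] * 100)   # many clicks: limited by the cap
```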

--
Lance Norskog
[hidden email]

Re: Adaptive search?

Shalin Shekhar Mangar
On Wed, Dec 23, 2009 at 4:09 AM, Lance Norskog <[hidden email]> wrote:

> Nice!
>
> Siddhant: Another problem to watch out for is the feedback problem:
> someone clicks on a link and it automatically becomes more
> interesting, so someone else clicks, and it gets even more
> interesting... So you need some kind of suppression. For example, as
> individual clicks get older, you can push them down. Or you can put a
> cap on the number of clicks used to rank the query.
>
>
We use clicks/views instead of just clicks to avoid this problem.

--
Regards,
Shalin Shekhar Mangar.

Re: Adaptive search?

Otis Gospodnetic-2
Shalin,

 
----- Original Message ----

> From: Shalin Shekhar Mangar <[hidden email]>
> To: [hidden email]
> Sent: Wed, December 23, 2009 2:45:21 AM
> Subject: Re: Adaptive search?
>
> On Wed, Dec 23, 2009 at 4:09 AM, Lance Norskog wrote:
>
> > Nice!
> >
> > Siddhant: Another problem to watch out for is the feedback problem:
> > someone clicks on a link and it automatically becomes more
> > interesting, so someone else clicks, and it gets even more
> > interesting... So you need some kind of suppression. For example, as
> > individual clicks get older, you can push them down. Or you can put a
> > cap on the number of clicks used to rank the query.
> >
> >
> We use clicks/views instead of just clicks to avoid this problem.

Doesn't a click imply a view?  You click to view.  I must be missing something...

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch

Re: Adaptive search?

Shalin Shekhar Mangar
On Fri, Jan 8, 2010 at 3:41 AM, Otis Gospodnetic <[hidden email]
> wrote:

>
> ----- Original Message ----
>
> > From: Shalin Shekhar Mangar <[hidden email]>
> > To: [hidden email]
> > Sent: Wed, December 23, 2009 2:45:21 AM
> > Subject: Re: Adaptive search?
> >
> > On Wed, Dec 23, 2009 at 4:09 AM, Lance Norskog wrote:
> >
> > > Nice!
> > >
> > > Siddhant: Another problem to watch out for is the feedback problem:
> > > someone clicks on a link and it automatically becomes more
> > > interesting, so someone else clicks, and it gets even more
> > > interesting... So you need some kind of suppression. For example, as
> > > individual clicks get older, you can push them down. Or you can put a
> > > cap on the number of clicks used to rank the query.
> > >
> > >
> > We use clicks/views instead of just clicks to avoid this problem.
>
> Doesn't a click imply a view?  You click to view.  I must be missing
> something...
>
>
I was talking about boosting documents using past popularity. So a user
searches for X and gets 10 results. This view is recorded for each of the 10
documents and added to the index later. If a user clicks on result #2, the
click is recorded for doc #2 and added to index. We boost using clicks/view.
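
As a rough Python illustration of the bookkeeping described here (the class and method names are hypothetical, and writing the numbers back into the index is left out):

```python
from collections import defaultdict


class ClickThroughStats:
    """Tracks how often each document was shown and how often it was clicked."""

    def __init__(self):
        self.views = defaultdict(int)
        self.clicks = defaultdict(int)

    def record_results(self, doc_ids):
        """Record one view for every document on a result page."""
        for doc_id in doc_ids:
            self.views[doc_id] += 1

    def record_click(self, doc_id):
        self.clicks[doc_id] += 1

    def ratio(self, doc_id):
        """clicks/views, or 0.0 for a document that was never shown."""
        views = self.views.get(doc_id, 0)
        if views == 0:
            return 0.0
        return self.clicks.get(doc_id, 0) / views


stats = ClickThroughStats()
page = [f"doc{i}" for i in range(1, 11)]  # the 10 results for query X
stats.record_results(page)                # first user: sees the page, clicks #2
stats.record_click("doc2")
stats.record_results(page)                # second user: sees the page, no click

doc2_ratio = stats.ratio("doc2")
doc1_ratio = stats.ratio("doc1")
```

Because every result page adds to the denominator, a document that is shown often but rarely clicked drifts down instead of holding its boost.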

--
Regards,
Shalin Shekhar Mangar.

Re: Adaptive search?

Ravi Gidwani
Shalin:
Can you point me to pages/resources that discuss this approach in detail?
Or can you provide more details on the schema and the function(s) used for
ranking the documents?

Thanks,
~Ravi.


Re: Adaptive search?

hossman
In reply to this post by Shalin Shekhar Mangar

: I was talking about boosting documents using past popularity. So a user
: searches for X and gets 10 results. This view is recorded for each of the 10
: documents and added to the index later. If a user clicks on result #2, the
: click is recorded for doc #2 and added to index. We boost using clicks/view.

FWIW: I've observed three problems with this type of metric...

1) "render" vs "view" ... what you are calling a "view" is really a
"rendering" -- you are sending the data back to include the item in the
list of 10 items on the page, and the browser is rendering it, but that
doesn't mean the user is actually "viewing" it -- particularly in a
webpage type situation where only the first 3-5 results might actually
appear "above the fold" and the user has to scroll to see the rest.  Even
in a smaller UI element (like a left or right nav info box) there's no
guarantee that the user actually "views" any of the items, which can bias
things.

2) It doesn't take into account people who click on a result, decide it's
terrible, hit the back arrow and click on a different result -- both of
those wind up scoring "equally".  Some really complex session+click
analysis can overcome this, but not a lot of people have the resources to
do that all the time.

3) Ignoring #1 and #2 above (because I haven't found many better options)
you face the popularity problem -- or what my coworkers and I used to call
the "TRL Problem" back in the 90s:  MTV's Total Request Live was a Top X
countdown show of videos, featuring the most popular videos of the week
based on requests -- but it was also the number one show on the network,
occupying something like 4/24 broadcast hours of every day, when only a
total of 6/24 hours actually showed music videos.  So for the most part
the only videos people ever saw were on TRL, so those were the only
videos that ever got requested.

In a nutshell: once something becomes "popular" and is what everybody
sees, it stays popular, because it's what everybody sees and they don't
know that there is better stuff out there.

Even if everyone looks at the full list of results and actually reads all
of the first 10 summaries, in the absence of any other bias their
inclination is going to be to assume #1 is the best.  So they might click
on that even if another result on the list appears better based on their
opinion.

A variation that I did some experiments with, but never really refined
because I didn't have the time/energy to really go to town on it, is to
weight the "clicks" based on position:  a click on item #1 wouldn't be
worth anything -- it's the number one result, the expectation is that it
better get clicked or something is wrong.  A click on #2 is worth
something to that item, and a click on #3 is worth more to that item, and
so on ... so that if the #9 item gets a click, that's huge.  To do it
right, I think what you really want to do is penalize items that get views
but no clicks -- because if someone loads up results 1-10, and doesn't
click on any of them, that should be a vote in favor of moving all of them
"down" and moving item #11 up (even though it got no views or clicks).

But like i said: i never experimented with this idea enough to come up
with a good formula, or verify that the idea was sound.
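
Purely to make the idea concrete, one possible shape for it is sketched below in Python. The linear weight (position minus one) and the flat penalty of 0.5 are placeholders chosen for the example, not a validated formula; none was given above.

```python
def position_weight(position):
    """Placeholder weight: 0 for result #1, 1 for #2, 2 for #3, and so on."""
    return float(position - 1)


def update_scores(scores, shown_ids, clicked_id=None, no_click_penalty=0.5):
    """Adjust per-document scores after one result page was shown.

    shown_ids is the page in rank order. If nothing was clicked, every shown
    item is pushed down; otherwise the clicked item gains by its position.
    """
    if clicked_id is None:
        for doc_id in shown_ids:
            scores[doc_id] = scores.get(doc_id, 0.0) - no_click_penalty
        return scores
    position = shown_ids.index(clicked_id) + 1
    scores[clicked_id] = scores.get(clicked_id, 0.0) + position_weight(position)
    return scores


scores = {}
page = ["a", "b", "c"]
update_scores(scores, page, clicked_id="c")  # click on #3 is worth 2.0
update_scores(scores, page, clicked_id="a")  # click on #1 is worth nothing
update_scores(scores, page)                  # no click: all three pushed down
```

An item that was never shown (item #11 in the example above) keeps its score, so it rises relative to the penalized page.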

-Hoss