Related Search

Rick Leir-2
Hi all,

There is an issue 'Create a Related Search Component' which has been
open for some years now.

It has a priority: major.

https://issues.apache.org/jira/browse/SOLR-2080


I discovered it linked from Lucidworks' very useful blog on ecommerce:

https://lucidworks.com/blog/2011/01/25/implementing-the-ecommerce-checklist-with-apache-solr-and-lucidworks/


Did people find a better way to accomplish Related Search? Perhaps MLT
http://wiki.apache.org/solr/MoreLikeThis ?

cheers -- Rick



Re: Related Search

Erick Erickson
Rick:

The priority isn't particularly helpful for two reasons:

1> it's the default so often gets set without intent.
2> what the originator thinks of as major may or may not translate
into someone actually doing work on it.

In this case there's a lot of work that'd need to be done. "some
model" just begs for clarification. In this case it might be a major
feature, but nobody's felt the need or had the time to put into making
it a reality. This is really just a topic for conversation at this
point....

Best,
Erick

In reply to Rick Leir's message of Mon, Oct 24, 2016 at 5:32 PM.

Re: Related Search

Grant Ingersoll-2
In reply to this post by Rick Leir-2
Hi Rick,

I typically do this stuff just by searching a different collection that I
create offline by analyzing query logs and then indexing them and searching.
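
A minimal sketch of that kind of offline job, assuming the query log has
already been grouped into per-user sessions. The sample sessions, the
min_count threshold, and the document field names "query" and "related" are
all illustrative assumptions, not anything Solr prescribes. It counts how
often two queries appear in the same session and emits one document per
query, ready to be indexed into a separate collection:

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_related_docs(sessions, min_count=2, top_n=5):
    """Turn per-session query lists into {query, related} documents."""
    pair_counts = Counter()
    for queries in sessions:
        # Count each unordered pair once per session, ignoring repeats.
        for a, b in combinations(sorted(set(queries)), 2):
            pair_counts[(a, b)] += 1

    related = defaultdict(list)
    for (a, b), count in pair_counts.items():
        if count >= min_count:
            related[a].append((count, b))
            related[b].append((count, a))

    docs = []
    for query, candidates in sorted(related.items()):
        # Highest co-occurrence first; ties broken alphabetically.
        candidates.sort(key=lambda item: (-item[0], item[1]))
        docs.append({"query": query,
                     "related": [q for _, q in candidates[:top_n]]})
    return docs

# Hypothetical sessions, for illustration only.
sessions = [
    ["solr mlt", "more like this", "related search"],
    ["related search", "more like this"],
    ["solr mlt", "more like this"],
    ["solr faceting"],
]
docs = build_related_docs(sessions)
```

At query time the front end would look up the user's query in that
collection and display the stored related queries.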


RE: Related Search

Markus Jelsma-2
In reply to this post by Rick Leir-2
Indeed, we run similar processes, one of which generates a 'related query collection' that simply contains a (normalized) query and its related queries. I don't see how this would be possible without continuously processing query and click logs.
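
As an illustration of the "(normalized)" part, here is a small sketch of one way raw log entries could be collapsed into a single key before related queries are attached to them. The specific rules (Unicode folding, lowercasing, punctuation to spaces) are assumptions for the example; a real pipeline would more likely reuse the analysis chain of the target field.

```python
import re
import unicodedata
from collections import Counter

def normalize_query(raw):
    """Collapse trivial variants of a query into one key."""
    text = unicodedata.normalize("NFKC", raw).lower()
    text = re.sub(r"[^\w\s]", " ", text)       # punctuation -> space
    return re.sub(r"\s+", " ", text).strip()   # squeeze whitespace

# Hypothetical raw log entries, for illustration only.
raw_log = ["More Like This", "more-like-this", "  more like this ", "MLT"]
counts = Counter(normalize_query(q) for q in raw_log)
```

With these rules the first three entries share the key "more like this", so their click and co-occurrence statistics accumulate in one place.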

M.
 
 
In reply to Grant Ingersoll's message of Tuesday 25th October 2016 23:51.

Re: Related Search

Trey Grainger
Yeah, the approach described by Grant and Markus is a common one. I've
worked on systems that mined query logs like this, and it works well
if you have sufficient query logs to pull it off.

There are a lot of linguistic nuances you'll encounter along the way,
including how you disambiguate homonyms and their related terms, how you
identify synonyms/acronyms as having the same underlying meaning, how you
parse and handle unknown phrases, how you remove noise present in the query
logs, and even how you weight the strength of the relationship between
related queries. I gave a presentation on this topic at Lucene/Solr
Revolution in 2015 if you're interested in learning more about how to build
such a system (
http://www.treygrainger.com/posts/presentations/leveraging-lucene-solr-as-a-knowledge-graph-and-intent-engine/
).
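
On the weighting question specifically, one simple option (an assumption
for illustration, not necessarily what the presentation uses) is pointwise
mutual information over sessions: a pair of queries scores high when they
occur together more often than their individual frequencies would predict.
A minimal sketch with made-up sessions:

```python
import math
from collections import Counter
from itertools import combinations

def pmi_scores(sessions):
    """Score query pairs by pointwise mutual information over sessions."""
    n = len(sessions)
    single = Counter()
    pair = Counter()
    for queries in sessions:
        unique = sorted(set(queries))
        single.update(unique)
        pair.update(combinations(unique, 2))
    scores = {}
    for (a, b), together in pair.items():
        # PMI = log( P(a, b) / (P(a) * P(b)) ), estimated per session.
        p_ab = together / n
        scores[(a, b)] = math.log(p_ab / ((single[a] / n) * (single[b] / n)))
    return scores

# Hypothetical sessions, for illustration only.
sessions = [
    ["laptop", "laptop bag"],
    ["laptop", "laptop bag"],
    ["laptop", "shoes"],
    ["shoes", "boots"],
]
scores = pmi_scores(sessions)
```

Here ("laptop", "laptop bag") gets a positive score and ("laptop", "shoes")
a negative one. Note that PMI favours rare pairs, so ("boots", "shoes"),
seen only once, scores highest; in practice a minimum-count cutoff is
usually combined with it.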

Another approach (also referenced in the above presentation), for those
with more of a cold-start problem with query logs, is to mine related terms
and phrases out of the underlying content in the search engine (inverted
index) itself. The Semantic Knowledge Graph, which was recently open sourced
by CareerBuilder and contributed back to Solr, enables such a capability
(disclaimer: I worked on it; it's available as both a Solr plugin and a
patch, but it's not ready to be committed into Solr yet). See
https://issues.apache.org/jira/browse/SOLR-9480 for the most current patch.

It is a request handler that can take in any query and discover the other
terms most related to that entire query from the inverted index, sorted by
strength of relationship to that query (it can also traverse from those
terms across fields/relationships to other terms, but that's probably
overkill for the basic related searches use case). Think of it as a way to
run a query and find the most relevant other keywords, as opposed to
finding the most relevant documents.
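
To make the idea concrete, here is a toy sketch in plain Python. It is not
the plugin's API or implementation: the in-memory index, the single-term
query, and the z-score used for ranking are all assumptions chosen for
illustration. It ranks terms by how over-represented they are in the
documents matching the query compared with the whole corpus:

```python
import math
from collections import defaultdict

def build_index(docs):
    """Map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in set(text.lower().split()):
            index[term].add(doc_id)
    return index

def related_terms(index, n_docs, query_term, top_n=3):
    """Rank other terms by over-representation in the query's matches."""
    foreground = index.get(query_term, set())
    if not foreground:
        return []
    scored = []
    for term, postings in index.items():
        if term == query_term:
            continue
        p = len(postings) / n_docs          # background probability
        if p >= 1.0:
            continue                        # term in every doc: no signal
        expected = len(foreground) * p
        observed = len(postings & foreground)
        z = (observed - expected) / math.sqrt(len(foreground) * p * (1 - p))
        scored.append((z, term))
    # Strongest relationship first; ties broken alphabetically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [term for z, term in scored[:top_n] if z > 0]

# Hypothetical corpus, for illustration only.
docs = [
    "java developer spring hibernate",
    "java engineer spring",
    "registered nurse hospital",
    "nurse practitioner hospital clinic",
    "java architect hibernate",
    "truck driver",
]
index = build_index(docs)
top = related_terms(index, len(docs), "java", top_n=2)
```

For the query term "java" this surfaces "hibernate" and "spring", while
terms such as "nurse" that never co-occur with it score below zero and are
dropped.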

Using this, you can then either return the related keywords as your related
searches, or you can modify your query to include them and power a
conceptual/semantic search instead of the pure text-based search you
started with. It's effectively a (better) way to implement More Like This:
instead of taking a document and using tf-idf to extract the
globally-interesting terms from it (as MLT does), you use a query to find
contextually-relevant keywords across many documents, score them based upon
their similarity to the original query, and then turn around and use the
most semantically-relevant terms as your related search(es).
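
A small sketch of the second option, folding related keywords back into
the query. It assumes you already have (term, weight) pairs from some
relatedness step; the weights and the expand_query helper are hypothetical.
The output uses the standard Lucene/Solr query syntax for quoted phrases
and ^ boosts:

```python
def expand_query(original, related, max_terms=3):
    """Build a boosted OR query from (term, weight) pairs."""
    def quote(text):
        # Escape embedded double quotes so the phrase stays well-formed.
        return '"%s"' % text.replace('"', '\\"')

    clauses = [quote(original)]
    for term, weight in related[:max_terms]:
        clauses.append("%s^%.2f" % (quote(term), weight))
    return " OR ".join(clauses)

# Hypothetical related terms and weights, for illustration only.
q = expand_query("java developer", [("spring", 0.8), ("hibernate", 0.6)])
```

This yields "java developer" OR "spring"^0.80 OR "hibernate"^0.60. Keeping
the boosts below 1 lets the original query dominate the ranking while the
related terms widen recall.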

I don't have near-term plans to expose the semantic knowledge graph as a
search component (it's a request handler right now), but once it's finished
that could certainly be done. Just wanted to mention it as another approach
to solve this specific problem.

-Trey Grainger
SVP of Engineering @ Lucidworks
Co-author, Solr in Action



In reply to Markus Jelsma's message of Wed, Oct 26, 2016 at 1:59 PM.