Graph Traversal Question

Graph Traversal Question

Grant Ingersoll-2
Hi,

I'm playing around with the new Graph Traversal/GatherNodes capabilities in
Solr 6.  I've been indexing Yago facts (
http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/downloads/)
which give me triples of something like subject-relationship-object (United
States -> hasCapital -> Washington DC)

My documents look like:
subject: string
relationship: string
object: string

I can do a simple gatherNodes like
http://localhost:8983/solr/default/graph?expr=gatherNodes(default,
walk="United_States->subject", gather="object") and get back the objects
that relate to the subject.  However, I don't see any way to capture what
the relationship is in the response.  IOW, the request above would just
return a node of "Washington DC", but it doesn't tell me the relationship
(i.e. I'd like to get Wash DC and hasCapital back somehow).  Is there
any way to expand the "gather" or otherwise mark up the nodes returned with
additional field attributes or maybe get additional graph info back?

Thanks,
Grant
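
For readers who want to reproduce the request above, here is a minimal sketch of building the URL with the expression properly encoded. The helper name graph_url is hypothetical; the host, collection, and field names are the ones from the message, and nothing is sent over the network.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def graph_url(base, collection, root, walk_field, gather_field):
    """Build a /graph request URL for a single-root gatherNodes expression.

    graph_url is a hypothetical helper, not part of any Solr client library.
    """
    expr = (
        f'gatherNodes({collection}, walk="{root}->{walk_field}", '
        f'gather="{gather_field}")'
    )
    # urlencode percent-encodes the quotes, '>' and parentheses in the expression.
    return f"{base}/{collection}/graph?" + urlencode({"expr": expr})


url = graph_url("http://localhost:8983/solr", "default",
                "United_States", "subject", "object")
print(url)

# Decoding the query string recovers the original expression.
decoded_expr = parse_qs(urlsplit(url).query)["expr"][0]
print(decoded_expr)
```

Encoding the expression avoids problems with the quotes and the "->" when the URL is pasted into a shell or a client library.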

Re: Graph Traversal Question

Yonik Seeley
You can get the nodes each result was reached from by adding trackTraversal=true

A cut'n'paste example from my Lucene/Solr Revolution slides:

curl $URL -d 'expr=gatherNodes(reviews,
   search(reviews, q="user_s:Yonik AND rating_i:5",
          fl="book_s,user_s,rating_i",sort="user_s asc"),
   walk="book_s->book_s",
   gather="user_s",
   fq="rating_i:[4 TO *] -user_s:Yonik",
   trackTraversal=true )'

{"result-set":{"docs":[
{"node":"Haruka","collection":"reviews","field":"user_s","ancestors":["book1"],"level":1},
{"node":"Maria","collection":"reviews","field":"user_s","ancestors":["book2"],"level":1},
{"EOF":true,"RESPONSE_TIME":22}]}}

-Yonik
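
A minimal sketch of reading that response on the client side, assuming the JSON shape shown above. The function name nodes_with_ancestors is hypothetical; the sample text is copied from the output in this message.

```python
import json

# Sample response copied from the trackTraversal=true output above.
response_text = '''{"result-set":{"docs":[
{"node":"Haruka","collection":"reviews","field":"user_s","ancestors":["book1"],"level":1},
{"node":"Maria","collection":"reviews","field":"user_s","ancestors":["book2"],"level":1},
{"EOF":true,"RESPONSE_TIME":22}]}}'''


def nodes_with_ancestors(text):
    """Map each gathered node to its ancestors, skipping the trailing EOF tuple."""
    docs = json.loads(text)["result-set"]["docs"]
    return {d["node"]: d.get("ancestors", []) for d in docs if not d.get("EOF")}


result = nodes_with_ancestors(response_text)
print(result)
```

The last tuple in the stream carries only EOF and timing information, so it is filtered out before the mapping is built.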



Re: Graph Traversal Question

Joel Bernstein
Because the edges are unique on the subject->object there isn't currently a
way to capture the relationship. Aggregations can be rolled up on numeric
fields and as Yonik mentioned you can track the ancestor.

It would be fairly easy to track the relationship by adding a relationships
array that would correspond with the ancestors array, for example:

{"result-set":{"docs":[
{"node":"Haruka","collection":"reviews","field":"user_s","ancestors":["book1"],"relationships":["author"],"level":1},
{"node":"Maria","collection":"reviews","field":"user_s","ancestors":["book2"],"relationships":["author"],"level":1},
{"EOF":true,"RESPONSE_TIME":22}]}}

Joel Bernstein
http://joelsolr.blogspot.com/
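
If a response like the one proposed above existed, a client could pair the two arrays by index to recover labeled edges. The sketch below assumes that proposed shape: the "relationships" field is hypothetical (the message says it is not currently available), and labeled_edges is an illustrative name.

```python
def labeled_edges(doc):
    """Pair each ancestor with the relationship at the same index.

    Returns (ancestor, relationship, node) tuples. The "relationships"
    field is the proposed, not-yet-existing array from the message above.
    """
    ancestors = doc.get("ancestors", [])
    relationships = doc.get("relationships", [])
    return [(a, r, doc["node"]) for a, r in zip(ancestors, relationships)]


# Tuple copied from the proposed response format above.
doc = {"node": "Haruka", "collection": "reviews", "field": "user_s",
       "ancestors": ["book1"], "relationships": ["author"], "level": 1}
edges = labeled_edges(doc)
print(edges)
```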


Re: Graph Traversal Question

Grant Ingersoll-2
In reply to this post by Yonik Seeley
On Tue, Oct 25, 2016 at 6:26 PM Yonik Seeley <[hidden email]> wrote:

> You can get the nodes each result was reached from by adding trackTraversal=true
>

Yeah, I've tried that.  It's not quite what I want.  That just gets me the
"subject".

What I'm trying to do is more akin to what a triple store does.

I _can_ do things like filter on the relationship, which is a good start,
but I want the relationship and the object together so that I can do
downstream work on it.

In your example below it would be akin to injecting the rating onto those
responses as well, not just in the 'fq'.


>
> A cut'n'paste example from my Lucene/Solr Revolution slides:
>
> curl $URL -d 'expr=gatherNodes(reviews,
>    search(reviews, q="user_s:Yonik AND rating_i:5",
>           fl="book_s,user_s,rating_i",sort="user_s asc"),
>    walk="book_s->book_s",
>    gather="user_s",
>    fq="rating_i:[4 TO *] -user_s:Yonik",
>    trackTraversal=true )'
>
> {"result-set":{"docs":[
> {"node":"Haruka","collection":"reviews","field":"user_s","ancestors":["book1"],"level":1},
> {"node":"Maria","collection":"reviews","field":"user_s","ancestors":["book2"],"level":1},
> {"EOF":true,"RESPONSE_TIME":22}]}}
>
> -Yonik

Re: Graph Traversal Question

Grant Ingersoll-2
In reply to this post by Joel Bernstein
On Tue, Oct 25, 2016 at 6:46 PM Joel Bernstein <[hidden email]> wrote:

> Because the edges are unique on the subject->object there isn't currently a
> way to capture the relationship. Aggregations can be rolled up on numeric
> fields and as Yonik mentioned you can track the ancestor.
>
> It would be fairly easy to track the relationship by adding a relationship
> array that would correspond with the ancestors array for example:
>
> {"result-set":{"docs":[
> {"node":"Haruka","collection":"reviews","field":"user_s","ancestors":["book1"],"relationships":["author"],"level":1},
> {"node":"Maria","collection":"reviews","field":"user_s","ancestors":["book2"],"relationships":["author"],"level":1},
> {"EOF":true,"RESPONSE_TIME":22}]}}
>

Right, that is what I am after!



Re: Graph Traversal Question

Yonik Seeley
In reply to this post by Grant Ingersoll-2
On Wed, Oct 26, 2016 at 7:13 AM, Grant Ingersoll <[hidden email]> wrote:
> On Tue, Oct 25, 2016 at 6:26 PM Yonik Seeley <[hidden email]> wrote:
>
> In your example below it would be akin to injecting the rating onto those
> responses as well, not just in the 'fq'.

Gotcha... Yeah, I remember wondering how to do that myself.

-Yonik

Re: Graph Traversal Question

Grant Ingersoll-2
Another way to think about it: I want to put labels on the edges.  In my
case, the label is the relationship, in your case, the label is the rating
or author.
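
To make "labels on the edges" concrete, here is a small client-side sketch that turns subject/relationship/object documents into a labeled adjacency map. It is not a Solr feature; it assumes the triple documents have already been fetched some other way. build_labeled_graph is a hypothetical name, and the second triple is made up for illustration.

```python
from collections import defaultdict


def build_labeled_graph(triples):
    """Map each subject to a list of (relationship, object) labeled edges."""
    graph = defaultdict(list)
    for t in triples:
        graph[t["subject"]].append((t["relationship"], t["object"]))
    return dict(graph)


triples = [
    # From the example earlier in the thread.
    {"subject": "United_States", "relationship": "hasCapital",
     "object": "Washington_DC"},
    # Made-up triple, for illustration only.
    {"subject": "United_States", "relationship": "isLocatedIn",
     "object": "North_America"},
]
graph = build_labeled_graph(triples)
print(graph["United_States"])
```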


Re: Graph Traversal Question

Joel Bernstein
Grant, can you describe your use case? Currently we can filter on the
relationship using a filter query. So I was wondering what use case would
involve retrieving the relationship. Are you looking to discover what
relationships are available? One of the assumptions I made was that users
would know what relationships they wanted to traverse.



Joel Bernstein
http://joelsolr.blogspot.com/
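
A sketch of the filter-query approach described above, applied to the subject/relationship/object layout from the original question. It only builds the expression string. The fq parameter is the one used in the gatherNodes example earlier in the thread; the helper name gather_with_relationship is hypothetical, and it assumes the relationship value needs no query escaping.

```python
def gather_with_relationship(collection, root, relationship):
    """Build a gatherNodes expression restricted to one relationship via fq.

    Assumes documents have subject, relationship and object fields, and that
    the relationship value contains no characters needing query escaping.
    """
    return (
        f'gatherNodes({collection}, walk="{root}->subject", '
        f'gather="object", fq="relationship:{relationship}")'
    )


expr = gather_with_relationship("default", "United_States", "hasCapital")
print(expr)
```

Because the relationship is fixed in the request, the caller already knows the label for every node that comes back, which is why this works when the relationships to traverse are known up front.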


Re: Graph Traversal Question

Grant Ingersoll-2
On Wed, Oct 26, 2016 at 10:46 AM Joel Bernstein <[hidden email]> wrote:

> Grant, can you describe your use case? Currently we can filter on the
> relationship using a filter query. So I was wondering what use case would
> involve retrieving the relationship. Are you looking to discover what
> relationships are available? One of the assumptions I made was that users
> would know what relationships they wanted to traverse.
>
>
Some of this is admittedly a thought experiment of what's possible, but I
think when dealing w/ graph operations it's pretty natural to use edge
attributes as part of your calculation.  The most obvious use case of that
is a weighted graph where the edge attribute is a numerical weight (e.g. in
Yonik's example: sort/rank by rating).  For me, I'm exploring how to use KB
data (Yago, which is basically RDF triples) as part of relevance and to
answer questions.  These are commonly done in a triple store (RDF engine),
but w/ this graph stuff in Solr, I think it could be possible to do in Solr
(and quite simply at that), which significantly simplifies the overall
system.
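
As a rough illustration of the weighted-graph case, the sketch below ranks gathered nodes by the total weight of their incoming edges. It runs purely on the client and assumes the edge weights (ratings here) are already available, which is exactly what the traversal response does not provide today. The function name and the sample ratings are made up.

```python
def rank_by_weight(edges):
    """Rank target nodes by the summed weight of their incoming edges.

    edges is an iterable of (source, target, weight) tuples. Ties are
    broken alphabetically by target name.
    """
    totals = {}
    for _source, target, weight in edges:
        totals[target] = totals.get(target, 0) + weight
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up (book, user, rating) edges, for illustration only.
edges = [("book1", "Haruka", 5), ("book2", "Maria", 4), ("book3", "Haruka", 4)]
ranked = rank_by_weight(edges)
print(ranked)
```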

