Use cases for the graph streams

Nightingale, Jonathan A (US)
This is kind of a broad question, but I was playing with the graph streams and had trouble making the tools work for what I wanted to do. I'm wondering whether the graph streams are really meant to support the standard graph queries you might write with Gremlin or the like. I ask because right now we have two implementations of our data storage to support these two ways of looking at the data: the standard query and the semantic filtering.

The use cases I usually see for the graph streams always seem to be limited to a one-link traversal for finding things related to nodes gathered from a query. But even with that, the best way to work with lists of docvalues wasn't clear. For example, to represent a node that had many doc values, I had to use cross products to make a node for each doc value. The traversal didn't seem to allow for that kind of node linking inherently.
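
To make the cross-product step concrete, here is a minimal sketch in plain Python, outside Solr. It turns one record with a multi-valued field into one edge per value. The field names `id` and `person_ids` and the helper `explode_edges` are hypothetical, chosen only for illustration; this is not how Solr handles docvalues internally.

```python
from typing import Any, Dict, Iterator, Tuple


def explode_edges(
    record: Dict[str, Any], id_field: str, multi_field: str
) -> Iterator[Tuple[str, str]]:
    """Yield one (source id, value) pair per value of a multi-valued field."""
    source = str(record[id_field])
    # A missing field simply produces no edges.
    for value in record.get(multi_field, []):
        yield (source, str(value))


# Hypothetical record: one document that references three people.
doc = {"id": "doc-1", "person_ids": ["p-1", "p-2", "p-3"]}
edges = list(explode_edges(doc, "id", "person_ids"))
```

Each resulting pair can then be treated as a single-valued edge, which is the shape a one-link traversal expects.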

So my question really is (and maybe this is not the place for this): what is the intent of these graph features, and what is the goal for them in the future? I was really hoping at one point to use only Solr for our product, but that didn't seem feasible, at least not easily.

Thanks for all your help
Jonathan

Jonathan Nightingale
GXP Solutions Engineer
(office) 315 838 2273
(cell) 315 271 0688

Re: Use cases for the graph streams

Joel Bernstein
Good question. Let me first point to an interesting example in the Visual
Guide to Streaming Expressions and Math Expressions:

https://github.com/apache/lucene-solr/blob/visual-guide/solr/solr-ref-guide/src/search-sample.adoc#nodes

This example gets to the heart of the core use case for the nodes
expression, which is to discover the relationships between nodes in a graph.
It's a discovery tool for learning something new about the data that you
can't see without this specific ability to walk the nodes in a
graph.
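
As a rough illustration of walking more than one link, the sketch below builds a nested nodes expression as a string, following the nesting pattern shown in the Solr Reference Guide's graph traversal section. The helper `nodes_expression` is hypothetical, and the collection and field names (`emails`, `from`, `to`) are sample values, not anything from this thread; the output should be checked against the reference guide for the Solr version in use.

```python
def nodes_expression(
    collection: str, root_value: str, from_field: str, gather_field: str, hops: int
) -> str:
    """Build a nodes() streaming expression that walks `hops` links outward.

    Assumption: values contain no double quotes; no escaping is done here.
    """
    if hops < 1:
        raise ValueError("hops must be at least 1")
    # Innermost walk: start from a literal root value in from_field.
    expr = (
        f'nodes({collection}, walk="{root_value}->{from_field}", '
        f'gather="{gather_field}")'
    )
    # Each extra hop wraps the previous expression and walks from its
    # gathered nodes (the "node" field) back into from_field.
    for _ in range(hops - 1):
        expr = (
            f'nodes({collection}, {expr}, walk="node->{from_field}", '
            f'gather="{gather_field}")'
        )
    return expr


two_hops = nodes_expression("emails", "johndoe@apache.org", "from", "to", 2)
```

The resulting string would be sent to the collection's /stream handler as the `expr` parameter.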

In the broader context the nodes expression is part of a much wider set of
tools that allow people to use Solr to explore the relationships in their
data. This is described here:

https://github.com/apache/lucene-solr/blob/visual-guide/solr/solr-ref-guide/src/math-expressions.adoc

The goal of all this is to move search engines beyond basic aggregations to
study the correlations and relationships within the data set.

Graph traversal is part of this broader goal which will get developed more
over time. I'd be interested in hearing more about specific graph use cases
that you're interested in solving.

Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, May 20, 2020 at 12:32 PM Nightingale, Jonathan A (US) <
[hidden email]> wrote:

RE: Use cases for the graph streams

Nightingale, Jonathan A (US)
Without getting too far into the weeds with our product, we have a bunch of Solr records that represent entities and their relationships to other entities or files. For example, a document may describe a bunch of people. We have entries for the people as well as the document. We also have entries that represent connections between these people based on things described in the document.

In those Solr records we have a bunch of docIds as references in each record that let us link them. We build a graph in a separate graph store so we can do traversals on it. We do semantic filtering, such as finding people nodes that are related to the people nodes described in this document. That relationship can be expanded to allow for longer walks on the graph, for example finding everything up to N steps from this person.

We also allow viewing all nodes on the graph that are within N steps of a target node. The user can explore the information that way: when they shift to another node, we display the new subgraph around their new focus node.
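
The N-step view described above can be sketched as a breadth-first walk. This is a minimal in-memory illustration in plain Python, not our graph store and not a Solr feature; the adjacency map, the node ids, and the function name `neighborhood` are all made up for the example.

```python
from collections import deque
from typing import Dict, List


def neighborhood(
    adjacency: Dict[str, List[str]], start: str, max_steps: int
) -> Dict[str, int]:
    """Return every node within max_steps of start, mapped to its hop distance."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # Do not expand nodes that already sit at the step limit.
        if distances[node] == max_steps:
            continue
        for neighbor in adjacency.get(node, []):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


# Hypothetical graph: one document linked to people, who link to each other.
graph = {
    "doc-1": ["p-1", "p-2"],
    "p-1": ["doc-1", "p-3"],
    "p-2": ["doc-1"],
    "p-3": ["p-1", "p-4"],
    "p-4": ["p-3"],
}
within_two = neighborhood(graph, "doc-1", 2)
```

Shifting focus to another node is just another call with a new `start`, which yields the new subgraph around that node.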

So those are the kinds of things I was hoping to do with the gather nodes functions in Solr, but I couldn't find a simple way to do it.

Jonathan

-----Original Message-----
From: Joel Bernstein <[hidden email]>
Sent: Thursday, May 21, 2020 9:57 AM
To: [hidden email]
Subject: Re: Use cases for the graph streams
