Streaming expression API innerJoin on multi-valued field


Marc Röttig
Dear SOLR users,

I want to use the streaming expression innerJoin on a multi-valued field to join by equality. That is, given any number of child documents (of type "child") and
one parent document (of type "parent"), I want to join them where the child's id_s equals one of the values in the parent's children_ids_ss.

Parent
* id_s = "p123"
* type_s = "parent"
* children_ids_ss = ["c1", "c2"]

Child
* id_s = "c1"
* type_s = "child"

Child
* id_s = "c2"
* type_s = "child"

innerJoin(
   search(collection,q="type_s:child",fl="id_s",sort="id_s ASC"),
   search(collection,q="type_s:parent",fl="id_s,children_ids_ss",sort="id_s ASC"),
   on="id_s=children_ids_ss"
)

This seems to be impossible: I get the exception "java.util.ArrayList cannot be cast to java.lang.Comparable". With a GraphQuery using from and to,
this relationship traversal along multi-valued fields worked (though not across shards, which is why I switched to streaming expressions).
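
As a rough Python analogy of why the join fails: the exception suggests that innerJoin orders the join keys of two sorted streams with a comparator, and a list-valued key has no ordering against a string. The tuples and the helper names compare_keys and try_compare below are made up for illustration; this is not Solr code.

```python
# Illustrative only: mimics an ordering comparison between join keys.
child = {"id_s": "c1"}
parent = {"id_s": "p123", "children_ids_ss": ["c1", "c2"]}


def compare_keys(left, right):
    """Return -1, 0 or 1, the way a merge-style join orders two keys."""
    return (left > right) - (left < right)


def try_compare(left, right):
    """Compare two keys, reporting the failure instead of raising."""
    try:
        return compare_keys(left, right)
    except TypeError as exc:
        # A string and a list have no defined ordering, much like
        # ArrayList not being Comparable in the Java exception.
        return f"cannot compare: {exc}"


single = try_compare(child["id_s"], "c1")                       # two strings
multi = try_compare(child["id_s"], parent["children_ids_ss"])   # string vs list
```

Here single is an ordinary comparison result, while multi carries the error message.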

Is there any mechanism to flatten the tuples with the multi-valued field into new tuples with single-valued fields so that the join works? Or is there any other workaround?

Note: The relationship between Parent and Child is many-to-many, thus moving the foreign-keys to the children as single-valued fields is not possible.

The issue is related to the following issue: http://lucene.472066.n3.nabble.com/Using-multi-valued-field-in-solr-cloud-Graph-Traversal-Query-td4324379.html

Thanks a lot in advance for any assistance,
Marc


Dr. Marc Röttig
Software Developer
Email: [hidden email]
Phone: +49(0)711. 78 78 29-290
Fax: +49(0)711. 78 78 29-10

VICO Research & Consulting GmbH
Friedrich-List-Strasse 46 / 70771 Leinfelden-Echterdingen

Homepage: www.vico-research.com/
Blog: www.vico-research.com/expert-talk
Twitter: www.twitter.com/vico_news
Facebook: www.facebook.com/vico.friend
Registered office: Leinfelden-Echterdingen
Commercial register: Amtsgericht Stuttgart, HRB 720896
Managing Director: Marc Trömel


Re: Streaming expression API innerJoin on multi-valued field

Joel Bernstein
The cartesianProduct stream can be wrapped around the stream with the
multi-valued field. The cartesianProduct function is available in Solr 6.6,
but since it was a late addition, the documentation does not appear until
Solr 7.0.

Here is a link to the docs in github:
https://github.com/apache/lucene-solr/blob/branch_7_0/solr/solr-ref-guide/src/stream-decorators.adoc

The first stream decorator in the docs is cartesianProduct.
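
To illustrate the flattening idea, here is a minimal Python sketch of what a cartesian product over one multi-valued field does to a tuple. The function name cartesian_product and the pass-through handling of single-valued or missing fields are assumptions made for this example; Solr's own decorator may treat those edge cases differently.

```python
def cartesian_product(tuples, field):
    """Yield one copy of each tuple per value of the multi-valued field."""
    for t in tuples:
        values = t.get(field)
        if not isinstance(values, list):
            # Assumption: single-valued or missing fields pass through unchanged.
            yield dict(t)
            continue
        for v in values:
            flat = dict(t)      # copy all other fields
            flat[field] = v     # replace the list with a single value
            yield flat


parents = [{"id_s": "p123", "children_ids_ss": ["c1", "c2"]}]
flattened = list(cartesian_product(parents, "children_ids_ss"))
```

With the parent document from the question, flattened holds two tuples, one per child id, each carrying the same id_s.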

Since you can't sort on the multi-valued field, though, you'll have to use a
hashJoin to do the join.
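
A minimal Python sketch of the hash-join idea, for illustration only: one side is loaded into an in-memory table keyed by its join field, so neither side has to be sorted on that field. The function name hash_join and the field parent_id_s are hypothetical; the sketch renames the parent's id so the two sides do not share a field name, and how Solr resolves such a clash is not covered here.

```python
from collections import defaultdict


def hash_join(stream, hashed, left_key, right_key):
    """Join `stream` against `hashed`, which is read fully into memory."""
    table = defaultdict(list)
    for h in hashed:
        table[h[right_key]].append(h)
    for t in stream:
        # Look up matches by key; no ordering of either side is needed.
        for match in table.get(t[left_key], []):
            yield {**match, **t}


children = [{"id_s": "c1"}, {"id_s": "c2"}, {"id_s": "c3"}]
# Parent tuples after flattening the multi-valued field (hypothetical data).
flat_parents = [
    {"parent_id_s": "p123", "children_ids_ss": "c1"},
    {"parent_id_s": "p123", "children_ids_ss": "c2"},
]
joined = list(hash_join(children, flat_parents, "id_s", "children_ids_ss"))
```

Only c1 and c2 find a match; c3 is dropped, as expected for an inner join. The memory cost is driven by the hashed side, so the smaller stream is the natural choice for it.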


Joel Bernstein
http://joelsolr.blogspot.com/


Re: Streaming expression API innerJoin on multi-valued field

Marc Röttig
Dear Mr. Bernstein, SOLR-users,

thanks a lot for your valuable hint regarding the cartesianProduct operator.

The following streaming expression gives me the desired result tuples:

hashJoin(
  search(collection,q="type_s:child",fl="id_s",sort="id_s ASC",rows=1000000),
  hashed=cartesianProduct(
    search(collection,q="type_s:parent AND id_s:p123",fl="children_ids_ss,id_s",sort="id_s ASC"),
    children_ids_ss
  ),
  on="id_s=children_ids_ss"
)

where the inner

search(collection,q="type_s:parent AND id_s:p123",fl="children_ids_ss,id_s",sort="id_s ASC")

ideally delivers from one to a few (say 1000) tuples, which hopefully keeps the hashJoin from becoming
slow or even impossible. The outer search will yield quite a lot of tuples, which
should be fine, though.
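
For anyone who wants to try the expression from a script, here is a small Python sketch that builds a request URL for the collection's /stream handler. The base URL http://localhost:8983/solr, the collection name "collection" and the helper stream_url are assumptions for illustration; adjust them to your installation.

```python
from urllib.parse import urlencode

EXPR = """hashJoin(
  search(collection,q="type_s:child",fl="id_s",sort="id_s ASC",rows=1000000),
  hashed=cartesianProduct(
    search(collection,q="type_s:parent AND id_s:p123",fl="children_ids_ss,id_s",sort="id_s ASC"),
    children_ids_ss
  ),
  on="id_s=children_ids_ss"
)"""


def stream_url(base_url, collection, expr):
    """Build the /stream URL with the expression URL-encoded as `expr`."""
    return f"{base_url}/{collection}/stream?" + urlencode({"expr": expr})


# Assumed local Solr address and collection name.
url = stream_url("http://localhost:8983/solr", "collection", EXPR)
```

The resulting url can be fetched with any HTTP client, for example urllib.request.urlopen(url); the sketch itself makes no network call.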

Cheers,
Marc


-----Original Message-----
From: Marc Röttig [mailto:[hidden email]]
Sent: Wednesday, September 6, 2017 10:08
To: [hidden email]
Subject: Streaming expression API innerJoin on multi-valued field
