Solr streaming expression - options for Full Outer Join

GaneshSe
I would like to perform a full outer join (emit documents from both the left
and right streams, and combine them where they share a key) with Solr
streaming decorators on two collections, and "update" the result into a new
destination collection. I see that the merge decorator exists, but it seems
to return two JSON documents for the same id field from the two collections
instead of one combined document. leftOuterJoin does combine correctly,
returning a single document for each matched "id" field, but it is not
exactly what I want: I want a full outer join. Because merge returns two
documents with the same id, only the second document ends up in the
destination collection, not both. Is there a way to achieve what I am trying
to do? Any help is appreciated. To give more details, here is what I am
doing:

commit(
    destinationCollection,
    batchSize=1,
    update(destinationCollection,
        batchSize=2,
        merge(
            search(col1, q=id:5, fl="id, collection1_field1 ", sort="duns asc", qt="/export"),
            search(col2, q=id:5, fl="id, collection2_field2 ", sort="id asc", qt="/export"),
            on="id asc")
    )
)


Merge Response
{
  "result-set": {
    "docs": [
      { "id": "5", "collection1_field1": 64 },
      { "duns": "5", "collection2_field2": 0 },
      { "EOF": true, "RESPONSE_TIME": 17 }
    ]
  }
}

But I need just one document in the response, with the fields combined.

Re: Solr streaming expression - options for Full Outer Join

GaneshSe
One typo in the streaming expression above: the sort for collection col1
is "id asc".

On Tue, Feb 13, 2018 at 1:38 PM, Ganesh Sethuraman <[hidden email]>
wrote:

>         merge(search(col1,q=id:5,fl="id, collection1_field1 ", sort="duns
> asc",qt="/export"),search(col2,q=id:5,fl="id, collection2_field2 ",
> sort="id asc",qt="/export"),on="id asc")

Re: Solr streaming expression - options for Full Outer Join

GaneshSe
I should also add that I am trying to do this on Solr 7.2.1.


Re: Solr streaming expression - options for Full Outer Join

GaneshSe
Any help with Solr streaming expression options is greatly appreciated. Please help.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr streaming expression - options for Full Outer Join

Joel Bernstein
If you aren't getting the join functionality you want with the current join
implementations, you could try the reduce function with the group
operation.

Here is the sample syntax:

reduce(search(collection1, q=*:*, fl="id,a_s,a_i,a_f", sort="a_s asc, a_f asc"),
       by="a_s",
       group(sort="a_f desc", n="4")
)


This is a basic map/reduce grouping operation.
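
To make the grouping step concrete, here is a minimal Python sketch that
models what reduce with group does to a sorted stream. It is an illustration
only, not Solr code: the function name reduce_group and its parameters are
hypothetical, and the output shape (the top-ranked tuple plus a "group" list
of up to n tuples) is an assumption about how the group operation behaves.

```python
from itertools import groupby
from operator import itemgetter


def reduce_group(tuples, by, sort_field, n, descending=True):
    """Model of reduce(..., by=..., group(sort=..., n=...)).

    Assumes `tuples` is already sorted by `by`, just as the underlying
    stream has to be sorted by the reduce key.
    Yields one tuple per key: a copy of the top-ranked tuple with an
    extra "group" field holding up to `n` ranked tuples.
    """
    for _, members in groupby(tuples, key=itemgetter(by)):
        ranked = sorted(members, key=itemgetter(sort_field), reverse=descending)[:n]
        head = dict(ranked[0])   # copy, so the original tuple is not modified
        head["group"] = ranked
        yield head


# Hypothetical sample data, sorted by a_s (the reduce key).
docs = [
    {"id": "1", "a_s": "x", "a_f": 1.0},
    {"id": "2", "a_s": "x", "a_f": 2.0},
    {"id": "3", "a_s": "x", "a_f": 3.0},
    {"id": "4", "a_s": "y", "a_f": 5.0},
]
grouped = list(reduce_group(docs, by="a_s", sort_field="a_f", n=2))
```

With this sample data, the "x" group keeps only its two highest a_f tuples,
and the "y" group contains its single tuple.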


Joel Bernstein
http://joelsolr.blogspot.com/

On Sun, Feb 18, 2018 at 6:24 PM, GaneshSe <[hidden email]> wrote:

> Any help solr streaming expression option is greatly appreciated. Please
> help

Re: Solr streaming expression - options for Full Outer Join

Joel Bernstein
I forgot to mention: in order to do a join, you would merge the streams
that you want to join, then reduce by the join key.
This is the basic structure:

reduce(merge(search(), search()))
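
The merge-then-reduce idea can be sketched in Python to show why it behaves
like a full outer join. This models the logic only and is not Solr code:
merge_sorted and full_outer_join are hypothetical names, the sample fields
are borrowed from the question, and flattening each group into one document
is an assumption about the desired result. In Solr itself, the group
operation appears to nest the matching tuples under a "group" field rather
than flattening them, so a further step may still be needed there.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_sorted(left, right, key):
    """Interleave two streams already sorted by `key`, like merge(..., on="id asc")."""
    return heapq.merge(left, right, key=itemgetter(key))


def full_outer_join(left, right, key):
    """Merge both streams, then fold each run of equal keys into one document.

    Keys present on only one side pass through unchanged, which is what
    makes this a full outer join.
    """
    for _, members in groupby(merge_sorted(left, right, key), key=itemgetter(key)):
        combined = {}
        for doc in members:
            combined.update(doc)  # on a field-name clash, the later stream wins
        yield combined


# Hypothetical sample data; both lists are sorted by id.
col1 = [{"id": "3", "collection1_field1": 7}, {"id": "5", "collection1_field1": 64}]
col2 = [{"id": "5", "collection2_field2": 0}, {"id": "9", "collection2_field2": 2}]
joined = list(full_outer_join(col1, col2, "id"))
```

Here id 5 appears in both inputs and comes out as one document carrying both
fields, while ids 3 and 9 each appear once with only their own field.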

Joel Bernstein
http://joelsolr.blogspot.com/

On Sun, Feb 18, 2018 at 10:45 PM, Joel Bernstein <[hidden email]> wrote:

> If you aren't getting the join functionality you want with the current
> join implementations you could try the reduce function using the group
> operation.