Mixing simple and nested docs in same update?

Jan Høydahl / Cominvent
Hi,

We index several large nested documents. We found that querying the data behaves differently depending on how the documents are indexed.

To reproduce:

solr start
solr create -c nested
# Index one plain document, "friend", and a nested one, "mother" with child "daughter", in the same request:
curl localhost:8983/solr/nested/update -d '
 <add>
   <doc>
     <field name="id">friend</field>
     <field name="type">other</field>
   </doc>
   <doc>
     <field name="id">mother</field>
     <field name="type">parent</field>
     <doc>
       <field name="id">daughter</field>
       <field name="type">child</field>
     </doc>
   </doc>
 </add>'

# Query for mother's children using either the child transformer or the child query parser (transformer shown here)
curl "localhost:8983/solr/nested/query?q=id:mother&fl=%2A%2C%5Bchild%20parentFilter%3Dtype%3Aparent%5D"
{
  "responseHeader":{
    "zkConnected":true,
    "status":0,
    "QTime":4,
    "params":{
      "q":"id:mother",
      "fl":"*,[child parentFilter=type:parent]"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"mother",
        "type":["parent"],
        "_version_":1589249812802306048,
        "type_str":["parent"],
        "_childDocuments_":[
        {
          "id":"friend",
          "type":["other"],
          "_version_":1589249812729954304,
          "type_str":["other"]},
        {
          "id":"daughter",
          "type":["child"],
          "_version_":1589249812802306048,
          "type_str":["child"]}]}]
  }}

As you can see, “friend” got included as a child of “mother”.
If you index the exact same request but put “friend” after “mother” in the XML,
the query works as expected.
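
For comparison, the reordered request described above could look roughly like this. It is only a sketch: it assumes the same `nested` collection on a local Solr at the default port, the helper names `reordered_payload` and `post_reordered` are hypothetical, and it adds an explicit Content-Type header and `commit=true`, neither of which is in the original command.

```shell
# Hypothetical helper: prints the same documents as in the original request,
# but with the nested "mother" block first and the plain "friend" document last.
reordered_payload() {
  cat <<'EOF'
<add>
  <doc>
    <field name="id">mother</field>
    <field name="type">parent</field>
    <doc>
      <field name="id">daughter</field>
      <field name="type">child</field>
    </doc>
  </doc>
  <doc>
    <field name="id">friend</field>
    <field name="type">other</field>
  </doc>
</add>
EOF
}

# Hypothetical helper: posts the payload to the "nested" collection.
# Assumes Solr is running locally on port 8983.
post_reordered() {
  reordered_payload | curl -H 'Content-Type: text/xml' --data-binary @- \
    "localhost:8983/solr/nested/update?commit=true"
}
```

Running `post_reordered` against a fresh collection and then repeating the child transformer query is how the two orderings can be compared.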

Inspecting the index, everything looks correct, and only “daughter” and “mother” have _root_=mother.
Is there a rule that you should start a new update request for each type of parent/child relationship
that you need to index, and not mix them in the same request?
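
The `_root_` observation can also be checked from the command line. The following is a minimal sketch, assuming `_root_` is an indexed field in the collection's schema and that the collection is still called `nested`; the helper names `root_query_url` and `list_block` are hypothetical.

```shell
# Hypothetical helper: builds a query URL selecting every document
# whose _root_ field equals the given root id.
root_query_url() {
  printf 'localhost:8983/solr/nested/query?q=_root_:%s&fl=id' "$1"
}

# Hypothetical helper: runs the query against a local Solr (assumed to be running).
list_block() {
  curl "$(root_query_url "$1")"
}
```

Calling `list_block mother` should list only the documents that belong to the block rooted at “mother”.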

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

Re: Mixing simple and nested docs in same update?

Jan Høydahl / Cominvent
Radio silence…

Here is a GIST for easy reproduction. Is this by design?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 11 Jan 2018, at 00:42, Jan Høydahl <[hidden email]> wrote: