Atomic update wrongly deletes child documents

Andreas Hubold

When I try to atomically update a single field of a parent/root
document, all of its nested child documents disappear (Solr 8.6.3).

I've tracked the problem down to the reconstruction of the original
document in DistributedUpdateProcessor#getUpdatedDocument. Solr
correctly finds existing nested documents, but skips them because our
schema has a catch-all dynamic field to ignore unknown fields:

<dynamicField name="*" type="ignored" />
<fieldType name="ignored" stored="false" indexed="false"
           multiValued="true" class="solr.StrField" />

(We use this to avoid errors about unknown fields from the indexing
application, and I'd like to keep that.)

However, this causes RealTimeGetComponent#toSolrInputDocument to ignore
nested documents (source at [1]). It looks up the name of the nested
document (loaded from _nest_path_) in the schema, and the catch-all
pattern resolves it to the "ignored" SchemaField.
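
To illustrate the lookup problem described above, here is a minimal, self-contained Java sketch. It is a simplified model and not Solr's actual code: ToySchema, FieldDef, reconstruct and the child key "children" are hypothetical names, and the skip rule (drop any key that resolves to a non-stored field) is an assumption based on the behavior I observed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CatchAllFieldDemo {

    // Hypothetical stand-in for a schema field: a name or glob pattern plus a stored flag.
    static final class FieldDef {
        final String pattern;
        final boolean stored;

        FieldDef(String pattern, boolean stored) {
            this.pattern = pattern;
            this.stored = stored;
        }

        // Supports only "*suffix" and "prefix*" globs; a bare "*" matches every name.
        boolean matches(String name) {
            if (pattern.startsWith("*")) {
                return name.endsWith(pattern.substring(1));
            }
            if (pattern.endsWith("*")) {
                return name.startsWith(pattern.substring(0, pattern.length() - 1));
            }
            return pattern.equals(name);
        }
    }

    // Toy schema: explicit fields are checked first, then dynamic fields in declaration order.
    static final class ToySchema {
        private final Map<String, FieldDef> explicit = new LinkedHashMap<>();
        private final List<FieldDef> dynamic = new ArrayList<>();

        ToySchema field(String name, boolean stored) {
            explicit.put(name, new FieldDef(name, stored));
            return this;
        }

        ToySchema dynamicField(String pattern, boolean stored) {
            dynamic.add(new FieldDef(pattern, stored));
            return this;
        }

        FieldDef getFieldOrNull(String name) {
            FieldDef f = explicit.get(name);
            if (f != null) {
                return f;
            }
            for (FieldDef d : dynamic) {
                if (d.matches(name)) {
                    return d;
                }
            }
            return null;
        }
    }

    // Assumed skip rule: a key that resolves to a non-stored field is dropped,
    // a key with no schema field at all (such as a child-document key) is kept.
    static Map<String, Object> reconstruct(Map<String, Object> doc, ToySchema schema) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            FieldDef f = schema.getFieldOrNull(e.getKey());
            if (f != null && !f.stored) {
                continue;
            }
            out.put(e.getKey(), e.getValue());
        }
        return out;
    }

    static ToySchema plainSchema() {
        return new ToySchema().field("id", true).field("title", true);
    }

    // Same schema plus the catch-all ignored field from the post.
    static ToySchema catchAllSchema() {
        return plainSchema().dynamicField("*", false);
    }

    // Parent document with one nested child under the hypothetical key "children".
    static Map<String, Object> sampleDoc() {
        Map<String, Object> child = new LinkedHashMap<>();
        child.put("id", "1-1");
        Map<String, Object> parent = new LinkedHashMap<>();
        parent.put("id", "1");
        parent.put("title", "parent");
        parent.put("children", Collections.singletonList(child));
        return parent;
    }

    public static void main(String[] args) {
        System.out.println("without catch-all: "
                + reconstruct(sampleDoc(), plainSchema()).keySet());
        System.out.println("with catch-all:    "
                + reconstruct(sampleDoc(), catchAllSchema()).keySet());
    }
}
```

In this model the "children" key survives reconstruction with the plain schema but is dropped as soon as the "*" pattern is added, because the lookup no longer returns null for it.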

I'm looking for a workaround now. Would it be possible to define some
field for the nested documents in the schema? This would be quite a hack
and I don't even know which field type to use. The reference guide does
not recommend such a definition, with good reason [2]:

> Even though child documents are provided as field values
> syntactically and with SolrJ, it’s a matter of syntax and it isn’t an
> actual field in the schema. Consequently, the field need not be defined
> in the schema and probably shouldn’t be as it would be confusing. There
> is no child document field type, at least not yet.

I haven't found anything about this conflict with ignored fields in the
documentation. Maybe it would make sense to treat ignored fields
(non-stored, non-indexed, ...) differently in RealTimeGetComponent? Or at
least document this as a caveat, together with a workaround?

Any hints appreciated.