SolrJ ResponseParser Expectations/Compatibility

Jason Gerlowski
Hi all,

I recently spent some time on SOLR-15070 - a SolrJ ClassCastException
that cropped up for a customer hitting /suggest with wt=xml.
SOLR-15070 itself was a relatively straightforward fix, but the more I
think about the underlying cause, the more I wonder whether we have a
larger, fundamental problem in how our ResponseParsers/codecs work in
SolrJ.  I wanted to run it by everyone here.

My understanding of SolrJ's response serialization/deserialization was
that responses from Solr are supposed to deserialize to the same
NamedList on the client side, regardless of the wire format Solr used
to send the response. This guarantee is seemingly important for other
classes in SolrJ as well as for users who inspect NamedLists directly
- it would be a big burden if (e.g.) SolrJ's QueryResponse had to
handle the slight divergences of responses in 3 or 4 different
formats.

But in working on SOLR-15070 I noticed that ResponseParsers don't
even support the same "collection" types in their deserialization.
BinaryResponseParser (the default) creates structures containing
ArrayLists, SimpleOrderedMaps, NamedLists, and HashMaps.
XMLResponseParser only supports a subset of those: ArrayLists and
SimpleOrderedMaps.  Any API that triggers one of javabin's unique
mappings will therefore, by necessity, produce a different NamedList
structure on the client side.  The biggest problem in practice here
is probably "Map".
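
To make the divergence concrete, here is a minimal, JDK-only sketch of how
a ClassCastException of this kind can arise.  It does not use SolrJ:
PairList is a hypothetical stand-in for a NamedList-style structure, the
two parse methods are hypothetical models of a parser that keeps Map
values and one that does not, and the "suggest"/"mySuggester" names and
values are made up for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ParserDivergenceSketch {

    /** Hypothetical stand-in for a NamedList-style ordered list of name/value pairs. */
    public static final class PairList {
        private final List<Map.Entry<String, Object>> entries = new ArrayList<>();

        public PairList add(String name, Object value) {
            entries.add(new AbstractMap.SimpleImmutableEntry<>(name, value));
            return this;
        }

        /** Returns the first value stored under the given name, or null if absent. */
        public Object get(String name) {
            for (Map.Entry<String, Object> e : entries) {
                if (e.getKey().equals(name)) {
                    return e.getValue();
                }
            }
            return null;
        }
    }

    /** Models a parser that preserves a server-side Map as a java.util.Map. */
    public static PairList parseKeepingMaps() {
        Map<String, Object> suggest = new LinkedHashMap<>();
        suggest.put("mySuggester", Arrays.asList("term1", "term2"));
        return new PairList().add("suggest", suggest);
    }

    /** Models a parser with no Map type: the same data comes back as a PairList. */
    public static PairList parseWithoutMaps() {
        PairList suggest = new PairList().add("mySuggester", Arrays.asList("term1", "term2"));
        return new PairList().add("suggest", suggest);
    }

    /** Client code written (and tested) only against the Map-preserving shape. */
    @SuppressWarnings("unchecked")
    public static Map<String, Object> readSuggestions(PairList response) {
        return (Map<String, Object>) response.get("suggest");
    }

    public static void main(String[] args) {
        System.out.println(readSuggestions(parseKeepingMaps()).keySet());
        try {
            readSuggestions(parseWithoutMaps());
        } catch (ClassCastException e) {
            System.out.println("ClassCastException for the map-less shape");
        }
    }
}
```

The point of the sketch is only that the cast in readSuggestions is
correct for one wire format and wrong for the other, even though both
carry the same logical data.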

Is my understanding of how NamedList and ResponseParsers are supposed
to work correct?  If so, has anyone else noticed the divergence
between the supported output-types before?  Does anyone have opinions
on a fix?  Presumably this would require bringing our ser/de code for
other formats up to par with javabin, which seems like a big breaking
change for those formats.  Just fishing for general thoughts and
advice I guess.

Best,

Jason

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: SolrJ ResponseParser Expectations/Compatibility

David Smiley
Your understanding is, I think, not fully correct.  Only JavaBin & XML ought to round-trip a NamedList and some other structures, but not JSON which maps them in different ways (see the json.nl param).  Since you discovered this on wt=xml, you found a bug.  Undoubtedly, we need better testing here and consequently hardening to make javabin & XML more consistent.  We can only do so much for JSON.
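
As a rough illustration of why JSON is the odd one out, the sketch below renders the same ordered name/value pairs in three shapes modeled on the "flat", "map" and "arrarr" styles of json.nl.  It is a hand-rolled approximation for discussion, not Solr's writer: the JsonNlSketch class and its render method are hypothetical, and names are not JSON-escaped.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

public class JsonNlSketch {

    /** Renders parallel name/value lists in one of three json.nl-style shapes. */
    public static String render(String style, List<String> names, List<Integer> values) {
        if (names.size() != values.size()) {
            throw new IllegalArgumentException("names and values must have the same length");
        }
        switch (style) {
            case "flat": {
                // One array of alternating names and values.
                StringJoiner out = new StringJoiner(",", "[", "]");
                for (int i = 0; i < names.size(); i++) {
                    out.add("\"" + names.get(i) + "\"").add(String.valueOf(values.get(i)));
                }
                return out.toString();
            }
            case "map": {
                // One JSON object; repeated names become repeated keys.
                StringJoiner out = new StringJoiner(",", "{", "}");
                for (int i = 0; i < names.size(); i++) {
                    out.add("\"" + names.get(i) + "\":" + values.get(i));
                }
                return out.toString();
            }
            case "arrarr": {
                // An array of two-element [name, value] arrays.
                StringJoiner out = new StringJoiner(",", "[", "]");
                for (int i = 0; i < names.size(); i++) {
                    out.add("[\"" + names.get(i) + "\"," + values.get(i) + "]");
                }
                return out.toString();
            }
            default:
                throw new IllegalArgumentException("unsupported style: " + style);
        }
    }

    public static void main(String[] args) {
        // The name "a" is repeated on purpose: a NamedList allows that, a JSON object does not model it well.
        List<String> names = Arrays.asList("a", "b", "a");
        List<Integer> values = Arrays.asList(1, 2, 3);
        for (String style : Arrays.asList("flat", "map", "arrarr")) {
            System.out.println(style + ": " + render(style, names, values));
        }
    }
}
```

A client that reads the "map" shape with an ordinary JSON parser will typically keep only one of the repeated keys, which is one reason a JSON response cannot be expected to rebuild the original NamedList exactly.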

~ David Smiley
Apache Lucene/Solr Search Developer



Re: SolrJ ResponseParser Expectations/Compatibility

Jason Gerlowski
Ok, that makes sense.  I was aware of "json.nl" and figured it would
mean that lining up the NamedList produced by JSON and other formats
would be "best effort" at most.  This would be good to document in the
"Using SolrJ" ref-guide page.  I'll put that on my todo list.

But it's good to get confirmation that javabin and XML _are_ expected
to 'round-trip' a NamedList.  That implies that it's definitely a
design issue then that our XML ser/de code doesn't round-trip the
"Map" type the way javabin does.
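
One way such gaps could be caught in tests, sketched here with JDK types
only: parse the same response through two codecs, then walk both results
and report the first path where the container kind (map, list, scalar)
or a scalar value differs.  RoundTripCheckSketch and firstMismatch are
hypothetical names; a real check against SolrJ would also need a case
for NamedList, which this sketch deliberately leaves out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class RoundTripCheckSketch {

    /** Classifies a node as "map", "list" or "scalar". */
    private static String kind(Object o) {
        if (o instanceof Map) {
            return "map";
        }
        if (o instanceof List) {
            return "list";
        }
        return "scalar";
    }

    /** Returns the path of the first structural or value difference, or null if none is found. */
    public static String firstMismatch(Object a, Object b, String path) {
        if (!kind(a).equals(kind(b))) {
            return path;
        }
        if (a instanceof Map) {
            Map<?, ?> ma = (Map<?, ?>) a;
            Map<?, ?> mb = (Map<?, ?>) b;
            if (!ma.keySet().equals(mb.keySet())) {
                return path;
            }
            for (Map.Entry<?, ?> e : ma.entrySet()) {
                String found = firstMismatch(e.getValue(), mb.get(e.getKey()), path + "/" + e.getKey());
                if (found != null) {
                    return found;
                }
            }
            return null;
        }
        if (a instanceof List) {
            List<?> la = (List<?>) a;
            List<?> lb = (List<?>) b;
            if (la.size() != lb.size()) {
                return path;
            }
            for (int i = 0; i < la.size(); i++) {
                String found = firstMismatch(la.get(i), lb.get(i), path + "[" + i + "]");
                if (found != null) {
                    return found;
                }
            }
            return null;
        }
        return Objects.equals(a, b) ? null : path;
    }

    public static void main(String[] args) {
        // Two made-up parse results: one keeps a Map under "suggest", the other yields a List.
        Map<String, Object> fromCodecA = new LinkedHashMap<>();
        fromCodecA.put("suggest", new LinkedHashMap<String, Object>());
        Map<String, Object> fromCodecB = new LinkedHashMap<>();
        fromCodecB.put("suggest", new ArrayList<Object>());
        System.out.println(firstMismatch(fromCodecA, fromCodecB, ""));
    }
}
```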

Jason

On Sat, Jan 9, 2021 at 4:40 PM David Smiley <[hidden email]> wrote:

>
> Your understanding is, I think, not fully correct.  Only JavaBin & XML ought to round-trip a NamedList and some other structures, but not JSON which maps them in different ways (see the json.nl param).  Since you discovered this on wt=xml, you found a bug.  Undoubtedly, we need better testing here and consequently hardening to make javabin & XML more consistent.  We can only do so much for JSON.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Thu, Jan 7, 2021 at 11:01 AM Jason Gerlowski <[hidden email]> wrote:
>> [...]