Capturing URL params for use within Streaming Expressions

Capturing URL params for use within Streaming Expressions

Houston Putman
Streaming expressions let users pass arbitrary Solr request params as named parameters of the search stream source. I'd like to add the ability for certain streaming functions, maybe just "search()" but possibly more, to pick up the extra URL params sent along with the streaming request.

For example, sending this request:
http://localhost:8983/solr/example/stream?expr=search(collection1, q="*:*", fl="id", sort="id")&shards.preference=replica.location:local

would be equivalent to:
http://localhost:8983/solr/example/stream?expr=search(collection1, q="*:*", fl="id", sort="id", shards.preference="replica.location:local")

The beauty of URL params is that they can easily be overridden and checked, for example in a proxy between the user and Solr. This is harder to do with streaming expressions, because the proxy would need to parse the expression and know the logic of the functions and sources.
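To illustrate the point about proxies, here is a minimal Python sketch of how a proxy could force a URL param without understanding the expression at all. The function name override_param and the replica.type:PULL starting value are only illustrative assumptions, not part of Solr.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def override_param(url: str, name: str, value: str) -> str:
    """Return url with the query parameter `name` set to exactly one value."""
    parts = urlsplit(url)
    # Drop any existing occurrences of the parameter, keep everything else.
    pairs = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != name
    ]
    pairs.append((name, value))
    # urlencode percent-encodes the values, so the expression survives intact.
    return urlunsplit(parts._replace(query=urlencode(pairs)))


request = (
    "http://localhost:8983/solr/example/stream"
    '?expr=search(collection1, q="*:*", fl="id", sort="id")'
    "&shards.preference=replica.type:PULL"
)
rewritten = override_param(request, "shards.preference", "replica.location:local")
```

The proxy treats the expr value as an opaque string; only the top-level query parameters are inspected.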

I'm open to discussion on whether the params that a streaming function can capture should be white-listed or black-listed. My idea is to implement this generically through something like the StreamContext, so that any streaming function that wants this functionality can opt in.
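A rough Python sketch of the white-list variant follows. The StreamContext class here is a hypothetical stand-in, not the SolrJ class, and the names SEARCH_ALLOWED and effective_params are made up for illustration. Letting values written in the expression win over captured URL params is also an assumption; the opposite precedence may suit the proxy use case better.

```python
from dataclasses import dataclass, field


@dataclass
class StreamContext:
    """Hypothetical stand-in for a context object carrying the request's URL params."""
    request_params: dict = field(default_factory=dict)


# Hypothetical white-list for the search source.
SEARCH_ALLOWED = frozenset({"shards.preference"})


def effective_params(expr_params: dict, context: StreamContext, allowed) -> dict:
    """Merge white-listed URL params with the params written in the expression."""
    captured = {
        name: value
        for name, value in context.request_params.items()
        if name in allowed
    }
    # Assumption: explicit expression params take precedence over URL params.
    return {**captured, **expr_params}
```

Each stream source would supply its own white-list, so unrelated URL params such as wt are never captured.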

Another option is to add a URL parameter such as &expr.override.search.shards.preference=replica.location:local (expr.override.<function>.<parameter>=<value>). That way it's explicit that the user is trying to send options to the streaming expression, and extraneous URL params aren't accidentally captured when they were included for a different purpose.
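The explicit naming scheme could be parsed along these lines. This is only a sketch in Python; parse_overrides is a hypothetical helper, and it assumes function names contain no dots while parameter names (such as shards.preference) may.

```python
from urllib.parse import parse_qsl

PREFIX = "expr.override."


def parse_overrides(query: str) -> dict:
    """Group expr.override.<function>.<parameter>=<value> params by function."""
    overrides = {}
    for key, value in parse_qsl(query, keep_blank_values=True):
        if not key.startswith(PREFIX):
            continue  # ordinary URL param, not meant for the expression
        # Split on the first dot only: the parameter name may itself contain dots.
        function, sep, param = key[len(PREFIX):].partition(".")
        if not sep or not function or not param:
            continue  # malformed override key, ignored in this sketch
        overrides.setdefault(function, {})[param] = value
    return overrides


query = (
    'expr=search(collection1, q="*:*", fl="id", sort="id")'
    "&expr.override.search.shards.preference=replica.location:local"
    "&wt=json"
)
overrides = parse_overrides(query)
```

Here overrides maps "search" to its parameter overrides, and params like wt are left alone.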

Anyway, this would really help us with some use cases, especially the replica routing options used in the example above. I'm really interested to hear opinions on either of these options.

- Houston Putman

Re: Capturing URL params for use within Streaming Expressions

Joel Bernstein
I think it's fine to pass through parameters to the various stream sources. Perhaps we should restrict this to a fixed list of pass-through parameters, just to limit the scope.




(In reply to Houston Putman's message of Wed, Oct 16, 2019 at 4:47 PM.)

Re: Capturing URL params for use within Streaming Expressions

Houston Putman
Agreed, I think that each stream source should be able to pick the parameters it wants to accept.

After investigating the use case I ultimately wanted this for, the "shards.preference" parameter above, I think it branches out into an additional feature beyond just passing URL params to stream sources.

The "TupleStream.getShards()" method should pick replicas according to the same logic that the "HttpShardHandlerFactory.NodePreferenceRulesComparator" class provides. This is loosely related to SOLR-12217, as both would require the NodePreferenceRulesComparator logic to be migrated to SolrJ.
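As a rough idea of what preference-based replica selection means, here is a simplified Python sketch. It is not the NodePreferenceRulesComparator implementation: the replica dictionaries, the local_base argument, and the helper names are assumptions, and it handles only the replica.type and replica.location rules, assuming earlier rules take priority.

```python
def matches(replica: dict, rule: str, local_base: str) -> bool:
    """Check one 'name:value' preference rule against a replica."""
    # Split on the first colon only, since a location value can be a URL prefix.
    name, _, value = rule.partition(":")
    if name == "replica.type":
        return replica["type"].upper() == value.upper()
    if name == "replica.location":
        prefix = local_base if value == "local" else value
        return replica["url"].startswith(prefix)
    return False  # rules this sketch does not know are treated as non-matching


def sort_replicas(replicas: list, preference: str, local_base: str) -> list:
    """Order replicas so that those matching earlier rules come first."""
    rules = [rule for rule in preference.split(",") if rule]
    # False sorts before True, so a match (not matches == False) ranks first.
    # sorted() is stable, so replicas that tie keep their original order.
    return sorted(
        replicas,
        key=lambda r: tuple(not matches(r, rule, local_base) for rule in rules),
    )


replicas = [
    {"url": "http://10.0.0.2:8983/solr/c1_shard1_replica_n1", "type": "NRT"},
    {"url": "http://10.0.0.1:8983/solr/c1_shard1_replica_p2", "type": "PULL"},
    {"url": "http://10.0.0.3:8983/solr/c1_shard1_replica_p3", "type": "PULL"},
]
ordered = sort_replicas(
    replicas, "replica.type:PULL,replica.location:local", "http://10.0.0.1:8983"
)
```

With these inputs the local PULL replica is tried first, then the other PULL replica, then the NRT replica.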

I might open a JIRA ticket just for this specific use case, since the necessary additions to "StreamContext" (to support passing URL params) would need to be done as part of that work.

(In reply to Joel Bernstein's message of Fri, Oct 18, 2019 at 4:43 PM.)