Using Multiple collections with streaming expressions


uyilmaz
For example, take the streaming expression significantTerms:

https://lucene.apache.org/solr/guide/8_4/stream-source-reference.html#significantterms


significantTerms(collection1,
                 q="body:Solr",
                 field="author",
                 limit="50",
                 minDocFreq="10",
                 maxDocFreq=".20",
                 minTermLength="5")

Solr supports querying multiple collections at once, but I can’t figure out how to do that with streaming expressions.
When I try enclosing them in quotes, like this:

significantTerms("collection1, collection2",
                 q="body:Solr",
                 field="author",
                 limit="50",
                 minDocFreq="10",
                 maxDocFreq=".20",
                 minTermLength="5")

It gives the error: "EXCEPTION":"java.io.IOException: Slices not found for \" collection1, collection2\""
I think Solr treats the quotes as part of the collection name, so it can’t find slices for it.

When I just use it without quotes:
significantTerms(collection1, collection2,…
It gives the error: "EXCEPTION":"invalid expression significantTerms(collection1, collection2, …

I also tried single quotes and escaping the quotation marks, but nothing works.

Any ideas?

Best, ufuk

Re: Using Multiple collections with streaming expressions

Erick Erickson
You need to open multiple streams, one to each collection, then combine them. For instance,
open a significantTerms stream to collection1, another to collection2, and wrap both
in a merge stream.

Best,
Erick
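
One way to sketch the suggestion above: build one significantTerms expression per collection and wrap them all in merge. The Python below only assembles the expression text and does not contact Solr. The helper names are hypothetical, and the parameter values are copied from the original question. The on="score desc" argument is an assumption: merge expects its inputs to be sorted on the "on" field, and this sketch assumes each significantTerms stream emits tuples in descending score order, which should be checked against the Solr version in use.

```python
def significant_terms(collection, **params):
    """Return a significantTerms expression for a single collection.

    Hypothetical helper: values are wrapped in double quotes as-is, so they
    must not themselves contain double quotes.
    """
    args = [collection] + [f'{name}="{value}"' for name, value in params.items()]
    return "significantTerms(" + ", ".join(args) + ")"


def merged_significant_terms(collections, on="score desc", **params):
    """Wrap one significantTerms stream per collection in a merge stream.

    ASSUMPTION: every inner stream is already sorted by the `on` field.
    """
    if len(collections) < 2:
        raise ValueError("merge needs at least two streams")
    streams = [significant_terms(c, **params) for c in collections]
    return "merge(" + ", ".join(streams) + f', on="{on}")'


expr = merged_significant_terms(
    ["collection1", "collection2"],
    q="body:Solr",
    field="author",
    limit="50",
    minDocFreq="10",
    maxDocFreq=".20",
    minTermLength="5",
)
print(expr)
```

The resulting string can then be sent as the expr parameter of a request to a collection's /stream handler.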

> On Nov 9, 2020, at 1:58 PM, ufuk yılmaz <[hidden email]> wrote:
> [...]

RE: Using Multiple collections with streaming expressions

uyilmaz
Thanks again Erick, that’s a good idea!

Alternatively, I use an alias covering multiple collections in these situations, but there may be too many combinations of collections, so it’s not always suitable.
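
For reference, an alias like the one described above is created with the Collections API CREATEALIAS action. The following is a minimal sketch that only builds the request URL and does not send it. The base URL, the alias name authors_all, and the helper name are hypothetical.

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collections):
    """Build a Collections API CREATEALIAS URL for the given collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        # The API takes the member collections as one comma-separated value.
        "collections": ",".join(collections),
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical host and alias name, for illustration only.
url = create_alias_url(
    "http://localhost:8983/solr",
    "authors_all",
    ["collection1", "collection2"],
)
print(url)
```

Once the alias exists, its name goes where the collection name would, for example significantTerms(authors_all, q="body:Solr", ...). Each distinct combination of collections needs its own alias, which is the drawback noted above.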

I think merged significantTerms streams will have meaningless scores in their tuples, since it would be comparing apples and oranges. In this case I’m only interested in the foreground counts, though, so that’s a problem for another day.

What seemed strange to me was that the source code for the streams appeared to handle this case.


From: Erick Erickson
Sent: 10 November 2020 16:48
To: [hidden email]
Subject: Re: Using Multiple collections with streaming expressions


Re: Using Multiple collections with streaming expressions

Joel Bernstein
The multiple collection syntax has been implemented for only a few stream
sources: search, timeseries, facet and stats. Eventually it will be
implemented for all stream sources.


Joel Bernstein
http://joelsolr.blogspot.com/
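
The reply above can be turned into a small guard for code that generates expressions. This is only a sketch: the helper name is hypothetical, and the quoted, comma-separated form of the collection argument is an assumption carried over from the attempt in the original question, not something this thread confirms. Check the reference guide for the Solr version in use.

```python
# Stream sources that, per the reply above, accept several collections at once.
MULTI_COLLECTION_SOURCES = {"search", "timeseries", "facet", "stats"}


def collection_argument(source, collections):
    """Return the first argument to pass to a stream source.

    Raises ValueError when several collections are requested for a source
    that is not in the list above; such sources need one stream per
    collection, combined with a decorator such as merge.
    """
    if len(collections) == 1:
        return collections[0]
    if source not in MULTI_COLLECTION_SOURCES:
        raise ValueError(f"{source} needs one stream per collection")
    # ASSUMPTION: multiple collections are given as one quoted,
    # comma-separated value.
    return '"' + ",".join(collections) + '"'


print(collection_argument("search", ["collection1", "collection2"]))
```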


On Tue, Nov 10, 2020 at 12:32 PM ufuk yılmaz <[hidden email]>
wrote:

> [...]

RE: Using Multiple collections with streaming expressions

uyilmaz
Many thanks for the info, Joel.

--ufuk


From: Joel Bernstein
Sent: 12 November 2020 17:00
To: [hidden email]
Subject: Re: Using Multiple collections with streaming expressions
