Strange fetch streaming expression doesn't fetch fields sometimes?


uyilmaz

Hi all,

I have a streaming expression looking like:

fetch(
  myAlias,
  top(
    n=3,
    ....various expressions here
    sort="count(*) desc"
  ),
  fl="username", on="userid=userid", batchSize=3
)

which fails to fetch the username field for the first result:

{
 "result-set":{
  "docs":[{
    "userid":"123123",
    "count(*)":58}
   ,{
    "userid":"123123123",
    "count(*)":32,
    "username":"Ayha"}
   ,{
    "userid":"12432423321323",
    "count(*)":30,
    "username":"MEHM"}
   ,{
    "EOF":true,
    "RESPONSE_TIME":34889}]}}
       
Strangely, when I change both n and batchSize to 2 and touch nothing else, fetch returns the first username correctly:

fetch(
  myAlias,
  top(
    n=2,
    ....various expressions here
    sort="count(*) desc"
  ),
  fl="username", on="userid=userid", batchSize=2
)

Result is:

{
 "result-set":{
  "docs":[{
    "userid":"123123",
    "count(*)":58,
    "username":"mura"}
   ,{
    "userid":"123123123",
    "count(*)":32,
    "username":"Ayha"}
   ,{
    "EOF":true,
    "RESPONSE_TIME":34889}]}}
       
What could be the problem?

Regards

~~ufuk

--
uyilmaz <[hidden email]>

Re: Strange fetch streaming expression doesn't fetch fields sometimes?

uyilmaz
I think I found the reason right after asking (facepalm), but it took me days to realize this.

I think fetch performs a naive "in" query, something like:

q="userid:(123123 123123123 12432423321323)&rows={batchSize}"

When the userid-to-document relation is one-to-many, the query above can return rows that consist entirely of documents for the last two userids. The first userid is then left out of the batch, and its username comes back empty. The docs state that one-to-many is not supported with fetch, but I hadn't stumbled onto this issue until recently, so I just assumed it would work.
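
To illustrate the guess above, here is a minimal sketch in plain Python. It is a toy model, not Solr's actual fetch implementation: fetch_batch, join_usernames and the sample index are all hypothetical, it handles a single batch only, and real result order would depend on Solr rather than on list order.

```python
# Toy model of the suspected behaviour; not Solr code.
# All names and the sample index are hypothetical.

def fetch_batch(index, userids, rows):
    """Mimic one "in" query: matching docs, capped at `rows` documents."""
    wanted = set(userids)
    matches = [doc for doc in index if doc["userid"] in wanted]
    return matches[:rows]  # the cap counts documents, not distinct userids


def join_usernames(tuples, index, batch_size):
    """Attach username to each tuple using a single batched lookup."""
    hits = fetch_batch(index, [t["userid"] for t in tuples], batch_size)
    names = {}
    for doc in hits:
        names.setdefault(doc["userid"], doc["username"])
    joined = []
    for t in tuples:
        row = dict(t)
        if t["userid"] in names:
            row["username"] = names[t["userid"]]
        joined.append(row)
    return joined


# One-to-many: userid 12432423321323 has two documents.
index = [
    {"userid": "123123123", "username": "Ayha"},
    {"userid": "12432423321323", "username": "MEHM"},
    {"userid": "12432423321323", "username": "MEHM"},
    {"userid": "123123", "username": "mura"},
]
top3 = [
    {"userid": "123123", "count(*)": 58},
    {"userid": "123123123", "count(*)": 32},
    {"userid": "12432423321323", "count(*)": 30},
]

print(join_usernames(top3, index, batch_size=3))
print(join_usernames(top3[:2], index, batch_size=2))
```

With three keys and a cap of three rows, the duplicate document uses up a row and userid 123123 gets no username. With two keys and a cap of two, both userids fit, which mirrors the two results in the original question.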

Sorry to take your time, I hope this helps somebody later.

Have a nice day.

On Wed, 14 Oct 2020 00:38:05 +0300
uyilmaz <[hidden email]> wrote:


--
uyilmaz <[hidden email]>

Re: Strange fetch streaming expression doesn't fetch fields sometimes?

Joel Bernstein
Yes, the docs mention one-to-one and many-to-one fetches, but one-to-many
is not supported currently. I've never really been happy with fetch. It
really needs to be replaced with a standard nested loop join that handles
all scenarios.


Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Oct 13, 2020 at 6:30 PM uyilmaz <[hidden email]> wrote:


Re: Strange fetch streaming expression doesn't fetch fields sometimes?

uyilmaz
Is it possible to duplicate its functionality using existing expressions?

In SQL, when grouping, you can just say first(column) to get one of the one-to-many values if you don't care which one you get. Solr usually only has aggregation functions such as min, max and avg. If it had a "first" function, I could just get userid and first(username) in one expression. I sometimes use min(username) as a trick while faceting to get extra fields alongside the faceted results, but max and min only accept numbers in streaming expressions.
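
To make the "first" semantics concrete, here is a small sketch in plain Python. It is not a Solr streaming expression and says nothing about whether Solr can do this; first_per_key, attach_first and the sample data are hypothetical names used only for illustration.

```python
# Plain-Python sketch of "first(username) per userid"; not a Solr expression.
# Function names and sample data are hypothetical.

def first_per_key(docs, key, field):
    """Keep the first value of `field` seen for each distinct `key`."""
    first = {}
    for doc in docs:
        if key in doc and field in doc:
            first.setdefault(doc[key], doc[field])
    return first


def attach_first(tuples, docs, key, field):
    """Add `field` to each tuple, taken from the first matching document."""
    lookup = first_per_key(docs, key, field)
    joined = []
    for t in tuples:
        row = dict(t)
        if t[key] in lookup:
            row[field] = lookup[t[key]]
        joined.append(row)
    return joined


docs = [
    {"userid": "123123123", "username": "Ayha"},
    {"userid": "123123123", "username": "Ayha"},
    {"userid": "123123", "username": "mura"},
]
counts = [
    {"userid": "123123", "count(*)": 58},
    {"userid": "123123123", "count(*)": 32},
]

print(attach_first(counts, docs, key="userid", field="username"))
```

Because the documents are collapsed to one value per userid before the lookup, duplicate documents cannot crowd out another key.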

On Wed, 14 Oct 2020 20:47:28 -0400
Joel Bernstein <[hidden email]> wrote:



--
uyilmaz <[hidden email]>