Filter.getDocIdSet() returning null, and what this means for CachingWrapperFilter

Daniel Noll-3-2
Hi all.

We are seeing an exception like this:

java.lang.NullPointerException
    at org.apache.lucene.search.CachingWrapperFilter.docIdSetToCache(CachingWrapperFilter.java:84)
    at org.apache.lucene.search.CachingWrapperFilter.getDocIdSet(CachingWrapperFilter.java:112)
    at com.nuix.storage.search.LazyConstantScoreQuery$LazyFilterWrapper.getDocIdSet(SourceFile:91)
    at org.apache.lucene.search.ConstantScoreQuery$ConstantScorer.<init>(ConstantScoreQuery.java:116)
    at org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:81)
    at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:297)
    at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:297)
    at org.apache.lucene.search.QueryWrapperFilter$2.iterator(QueryWrapperFilter.java:75)

Our own class in that trace (LazyFilterWrapper) is just an intermediary
that delays creating the Filter object:

        @Override
        public DocIdSet getDocIdSet(IndexReader reader) throws IOException {
            if (delegate == null) {
                delegate = factory.createFilter();
            }
            return delegate.getDocIdSet(reader);
        }

Tracing through the code in CachingWrapperFilter, I can see that this
NPE would occur if getDocIdSet() were to return null.

The Javadoc on Filter says that null will be returned if no documents
will be accepted by the filter, but it doesn't seem that Lucene itself
is handling null return values correctly, so which is correct?  The
code or the Javadoc?  Supposing that null really is OK, does this
cause any problems with how CachingWrapperFilter is implementing the
caching?  I notice it's calling get() and then comparing against null
so it wouldn't appear that it can distinguish "the entry isn't in the
cache" from "the entry is in the cache but it's null".

Daniel



--
Daniel Noll                            Forensic and eDiscovery Software
Senior Developer                              The world's most advanced
Nuix                                                email data analysis
http://nuix.com/                                and eDiscovery software

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

RE: Filter.getDocIdSet() returning null, and what this means for CachingWrapperFilter

Uwe Schindler
Can you open an issue? null should be handled like an empty DocIdSet. This seems to be a bug in CachingWrapperFilter.

To work around this, don't return null; return the constant DocIdSet.EMPTY_DOCIDSET instead. This is the preferable solution, and we may change this in Lucene 4.0 so that null is no longer allowed as a return value.
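A rough sketch of that workaround as a wrapper, assuming simplified stand-in types so it compiles without Lucene on the classpath: DocSet, SimpleFilter and NullSafeFilter are hypothetical names, with DocSet.EMPTY playing the role of DocIdSet.EMPTY_DOCIDSET and a String standing in for the IndexReader. With the real classes the same null check would go in your own getDocIdSet(IndexReader) override.

```java
import java.util.Collections;
import java.util.Iterator;

// Hypothetical stand-in for Lucene's DocIdSet, reduced to iteration only.
interface DocSet extends Iterable<Integer> {
    // Plays the role of DocIdSet.EMPTY_DOCIDSET in this sketch.
    DocSet EMPTY = new DocSet() {
        @Override
        public Iterator<Integer> iterator() {
            return Collections.emptyIterator();
        }
    };
}

// Hypothetical stand-in for Lucene's Filter; the String stands in for an IndexReader.
interface SimpleFilter {
    DocSet getDocSet(String reader);
}

// Wrapper that converts a null result from the delegate into the empty constant,
// so callers never have to handle null.
class NullSafeFilter implements SimpleFilter {
    private final SimpleFilter delegate;

    NullSafeFilter(SimpleFilter delegate) {
        this.delegate = delegate;
    }

    @Override
    public DocSet getDocSet(String reader) {
        DocSet result = delegate.getDocSet(reader);
        return result != null ? result : DocSet.EMPTY;
    }
}

class NullSafeFilterDemo {
    public static void main(String[] args) {
        // A delegate that signals "no documents" by returning null.
        SimpleFilter returnsNull = reader -> null;
        DocSet set = new NullSafeFilter(returnsNull).getDocSet("segment-1");
        System.out.println("has documents: " + set.iterator().hasNext());
    }
}
```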

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [hidden email]

> -----Original Message-----
> From: Daniel Noll [mailto:[hidden email]]
> Sent: Wednesday, May 26, 2010 8:57 AM
> To: Lucene Java Users Mailing List
> Subject: Filter.getDocIdSet() returning null, and what this means for
> CachingWrapperFilter




RE: Filter.getDocIdSet() returning null, and what this means for CachingWrapperFilter

Uwe Schindler
I opened https://issues.apache.org/jira/browse/LUCENE-2478 and will fix soon!


> -----Original Message-----
> From: Uwe Schindler [mailto:[hidden email]]
> Sent: Wednesday, May 26, 2010 9:41 AM
> To: [hidden email]
> Subject: RE: Filter.getDocIdSet() returning null, and what this means for
> CachingWrapperFilter




Re: Filter.getDocIdSet() returning null, and what this means for CachingWrapperFilter

Daniel Noll-3-2
On Wed, May 26, 2010 at 23:30, Uwe Schindler <[hidden email]> wrote:
> I opened https://issues.apache.org/jira/browse/LUCENE-2478 and will fix soon!

It does sound like a good idea to not permit a null doc ID set, since
there is a convenient constant for the empty result anyway. :-)

And actually, it turns out that we were returning null due to a
legitimate problem, which makes it an even better idea to prohibit
returning null.  I already changed all the other places where we were
"conforming to the API" ( ;-) ) to return EMPTY_DOCIDSET anyway.

This leaves me with an even bigger mystery though, which will probably
result in another post sooner or later.

Daniel

