How to fetch documents for which field is not defined

Rajnish kamboj
Hi
Does Lucene provide any API to fetch documents for which a field is not
defined?

Example:
Document1 : field1=value1, field2=value2, field3=value3

Document2 : field1=value4, field2=value4

I want a query that returns the documents for which field3 is not defined.
In the example, it should return Document2.

Regards
Rajnish

Re: How to fetch documents for which field is not defined

Ahmet Arslan
Hi,
Yes, here it is:  q=+*:* -field3:[* TO *]
Ahmet
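
The query above is written as a Solr request parameter. When it is sent over HTTP, the leading "+" has to be percent-encoded, because an unencoded "+" in a URL query string is decoded as a space. A minimal sketch of building such a request URL follows; the host, port, and core name ("mycore") are placeholders, and the helper names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical Solr endpoint: host, port, and core name are placeholders.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def missing_field_query(field: str) -> str:
    """Return a query matching documents that have no value in `field`."""
    # "+*:*" selects every document; "-field:[* TO *]" then excludes
    # the documents that have any term in the field.
    return f"+*:* -{field}:[* TO *]"


def select_url(query: str, rows: int = 10) -> str:
    """Build a request URL with the query percent-encoded."""
    return SOLR_SELECT_URL + "?" + urlencode({"q": query, "rows": rows})


url = select_url(missing_field_query("field3"))
print(url)

# Round-trip check: decoding the URL gives back the original query string.
decoded = parse_qs(urlsplit(url).query)["q"][0]
print(decoded)
```

The match-all clause is what the exclusion is subtracted from, which is why the query starts with "+*:*".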

Re: How to fetch documents for which field is not defined

Rajnish kamboj
Ok, I will check.


Re: How to fetch documents for which field is not defined

Ahmet Arslan
Hi,
As an alternative, function queries can also be used. The exists function may be more intuitive:
q={!func}not(exists(field3))

Re: How to fetch documents for which field is not defined

Rajnish kamboj
Thanks.
Which Lucene version supports this, and how do such queries perform on a
large set of documents?




Re: How to fetch documents for which field is not defined

Uwe Schindler
In reply to this post by Ahmet Arslan
That is the "Solr" answer, but it is slow as hell.

In Lucene there is already a native query for this, named FieldValueQuery. It requires DocValues to be enabled for the field.

IMHO, the best and fastest variant (also for Solr users) is to add a separate multivalued string field named 'fieldnames', in which you index the names of all fields that have a value. After that, you can query this field by field name. Elasticsearch uses the field name approach for exists/not exists by default.

Uwe
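
The 'fieldnames' approach described above can be illustrated with a toy in-memory inverted index. This is only a sketch of the idea, not Lucene's API: the TinyIndex class and its methods are hypothetical, and a real index would use a term query on the 'fieldnames' field instead of a dictionary lookup.

```python
from collections import defaultdict

FIELDNAMES = "fieldnames"  # field name suggested in the post above


class TinyIndex:
    """Toy inverted index mapping (field, term) to a set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.doc_ids = set()

    def add(self, doc_id, fields):
        self.doc_ids.add(doc_id)
        for name, value in fields.items():
            if value is None or value == "":
                continue  # no value: index nothing and do not record the name
            self.postings[(name, value)].add(doc_id)
            # Record the field's own name as a term in the extra field.
            self.postings[(FIELDNAMES, name)].add(doc_id)

    def with_field(self, name):
        """Documents that have a value for `name`: one term lookup."""
        return set(self.postings.get((FIELDNAMES, name), set()))

    def without_field(self, name):
        """Documents lacking `name`: all documents minus the term lookup."""
        return self.doc_ids - self.with_field(name)


index = TinyIndex()
index.add("Document1", {"field1": "value1", "field2": "value2", "field3": "value3"})
index.add("Document2", {"field1": "value4", "field2": "value4"})
print(index.without_field("field3"))  # {'Document2'}
```

The existence check becomes a single term lookup, so its cost does not depend on how many distinct values field3 holds.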

--
Uwe Schindler
Achterdiek 19, 28357 Bremen
https://www.thetaphi.de

Re: How to fetch documents for which field is not defined

Trejkaz
On Sat, Jul 15, 2017 at 8:12 PM, Uwe Schindler <[hidden email]> wrote:
> That is the "Solr" answer, but it is slow as hell.
>
> In Lucene there is already a native query for this, named FieldValueQuery.
> It requires DocValues to be enabled for the field.
>
> IMHO, the best and fastest variant (also for Solr users) is to add a separate
> multivalued string field named 'fieldnames', in which you index the names of
> all fields that have a value. After that, you can query this field by field
> name. Elasticsearch uses the field name approach for exists/not exists by
> default.

The catch is, you usually have to analyse a field to determine whether
it has a value. Apparently Elasticsearch's field existence query does
not do this, so it considers blank text to be a value, which is not
the same as what the user expected when they did the query.

We *were* using FieldValueQuery, but since moving to Lucene 6 we have
stopped using UninvertingReader, so that option doesn't cover all
fields, and fields like "content" aren't really practical to put in
DocValues...

The approach to add a fieldnames field works, but is fiddly at
indexing-time, because now you have to use TokenStream for all fields,
so that you can read one token from each field to test whether there
is one before you add the whole document. I guess it's at least easier
to understand how it works at query-time.

TX
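
The indexing-time check described above (read one token from each field before recording its name) might look roughly like this. It is a sketch under assumptions: the regular expression stands in for a real analyzer, which may drop more than whitespace and punctuation, and the function and field names are hypothetical.

```python
import re

# Stand-in for an analyzer: a token is any run of word characters.
# A real Lucene Analyzer may behave differently (stop words, for example).
TOKEN_PATTERN = re.compile(r"\w+")


def has_token(text):
    """Return True if analysing `text` would yield at least one token."""
    # search() stops at the first match, mirroring "read one token".
    return TOKEN_PATTERN.search(text or "") is not None


def fieldnames_for(fields):
    """Names of the fields whose text produces at least one token."""
    return sorted(name for name, text in fields.items() if has_token(text))


doc = {"title": "Quarterly report", "content": "   ", "notes": "", "tags": "..."}
print(fieldnames_for(doc))  # ['title']
```

Blank or punctuation-only text produces no token, so those fields are left out of the field name list and count as having no value.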
