how to query against payload

Fang_Li
Hi,

I want to use payloads to store an object id, which is an arbitrary byte
array, for better performance. But I also need some way to search against
the payload value.

Also, once the hits are available, how can I get the payload of a specific
term from a document without making the field stored? The only interface
I have found so far is IndexReader.termPositions(Term), which looks like
it means searching again.

I've seen that there will be per-document payloads. When will they be ready?

Thanks,

Fang, Li
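
To make the second question concrete: reading one term's payload for a known hit amounts to seeking inside that term's postings to the hit's document id, rather than running a full search again. The following is only a plain-Java model of that idea, not Lucene code; the class PayloadLookup and its fields are hypothetical, and it assumes one payload per document for the term.

```java
import java.util.Arrays;

// Toy model of one term's postings, held as parallel arrays sorted by doc id.
// Hypothetical names; this does not use or mirror the Lucene API.
final class PayloadLookup {
    private final int[] docIds;      // ascending ids of documents containing the term
    private final byte[][] payloads; // payloads[i] belongs to docIds[i]

    PayloadLookup(int[] docIds, byte[][] payloads) {
        if (docIds.length != payloads.length) {
            throw new IllegalArgumentException("one payload per document expected");
        }
        this.docIds = docIds.clone();
        this.payloads = payloads.clone();
    }

    /** Returns the payload stored for the term in docId, or null if the term is absent there. */
    byte[] payloadFor(int docId) {
        int i = Arrays.binarySearch(docIds, docId); // seek instead of scanning every posting
        return i >= 0 ? payloads[i].clone() : null;
    }

    public static void main(String[] args) {
        PayloadLookup postings = new PayloadLookup(
                new int[] {2, 5, 9},
                new byte[][] {{0x0A}, {0x0B, 0x0C}, {0x0D}});
        System.out.println(Arrays.toString(postings.payloadFor(5))); // [11, 12]
        System.out.println(Arrays.toString(postings.payloadFor(7))); // null
    }
}
```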

 

Re: how to query against payload

Grant Ingersoll-2

On Apr 21, 2008, at 5:34 AM, [hidden email] wrote:

> Hi,
>
> I want to use payload to store some kind of object id
> which is an arbitrary byte array for better performance. But I do need
> some kind of function like searching against payload value.

Have a look at the BoostingTermQuery.  If you need more than that, you  
could create some new queries using that as a model.

> Also when the hits are available, how to get the payload of a specific
> term from a document without set the field as stored? Currently I found
> the only available interface is IndexReader.termPosition(new Term()).
> Looks we need to search again.

See https://issues.apache.org/jira/browse/LUCENE-1001 for that. Note,
however, that the patch there is not going to work. If you can help out
on it, that would be great.


>
>
> I've seen there will be per document payload. When will it be ready?

--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ

RE: how to query against payload

Fang_Li
Hi Grant,

Thanks for your help.

BoostingTermQuery uses reader.termPositions(term) to get the term
positions, and the Term itself cannot carry a payload value to select the
matching documents. What I want is:

Find all documents that have a specific payload value on a specific term.
We do not care about the value of the term.

The reason is that we don't want to store the binary information as a
value, so that we can probably speed up queries by using payloads. I am
not sure whether this is a good reason to do it this way.

Thanks,

Li


Re: how to query against payload

Grant Ingersoll-2
Hmmm, sounds like you need a new Query. I _think_ it could be
something as simple as a MultiplicativeTermQuery or something like that,
whereby instead of adding the score of the payload callback, you would
multiply. That way, if the document with the term does not have the
payload of interest, you multiply by 0; or you wait until you have seen
all payloads for that document and, if any of them is the right one,
return a score of 1, otherwise return 0. I think this would be
relatively easy to do using the BoostingTermQuery as a template.
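
A rough sketch of the multiply-by-0-or-1 idea, in plain Java rather than against the Lucene API: the postings are modeled as an in-memory map from document id to the payloads seen at the term's positions, and every name here (PayloadMatchSketch, payloadFactor, score) is made up for illustration. A real query would do the equivalent inside its scorer.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only; none of these names exist in Lucene.
final class PayloadMatchSketch {

    /** 1.0 if any occurrence of the term in the document carries the wanted payload, else 0.0. */
    static float payloadFactor(List<byte[]> payloadsInDoc, byte[] wanted) {
        for (byte[] payload : payloadsInDoc) {
            if (Arrays.equals(payload, wanted)) {
                return 1.0f;
            }
        }
        return 0.0f;
    }

    /** Multiplies the term score by the payload factor and keeps only non-zero results. */
    static Map<Integer, Float> score(Map<Integer, List<byte[]>> postings,
                                     float termScore, byte[] wanted) {
        Map<Integer, Float> scores = new LinkedHashMap<>();
        for (Map.Entry<Integer, List<byte[]>> entry : postings.entrySet()) {
            float s = termScore * payloadFactor(entry.getValue(), wanted);
            if (s > 0f) {
                scores.put(entry.getKey(), s);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        Map<Integer, List<byte[]>> postings = new LinkedHashMap<>();
        postings.put(0, Arrays.asList(new byte[] {1}, new byte[] {2}));
        postings.put(1, Collections.singletonList(new byte[] {3}));
        postings.put(2, Collections.singletonList(new byte[] {2}));
        // Documents 0 and 2 have an occurrence with payload {2}; document 1 is multiplied by 0.
        System.out.println(score(postings, 0.5f, new byte[] {2})); // {0=0.5, 2=0.5}
    }
}
```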

Other thought:  Could you put the "payload" as a regular term at the  
same position and do a no-slop phrase query?  Or a slightly modified  
phrase query that requires both terms to be at the same position?
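
To illustrate the same-position idea, here is a small plain-Java model of a positional index; it is a sketch under assumptions, not Lucene code, and the class and method names are invented. The id is indexed as an ordinary term at the same position as the term it annotates (in Lucene that would mean a token with a position increment of 0), and a match requires both terms to share a position in the document.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative in-memory positional index; hypothetical names throughout.
final class SamePositionSketch {
    // term -> (document id -> positions of the term in that document)
    private final Map<String, Map<Integer, Set<Integer>>> index = new HashMap<>();

    void add(String term, int docId, int position) {
        index.computeIfAbsent(term, t -> new TreeMap<>())
             .computeIfAbsent(docId, d -> new TreeSet<>())
             .add(position);
    }

    /** Documents in which both terms occur at one and the same position, in ascending id order. */
    List<Integer> docsWithBothAtSamePosition(String a, String b) {
        Map<Integer, Set<Integer>> postingsA = index.getOrDefault(a, Collections.emptyMap());
        Map<Integer, Set<Integer>> postingsB = index.getOrDefault(b, Collections.emptyMap());
        List<Integer> hits = new ArrayList<>();
        for (Map.Entry<Integer, Set<Integer>> entry : postingsA.entrySet()) {
            Set<Integer> positionsB = postingsB.get(entry.getKey());
            if (positionsB != null && !Collections.disjoint(entry.getValue(), positionsB)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        SamePositionSketch idx = new SamePositionSketch();
        idx.add("color", 1, 0);
        idx.add("id:42", 1, 0); // stacked on the same position as "color"
        idx.add("color", 2, 0);
        idx.add("id:42", 2, 3); // same document, different position
        System.out.println(idx.docsWithBothAtSamePosition("color", "id:42")); // [1]
    }
}
```

The trade-off is that the id becomes part of the term dictionary, which is what the payload approach was trying to avoid, so whether this is faster would need measuring.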

Just thinking out loud,
Grant
