ways to check if document is in a huge search result set


Derek Poh
Hi

I have a collection of product documents.
Each product document has supplier information in it.

I need to check whether a supplier's products are returned in a search
result containing over 100,000 products, and on which page (assuming
pagination of 20 products per page).
It is time-consuming and "labour-intensive" to go through each page to
look for the supplier's products.

Would like to know if you guys have any better and easier ways to do this?

Derek

----------------------
CONFIDENTIALITY NOTICE

This e-mail (including any attachments) may contain confidential and/or privileged information. If you are not the intended recipient or have received this e-mail in error, please inform the sender immediately and delete this e-mail (including any attachments) from your computer, and you must not use, disclose to anyone else or copy this e-mail (including any attachments), whether in whole or in part.

This e-mail and any reply to it may be monitored for security, legal, regulatory compliance and/or other appropriate reasons.

Re: ways to check if document is in a huge search result set

Michael Kuhlmann-5
Maybe I don't understand your problem, but why don't you just filter by
"supplier information"?

-Michael

Re: ways to check if document is in a huge search result set

Mikhail Khludnev-2
In reply to this post by Derek Poh
You can request a facet field, a query facet, a filter, or even explainOther.
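As a rough illustration of those four options, here is a minimal sketch that builds the request parameters for a Solr /select call. The field name `supplier_id` and the helper `supplier_check_params` are assumptions made up for this example; the parameter names (`facet.field`, `facet.query`, `fq`, `debugQuery`, `explainOther`) are standard Solr query parameters.

```python
from urllib.parse import urlencode


def supplier_check_params(user_query, supplier_id, field="supplier_id"):
    """Build four alternative parameter sets for a Solr /select request.

    `field` is a hypothetical name for the field holding the supplier id.
    """
    clause = f"{field}:{supplier_id}"
    base = {"q": user_query, "rows": 0}  # rows=0: only counts are needed
    return {
        # Facet field: counts per value of the supplier field among the
        # documents matching q.
        "facet_field": {**base, "facet": "true", "facet.field": field},
        # Query facet: count of documents matching q that belong to this
        # one supplier.
        "facet_query": {**base, "facet": "true", "facet.query": clause},
        # Filter: restrict the result to this supplier; numFound then
        # tells how many of its documents match q.
        "filter": {**base, "fq": clause},
        # explainOther: with debugQuery on, Solr adds score explanations
        # for the documents matching `clause`, relative to q.
        "explain_other": {
            "q": user_query,
            "rows": 10,
            "debugQuery": "true",
            "explainOther": clause,
        },
    }


if __name__ == "__main__":
    for name, params in supplier_check_params("laptop", "S123").items():
        print(name, "->", urlencode(params))
```

Each parameter set answers "is the supplier in the result?", but none of them reports the position of the documents in the unfiltered result.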

--
Sincerely yours
Mikhail Khludnev

Re: ways to check if document is in a huge search result set

Derek Poh
Some additional information.

A user reported that a supplier's product(s) are not in the search
result.
I debugged by adding an fq on the supplier id to the query to verify that
the supplier's products are in the search result. The products do exist in
the search result.
I want to tell the user on which page of the search result the supplier's
products appear. To do this I go through each page of the search
result to find the supplier's products.
That is fine if the search result has a few hundred products, but it
becomes a chore if the result has thousands. In this case there are
more than 100,000 products in the result.

Any advice on easier ways to check on which page of a search result the
supplier's product or document appears?

Re: ways to check if document is in a huge search result set

Michael Kuhlmann-5
So you're looking for a way to validate the result output.

You have two options:
1. Assuming you're sorting by the default "score" sort option:
Find the document you're looking for by setting the fq filter clause
accordingly, and add "score" to the fl field list.
Then run the normal unfiltered search, still including "score", and start
somewhere in the middle of the result set, let's say at position 50,000.
Then continue using binary search depending on the returned score values.

2. Set fl to return only the supplier id; then you'll probably be able
to return tens of thousands of results at once.
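Option 2 could look roughly like the sketch below. It assumes a hypothetical callback `fetch(start, rows)` that returns the supplier ids of one batch of the unfiltered result in result order (for instance by sending `fl=supplier_id` with the given `start` and `rows`); the field name, the batch size, and the helper name are illustrative only. The page number follows from the 0-based position as `position // page_size + 1`.

```python
def find_supplier_pages(fetch, supplier_id, total, batch=10000, page_size=20):
    """Scan the result in large batches and report where a supplier appears.

    fetch(start, rows) is a hypothetical callback returning the supplier ids
    of the documents at positions start .. start+rows-1, in result order.
    Returns a list of (0-based position, 1-based page number) tuples.
    """
    hits = []
    for start in range(0, total, batch):
        ids = fetch(start, batch)
        for offset, sid in enumerate(ids):
            if sid == supplier_id:
                position = start + offset
                hits.append((position, position // page_size + 1))
        if len(ids) < batch:
            break  # fewer results than requested: end of the result set
    return hits


if __name__ == "__main__":
    # Fake result set standing in for the real search response.
    data = ["S1"] * 100000
    data[45] = "S123"
    data[61234] = "S123"
    print(find_supplier_pages(lambda s, r: data[s:s + r], "S123", len(data)))
```

With 100,000 results and a batch size of 10,000 this needs at most ten requests instead of 5,000 page views.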


But be warned: the result position of these documents can vary with every
single commit, especially when there are lots of documents with the same
score value.

-Michael


Re: ways to check if document is in a huge search result set

Derek Poh
Hi Michael

"Then continue using binary search depending on the returned score values."

May I know what you mean by using binary search?

Re: ways to check if document is in a huge search result set

Michael Kuhlmann-5
On 13.09.2017 at 04:04, Derek Poh wrote:
> Hi Michael
>
> "Then continue using binary search depending on the returned score
> values."
>
> May I know what do you mean by using binary search?

An example implementation is the Java method java.util.Arrays.binarySearch.

Or, in more detail: https://en.wikipedia.org/wiki/Binary_search_algorithm
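
Applied to the score-based approach from earlier in the thread, a minimal sketch might look like this. It assumes the result is sorted by score descending, that `score_at(position)` is a hypothetical callback returning the score of the document at a 0-based position in the unfiltered result (for example one request with `start=position`, `rows=1`, `fl=score`), and that `target` is the score taken from the fq-filtered query. Since many documents can share a score, the sketch returns a range of pages rather than a single page.

```python
def first_position(score_at, total, pred):
    """Smallest position in [0, total] whose score satisfies pred.

    Requires pred to be False for a prefix of the descending score list
    and True for the rest, which holds for the comparisons used below.
    """
    lo, hi = 0, total
    while lo < hi:
        mid = (lo + hi) // 2
        if pred(score_at(mid)):
            hi = mid        # answer is at mid or before it
        else:
            lo = mid + 1    # answer is after mid
    return lo


def locate_by_score(score_at, total, target, page_size=20):
    """Return (first_page, last_page) of the run of documents whose score
    equals target, or None if no document has that score."""
    start = first_position(score_at, total, lambda s: s <= target)
    end = first_position(score_at, total, lambda s: s < target)
    if start == end:
        return None
    return (start // page_size + 1, (end - 1) // page_size + 1)


if __name__ == "__main__":
    # Fake descending scores; every score value is shared by 30 documents.
    scores = [1000 - i // 30 for i in range(5000)]
    print(locate_by_score(lambda pos: scores[pos], len(scores), 959))
```

Each of the two searches halves the remaining range per request, so roughly 2 * log2(total) requests are needed; the pages in the returned range still have to be checked by hand for the supplier's document.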

Best,
Michael


Re: ways to check if document is in a huge search result set

Derek Poh
I see. Thank you.
