Suggestion Needed: Exclude documents that are already served / viewed by a customer


Doss
Dear Experts,

For a matchmaking portal, we have a requirement wherein, if a customer has
viewed the complete details of a bride or groom, we have to exclude that
profile id from further search results. Currently, along with other details,
we store the ids of the members who viewed a profile in a multivalued field
on that bride's or groom's document.

E.g., if A viewed B, then in B's document we add A's id to the field saw_me.

While searching, let's say the id of the member currently searching is
123456; we then fire a query like

fq=-saw_me:(123456)

Problem #1: The saw_me field keeps growing without any bound.
Problem #2: Removing ids that have been deleted from the base. Right now we
do this job as follows:
           Query #1: fq=saw_me:(123456)&fl=DocId // get the ids of all
documents that have the deleted id in their saw_me field
           Query #2: {"DocId":"234567","saw_me":{"remove":"123456"}} // loop
through the results of the first query and fire this update query for each
document, one by one

We feel that this way of handling it is not optimal, so we need expert
advice. Please guide.
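
The two requests described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the poster's actual code: the helper names `exclusion_params` and `removal_payload` are hypothetical, `DocId` is assumed to be the collection's uniqueKey field, and nothing is sent to a server. The one change from the description is that the per-document `remove` updates are collected into a single JSON array, which Solr's JSON update handler accepts, instead of being fired one by one.

```python
import json
from urllib.parse import urlencode


def exclusion_params(viewer_id, user_query="*:*"):
    """Build URL-encoded search parameters that hide profiles
    already viewed by viewer_id (hypothetical helper)."""
    return urlencode({"q": user_query, "fq": f"-saw_me:({viewer_id})"})


def removal_payload(doc_ids, deleted_id):
    """Build one JSON body with an atomic 'remove' update per document.
    Assumes DocId is the uniqueKey of the collection."""
    updates = [
        {"DocId": doc_id, "saw_me": {"remove": deleted_id}}
        for doc_id in doc_ids
    ]
    return json.dumps(updates)


# Search as member 123456, excluding profiles that member has already seen.
print(exclusion_params("123456"))

# Remove the deleted member 123456 from two documents in a single request body.
print(removal_payload(["234567", "345678"], "123456"))
```

Batching cuts the number of update requests, but it does not address Problem #1: the saw_me field still grows with every view.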

Re: Suggestion Needed: Exclude documents that are already served / viewed by a customer

Jörn Franke
I am not 100% sure whether Solr has something out of the box, but you could implement a Bloom filter (https://en.wikipedia.org/wiki/Bloom_filter) and store it in Solr. It is a probabilistic data structure that does not grow, yet can cover your use case.
However, it has a caveat: in your case it can only say for sure that person A has NOT visited person B. If you want to know whether person A has visited person B, there may be false positives (with a known probability).

Nevertheless, it still seems to address your use case, as you only want to show profiles that have not been visited.
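
To make the caveat concrete, here is a minimal Bloom filter sketch in Python. It is an illustration only, not a Solr feature and not a production implementation: the class name `BloomFilter`, the bit-array size, and the number of hash functions are arbitrary assumptions, and how such a filter would be stored in or queried from Solr is left open.

```python
import hashlib


class BloomFilter:
    """Fixed-size probabilistic set: no false negatives, possible false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits      # the size never changes as ids are added
        self.num_hashes = num_hashes
        self.bits = 0                 # a Python int used as the bit array

    def _positions(self, item):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means "definitely not added"; True means "probably added".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


# One filter per profile, standing in for that profile's saw_me field.
saw_me = BloomFilter()
for viewer_id in ("123456", "234567"):
    saw_me.add(viewer_id)

print(saw_me.might_contain("123456"))  # True: added ids are always reported
```

In this use case a false positive would hide a profile that the member has not actually viewed, while a viewed profile is never shown again. Note also that a plain Bloom filter cannot remove an id once it has been added.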

> On 06.09.2019 at 07:43, Doss <[hidden email]> wrote:

Re: Suggestion Needed: Exclude documents that are already served / viewed by a customer

Doss
Jörn, thanks for the input, I learned something new today!
https://cwiki.apache.org/confluence/display/solr/BloomIndexComponent
This works at the segment level, but our requirement is at the document
level.

Thanks,
Mohandoss.

On Fri, Sep 6, 2019 at 11:41 AM Jörn Franke <[hidden email]> wrote:


Re: Suggestion Needed: Exclude documents that are already served / viewed by a customer

Doss
In reply to this post by Doss
Hi Experts,

We are migrating our entire search platform from Sphinx to Solr and want to
do this without any flaws, so any suggestions would be greatly appreciated.

Thanks!


On Fri, Sep 6, 2019 at 11:13 AM Doss <[hidden email]> wrote:
