LSH/MinHash

LSH/MinHash

Andy Hind-2
Hi All

Following on from https://issues.apache.org/jira/browse/LUCENE-6968 (I know it’s been a while…)
I have a QParser plugin that can generate the appropriate banded queries for Jaccard similarity.

It covers the same functionality that was proposed in the original issue but wrapped up as a query parser.
There are two analysis cases and two query cases: hashes can be generated by tokenisation or by pre-analysis, and queries can be based on text or on provided hash values.
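
For anyone unfamiliar with banding, here is a minimal sketch of the general idea in Python. It is a generic textbook scheme, not the plugin's code and not how the Lucene filter computes its hashes. All names (`shingles`, `min_hash_signature`, `to_bands`, `matches`, `candidate_probability`) and the parameter values are assumptions made for illustration only.

```python
import hashlib
from typing import List, Set


def shingles(text: str, size: int = 3) -> Set[str]:
    """Word shingles of `size` consecutive whitespace-separated tokens."""
    tokens = text.lower().split()
    if not tokens:
        return set()
    if len(tokens) < size:
        return {" ".join(tokens)}
    return {" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)}


def min_hash_signature(items: Set[str], num_hashes: int = 32) -> List[int]:
    """Keep the minimum value of each seeded hash function over all items."""
    if not items:
        return []
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "big")  # a different salt gives a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def to_bands(signature: List[int], rows_per_band: int) -> List[List[int]]:
    """Split a signature into consecutive bands of `rows_per_band` hashes."""
    return [signature[i:i + rows_per_band]
            for i in range(0, len(signature), rows_per_band)]


def matches(query_sig: List[int], doc_sig: List[int], rows_per_band: int) -> bool:
    """Banded match: every hash in at least one band must agree (AND within a band, OR across bands)."""
    query_bands = to_bands(query_sig, rows_per_band)
    doc_bands = to_bands(doc_sig, rows_per_band)
    return any(q == d for q, d in zip(query_bands, doc_bands))


def candidate_probability(similarity: float, bands: int, rows_per_band: int) -> float:
    """Chance that two sets with the given Jaccard similarity share at least one band."""
    return 1.0 - (1.0 - similarity ** rows_per_band) ** bands


if __name__ == "__main__":
    query = min_hash_signature(shingles("the quick brown fox jumps over the lazy dog"))
    doc = min_hash_signature(shingles("the quick brown fox jumps over the lazy dog"))
    print(matches(query, doc, rows_per_band=4))
    print(candidate_probability(0.8, bands=8, rows_per_band=4))
```

More rows per band makes each band harder to match and so raises precision. More bands gives more chances to match and so raises recall. Together the two settings decide where the similarity cut-off falls.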

If there is interest, I will create the issue and put up the patch.

Regards

Andy 


Re: LSH/MinHash

Tommaso Teofili
Hi Andy,

It would be very nice if you could do that and I'd be very interested
in reviewing and helping out with the patch.
I have been using that filter for a while with my own query bits; a
full-fledged query parser would surely be a very useful contribution.

Regards,
Tommaso
On Mon, 15 Oct 2018 at 22:38, Andy Hind
<[hidden email]> wrote:

>
> Hi All
>
> Following on from https://issues.apache.org/jira/browse/LUCENE-6968 (I know it’s been a while…)
> I have a QParser plugin that can generate the appropriate banded queries for Jaccard similarity.
>
> It covers the same functionality that was proposed in the original issue but wrapped up as a query parser.
> There are two analysis cases and two query cases: hashes can be generated by tokenisation or by pre-analysis, and queries can be based on text or on provided hash values.
>
> If there is interest, I will create the issue and put up the patch.
>
> Regards
>
> Andy
>
>
