SuRf FST implementation

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

SuRf FST implementation

John Wang-9
The SigMod paper describes a more compact FST implementation looks really interesting:


Was wondering if Lucene's FST implementation used by term dictionary can take advantage of this.

Thanks

-John
Reply | Threaded
Open this post in threaded view
|

Re: SuRf FST implementation

Michael McCandless-2
Thanks for sharing, John; this looks interesting!  Would be nice to build a codec using that reference impl (hmm, except it is C++)

FST in this context stands for "Fast Succinct Trie" not "Finite State Transducer" and it seems to be a data structure similar to (but claimed better than) bloom filters, in that it can efficiently evaluate set membership with a tunable chance for being wrong (false positive), but it'd be nice to see some numbers on a real Lucene index / use case.


On Sat, Jun 23, 2018 at 6:04 PM, John Wang <[hidden email]> wrote:
The SigMod paper describes a more compact FST implementation looks really interesting:


Was wondering if Lucene's FST implementation used by term dictionary can take advantage of this.

Thanks

-John