RE: calculate term co-occurrence matrix

Allison, Timothy B.
I have code as part of LUCENE-5318 that counts the terms that co-occur within a window around each position where your query terms appear.  This makes a really useful query term recommender, and the math is dirt simple.

INPUT
Doc1: quick brown fox jumps over the lazy dog
Doc2: quick green fox leaps over the lazy dog

Query: fox, window size before = 2, window size after = 3

OUTPUT:
Quick: 2 (and 2 * idf(quick))
Over: 2
Brown: 1
Green: 1
Jumps: 1
Leaps: 1
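
For readers who want to see the counting step spelled out, here is a minimal plain-Java sketch of the windowed count illustrated above. It is not the LUCENE-5318 code and does not use Lucene at all: the class name WindowCooccurrence, the count method, whitespace tokenization, and lowercasing are all hypothetical choices made for illustration. It also assumes that "the" is missing from the output above because it is treated as a stop word, and it counts every occurrence inside a window rather than applying any idf weighting.

```java
import java.util.Collections;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; not part of Lucene or lucene-addons.
final class WindowCooccurrence {

    // Counts terms that appear within `before` tokens to the left and
    // `after` tokens to the right of each occurrence of `target`.
    // Tokenization is a naive whitespace split (an assumption).
    static Map<String, Integer> count(List<String> docs, String target,
                                      int before, int after, Set<String> stopWords) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted keys for stable printing
        for (String doc : docs) {
            String[] tokens = doc.toLowerCase(Locale.ROOT).trim().split("\\s+");
            for (int i = 0; i < tokens.length; i++) {
                if (!tokens[i].equals(target)) {
                    continue;
                }
                // Clamp the window to the document boundaries.
                int start = Math.max(0, i - before);
                int end = Math.min(tokens.length - 1, i + after);
                for (int j = start; j <= end; j++) {
                    // Skip the target itself and any stop words.
                    if (j == i || stopWords.contains(tokens[j])) {
                        continue;
                    }
                    counts.merge(tokens[j], 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "quick brown fox jumps over the lazy dog",
                "quick green fox leaps over the lazy dog");
        // "the" is assumed to be a stop word, matching the output above.
        Map<String, Integer> counts =
                count(docs, "fox", 2, 3, Collections.singleton("the"));
        System.out.println(counts);
        // prints {brown=1, green=1, jumps=1, leaps=1, over=2, quick=2}
    }
}
```

Multiplying each raw count by an idf value for the term, as in the "2 * idf(quick)" note above, would be a separate step on top of this map.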

The query can be anything that can be transformed into a SpanQuery.

If you want examples or help, just drop a line.

See:
https://github.com/tballison/lucene-addons/tree/master/lucene-5317

Also available on Maven:
https://mvnrepository.com/artifact/org.tallison.lucene/lucene-addons/6.4-0.1 


P.S. Thanks to David Smiley for pointing out this request to me.

-----Original Message-----
From: komal [mailto:[hidden email]]
Sent: Monday, March 20, 2017 2:32 AM
To: [hidden email]
Subject: calculate term co-occurrence matrix

Hi all,
I need code for computing a term co-occurrence matrix. If anyone has some, please share it with me.



--
View this message in context: http://lucene.472066.n3.nabble.com/calculate-term-co-occurrence-matrix-tp4325899.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email] For additional commands, e-mail: [hidden email]

