N-gram


N-gram

Rajesh Munavalli
At what point should I add n-grams to the index? Does the order in which
I add them affect exact phrase queries later? My questions are:
 
(1) Should I add all the 1-grams, followed by the 2-grams, followed by
the 3-grams, etc., sentence by sentence, OR
(2) Should I add all the 1-grams of the entire document first, before
starting on the 2-grams for the entire document?
 
What is the generally accepted way of adding the n-grams of a document?
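 
To make the two orderings in the question concrete, here is a minimal sketch in plain Java. It does not use Lucene; the class name NGramOrder and its methods are hypothetical, and it assumes that n-grams never cross a sentence boundary. Both options produce the same set of n-grams and differ only in the order in which they are emitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: contrasts the two insertion orders.
final class NGramOrder {
    // Contiguous n-grams of one token sequence, joined with single spaces.
    static List<String> ngrams(List<String> tokens, int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            out.add(String.join(" ", tokens.subList(i, i + n)));
        }
        return out;
    }

    // Option (1): per sentence, emit the 1-grams, then the 2-grams, up to maxN.
    static List<String> sentenceBySentence(List<List<String>> sentences, int maxN) {
        List<String> out = new ArrayList<>();
        for (List<String> sentence : sentences) {
            for (int n = 1; n <= maxN; n++) {
                out.addAll(ngrams(sentence, n));
            }
        }
        return out;
    }

    // Option (2): all 1-grams of the whole document, then all 2-grams, up to maxN.
    static List<String> orderByOrder(List<List<String>> sentences, int maxN) {
        List<String> out = new ArrayList<>();
        for (int n = 1; n <= maxN; n++) {
            for (List<String> sentence : sentences) {
                out.addAll(ngrams(sentence, n));
            }
        }
        return out;
    }
}
```

For the two sentences "a b" and "c d" with maxN = 2, option (1) emits a, b, "a b", c, d, "c d", while option (2) emits a, b, c, d, "a b", "c d". Whether that difference matters for phrase queries depends on how term positions are assigned when the n-grams are indexed.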
 
thanks,
 
Rajesh

Re: N-gram

MS
Rajesh,
I am not sure what your eventual goal is, but it looks like you are using
Lucene in some sort of natural language processing environment. I am doing
something similar with dotLucene. SpanQuery may be what you want: it lets
you specify the span, and hence 1-gram, 2-gram, etc. Email me if you want
samples (C#).
Madhu
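
The span idea above can be illustrated without Lucene itself. Lucene's span queries (for example a SpanNearQuery built from SpanTermQuery clauses) match on term positions. The sketch below is a plain-Java approximation of that idea, not the Lucene API: the class SpanSketch, its methods, and its definition of slop (the number of unmatched positions inside the span) are assumptions made for illustration, and Lucene's own slop accounting may differ in detail.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of position-based span matching; not Lucene code.
final class SpanSketch {
    // Build a positional index: term -> ascending list of token positions.
    static Map<String, List<Integer>> index(List<String> tokens) {
        Map<String, List<Integer>> idx = new HashMap<>();
        for (int pos = 0; pos < tokens.size(); pos++) {
            idx.computeIfAbsent(tokens.get(pos), k -> new ArrayList<>()).add(pos);
        }
        return idx;
    }

    // True if the terms occur in the given order with at most `slop`
    // unmatched positions between the first and last matched term.
    static boolean spanNear(Map<String, List<Integer>> idx, List<String> terms, int slop) {
        if (terms.isEmpty()) {
            return false;
        }
        for (int start : idx.getOrDefault(terms.get(0), Collections.emptyList())) {
            if (matchFrom(idx, terms, 1, start, start, slop)) {
                return true;
            }
        }
        return false;
    }

    // Try to place terms[i..] at increasing positions after `prev`.
    private static boolean matchFrom(Map<String, List<Integer>> idx, List<String> terms,
                                     int i, int first, int prev, int slop) {
        if (i == terms.size()) {
            int gaps = (prev - first + 1) - terms.size();
            return gaps <= slop;
        }
        for (int p : idx.getOrDefault(terms.get(i), Collections.emptyList())) {
            if (p > prev && matchFrom(idx, terms, i + 1, first, p, slop)) {
                return true;
            }
        }
        return false;
    }
}
```

With the tokens "the quick brown fox jumps" indexed once as single terms, spanNear for ("quick", "brown") with slop 0 behaves like an exact 2-gram match, ("quick", "brown", "fox") with slop 0 like a 3-gram match, and ("quick", "fox") matches only when slop is at least 1. Under this approach the n-gram length is chosen at query time rather than by indexing every n-gram separately.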


On 7/18/05, Rajesh Munavalli <[hidden email]> wrote:

>
> At what point do I add n-grams? Does the order in which I add n-grams
> affect exact phrase queries later? My questions are
>
> (1) Should I add all the 1-grams followed by 2-grams followed by
> 3-grams..etc sentence by sentence OR
> (2) Add all the 1 grams of entire document first before starting 2-grams
> for the entire document?
>
> What is the general accepted notion of adding n-grams of a document?
>
> thanks,
>
> Rajesh
>
>