I am not sure what your eventual goal is, but it looks like you are using
Lucene in some sort of natural language processing environment. I am doing
something similar with dotLucene. SpanQuery may be what you want: it lets
you specify the span, hence 1-gram, 2-gram, etc. Email me if you want
samples (C#).
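
To make the span idea concrete, here is a rough toy sketch in plain Python.
It is not Lucene or dotLucene code, and the names build_positional_index and
find_spans are made up for illustration. It assumes only that term positions
are stored at index time, so a 2-gram or 3-gram can be matched at query time
from those positions instead of being added to the index separately.

```python
from collections import defaultdict


def build_positional_index(tokens):
    """Map each term to the ascending list of positions where it occurs."""
    index = defaultdict(list)
    for pos, term in enumerate(tokens):
        index[term].append(pos)
    return index


def find_spans(index, terms, slop=0):
    """Return (start, end) position pairs where `terms` occur in order.

    `slop` is the total number of unmatched positions allowed inside the
    span; slop=0 means the terms must be adjacent (an exact n-gram).
    This is a hypothetical helper, not a Lucene API.
    """
    matches = []

    def extend(i, start, prev):
        # All terms matched: record the span that was found.
        if i == len(terms):
            matches.append((start, prev))
            return
        for pos in index.get(terms[i], []):
            # Unmatched positions so far = span width minus terms matched.
            if pos > prev and (pos - start + 1) - (i + 1) <= slop:
                extend(i + 1, start, pos)

    for first in index.get(terms[0], []):
        extend(1, first, first)
    return matches


tokens = "the quick brown fox jumps over the lazy dog".split()
index = build_positional_index(tokens)

print(find_spans(index, ["quick", "brown"]))        # adjacent 2-gram
print(find_spans(index, ["quick", "fox"], slop=1))  # one position skipped
```

With slop=0 this behaves like an exact phrase match; a larger slop loosens
the span. The order in which the terms were added does not matter here,
only the positions recorded for them.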
> At what point do I add n-grams? Does the order in which I add n-grams
> affect exact phrase queries later? My questions are
> (1) Should I add all the 1-grams followed by 2-grams followed by
> 3-grams..etc sentence by sentence OR
> (2) Add all the 1 grams of entire document first before starting 2-grams
> for the entire document?
> What is the generally accepted notion of adding n-grams of a document?