Ahh I see.

Term vectors are actually an inverted index for a single document, and they
expose the same postings API as the whole index (including
TermsEnum.totalTermFreq). But that method likely always returns -1 for term
vectors, because it's not implemented. Maybe Lucene's default codec should
be improved to store this; maybe open an issue?

In the meantime you could make your own codec that does store it.
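
Another option that needs no codec change, assuming you can read each term's
within-document frequency while walking the document's term vector, is to sum
those frequencies yourself and divide. Here is a rough sketch of just the
arithmetic. A plain Map stands in for the (term, frequency) pairs, and
NormalizedTf and its method names are made up for illustration; they are not
Lucene API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: the input map stands in for the
// (term -> frequency in this document) pairs read from one term vector.
final class NormalizedTf {

    // Total number of term occurrences in the document, duplicates included.
    static long totalTermCount(Map<String, Long> termFreqs) {
        long total = 0;
        for (long freq : termFreqs.values()) {
            total += freq;
        }
        return total;
    }

    // Normalized term frequency: each term's count divided by the total count.
    static Map<String, Double> normalize(Map<String, Long> termFreqs) {
        long total = totalTermCount(termFreqs);
        Map<String, Double> normalized = new LinkedHashMap<>();
        if (total == 0) {
            return normalized; // empty document: nothing to normalize
        }
        for (Map.Entry<String, Long> e : termFreqs.entrySet()) {
            normalized.put(e.getKey(), e.getValue() / (double) total);
        }
        return normalized;
    }

    public static void main(String[] args) {
        Map<String, Long> freqs = new LinkedHashMap<>();
        freqs.put("lucene", 3L);
        freqs.put("index", 1L);
        System.out.println("total terms: " + totalTermCount(freqs));
        System.out.println("normalized:  " + normalize(freqs));
    }
}
```

The size of the map is the number of unique terms (what you get from the
length of the vector), while totalTermCount is the number that counts
duplicate occurrences as well.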

> Hi Mike,

> Thanks for the answer. I think this returns the total number of
> occurrences of a specified term across all the documents in the corpus,
> right?

>

> But I need the total number of terms (including multiple occurrences of
> the same term) in each document of the corpus. Any suggestion?

>

> Thanks!

>> I think you want to use the TermsEnum.totalTermFreq method?

>>> Is there any way to get the total count of terms in the Term Frequency
>>> Vector (tvf)? I need to calculate the normalized term frequency of each
>>> term in my tvf. I know how to obtain the length of the tvf, but it
>>> doesn't work since I need to count duplicate occurrences as well.

>>>

