Ahh I see.

Term vectors are actually an inverted index for a single document, and they
also have the same postings API as the whole index (including
TermsEnum.totalTermFreq), but that method likely always returns -1 for term
vectors because it's not implemented? Maybe Lucene's default codec should
be improved to store this; maybe open an issue?

In the meantime you could make your own codec that does store it.
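
As a rough sketch of the per-document count itself: sum the within-document frequency of every term in the vector, then divide each frequency by that sum. The snippet below is plain Java with a Map standing in for the term vector, so the names NormalizedTf, totalTerms and normalize are made up for illustration and are not Lucene APIs. In Lucene you would presumably fill the map by iterating the TermsEnum of the document's term vector and reading PostingsEnum.freq() for each term, which I have not tested here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NormalizedTf {

    // Total number of term occurrences in one document, duplicates included.
    static long totalTerms(Map<String, Integer> termFreqs) {
        long total = 0;
        for (int freq : termFreqs.values()) {
            total += freq;
        }
        return total;
    }

    // Normalized term frequency: freq(term) / total occurrences in the document.
    static Map<String, Double> normalize(Map<String, Integer> termFreqs) {
        long total = totalTerms(termFreqs);
        Map<String, Double> result = new LinkedHashMap<>();
        if (total == 0) {
            return result; // empty vector: nothing to normalize
        }
        for (Map.Entry<String, Integer> e : termFreqs.entrySet()) {
            result.put(e.getKey(), e.getValue() / (double) total);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical term vector for one document: term -> within-doc frequency.
        Map<String, Integer> termVector = new LinkedHashMap<>();
        termVector.put("lucene", 3);
        termVector.put("index", 1);

        System.out.println(totalTerms(termVector)); // prints 4
        System.out.println(normalize(termVector));  // prints {lucene=0.75, index=0.25}
    }
}
```

Note that the length of the vector (2 in this example) is the number of distinct terms, while the sum of the frequencies (4) is the count that includes duplicate occurrences.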

Mike McCandless

http://blog.mikemccandless.com

On Tue, Apr 18, 2017 at 9:12 AM, Manjula Wijewickrema <[hidden email]> wrote:

> Hi Mike,
>
> Thanks for the answer. I think this returns the total number of
> occurrences of a specified term across all the documents in the corpus
> right?
>
> But I need the total number of terms (including multiple occurrences of
> the same term) in each document of the corpus. Any suggestion?
>
> Thanks!
>

> On Tue, Apr 18, 2017 at 2:53 PM, Michael McCandless <[hidden email]> wrote:
>

>> I think you want to use the TermsEnum.totalTermFreq method?
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>

>> On Sun, Apr 16, 2017 at 11:36 AM, Manjula Wijewickrema <[hidden email]> wrote:
>>

>>> Hi,
>>>
>>> Is there any way to get the total count of terms in the Term Frequency
>>> Vector (tvf)? I need to calculate the Normalized term frequency of each
>>> term in my tvf. I know how to obtain the length of the tvf, but it
>>> doesn't work since I need to count duplicate occurrences as well.
>>>
>>> Highly appreciate your kind response.
>>>
>>
>>
>