Sorry, I thought that you wanted to maintain the true value rather
than the approximated value. I am not entirely sure, but I think the
approximation arises due to rounding and low-precision storage of
these values in the index. You might be able to reverse engineer it by
looking at "Norms," which involve the document length. TBH there has
been a fair amount of change there in recent releases, and I'm not
completely up to speed on what we store, so I'll decline to provide
more misinformation at this point!
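
That said, here is a rough, unverified sketch of the kind of lossy
length encoding I mean, in case it helps with the reverse engineering.
It assumes the norm is a single byte holding roughly four significant
bits of the length, modeled on what SmallFloat.intToByte4 and
SmallFloat.byte4ToInt appear to do in the 8.x line. The class name
NormLengthSketch and the encode/decode helpers are made up for
illustration and are not Lucene API, so please check the result against
the source of the version you actually run.

```java
// Hypothetical stand-alone sketch; NOT Lucene code. It only imitates, as an
// assumption, how a document length could be squeezed into one norm byte.
final class NormLengthSketch {

    // Encode a non-negative value as a tiny float: 3 mantissa bits plus a shift.
    static int longToInt4(long i) {
        if (i < 0) {
            throw new IllegalArgumentException("negative value: " + i);
        }
        int numBits = 64 - Long.numberOfLeadingZeros(i);
        if (numBits < 4) {
            return (int) i;                  // values below 8 are stored exactly
        }
        int shift = numBits - 4;
        int encoded = (int) (i >>> shift);   // keep the 4 most significant bits
        encoded &= 0x07;                     // drop the implicit leading 1 bit
        encoded |= (shift + 1) << 3;         // store the shift; 0 marks exact values
        return encoded;
    }

    // Inverse of longToInt4; all bits that were shifted out come back as zero.
    static long int4ToLong(int i) {
        long bits = i & 0x07;
        int shift = (i >>> 3) - 1;
        return shift == -1 ? bits : (bits | 0x08) << shift;
    }

    // Byte codes not needed to cover the int range are used for exact small lengths.
    static final int MAX_INT4 = longToInt4(Integer.MAX_VALUE);
    static final int NUM_FREE_VALUES = 255 - MAX_INT4;

    // Exact document length -> one-byte norm (assumed index-time behaviour).
    static byte encode(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("negative length: " + length);
        }
        if (length < NUM_FREE_VALUES) {
            return (byte) length;
        }
        return (byte) (NUM_FREE_VALUES + longToInt4(length - NUM_FREE_VALUES));
    }

    // One-byte norm -> approximated length (assumed scoring-time behaviour).
    static int decode(byte norm) {
        int i = Byte.toUnsignedInt(norm);
        if (i < NUM_FREE_VALUES) {
            return i;
        }
        return (int) (NUM_FREE_VALUES + int4ToLong(i - NUM_FREE_VALUES));
    }

    public static void main(String[] args) {
        int exact = 756;
        System.out.println(exact + " -> " + decode(encode(exact))); // prints "756 -> 728"
    }
}
```

With these assumptions, a length of 756 decodes to 728, which matches
the numbers reported below for the "Alan Smithee" article. If that
holds for your index too, running your exact lengths through the same
round trip should give you the approximated values to use in your own
calculations.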

On Tue, Jun 2, 2020 at 1:20 PM <[hidden email]> wrote:

>
> Thank you for your answer, but could you please explain this idea in
> detail? I cannot see how it would help solve my problem.
>
> For example, I have indexed the Wikipedia article "Alan Smithee", which
> has a document length of 756. That exact value is used when calculating
> the average document length. But when the BM25 score for this article
> is calculated, Lucene uses an approximated document length of 728,
> which gives a different result than scoring with the correct document
> length. So I wonder where this value is calculated, and how I might
> change this approximation or at least get the approximated value so
> that I can use it in my own calculations.
>

> On 2020-06-02 18:48, Michael Sokolov wrote:

> > You could append an EOF token to every indexed text, and then iterate
> > over Terms to get the positions of those tokens?

> >

> > On Tue, Jun 2, 2020 at 11:50 AM Moritz Staudinger <[hidden email]> wrote:

> >>
> >> Hello,
> >>
> >> I am not sure if I am in the right place here, but I have a question
> >> about the approximation my Lucene implementation does.
> >>
> >> I am trying to calculate the same scores that Lucene's BM25Similarity
> >> calculates, but I found out that Lucene only approximates the length
> >> of documents for scoring, while it uses the correct values for the
> >> average document length. Is there a way to turn off these
> >> approximations, or to get the approximated values so that I can save
> >> them for my own calculations?
> >>
> >> For my implementation I use Lucene 8.4.1 in combination with Spring
> >> Boot, if this is relevant.
> >>
> >> Thank you in advance,
> >> Moritz
> >>

> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [hidden email]
> >> For additional commands, e-mail: [hidden email]
> >>
