How to optimise TF and DF computation within Scorer


Tanapol Nearunchorn
Hi,

I'm building a custom scorer for the LTR module.
Right now, I want to optimise the performance of the TF and DF computation
in the scorer class.
The profiler shows that TermContext.build is a hot path; it appears to spend
time seeking on disk to look up term statistics.

Here is a sample gist of my LTR Feature implementation:
https://gist.github.com/tanapoln/733f7e783ca3f9dd28702ddc48936f13
I also attached a screenshot of the profiling results to the gist.

I'm not sure whether caching the computed TF and DF values to reduce disk
seeks is a good idea.
Is there any way to improve the performance of this computation?
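
To make the caching idea concrete, here is a rough sketch of what is meant, not code taken from the gist. TermStatsCache, Stats and the loader function are hypothetical names and are not part of Lucene or Solr. The loader is assumed to wrap whatever expensive lookup currently ends up in TermContext.build, and the cache is assumed to be thrown away whenever the index reader is reopened, since the statistics change with the index.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache of per-term statistics; not a Lucene or Solr class. */
final class TermStatsCache {

    /** Immutable holder for the two statistics of one term. */
    static final class Stats {
        final long docFreq;        // number of documents containing the term
        final long totalTermFreq;  // total occurrences of the term in the index

        Stats(long docFreq, long totalTermFreq) {
            this.docFreq = docFreq;
            this.totalTermFreq = totalTermFreq;
        }
    }

    private final Map<String, Stats> cache = new ConcurrentHashMap<>();

    // Assumption: this function performs the expensive lookup for one term.
    private final Function<String, Stats> loader;

    TermStatsCache(Function<String, Stats> loader) {
        this.loader = loader;
    }

    /** Returns cached statistics, invoking the loader only on the first request for a term. */
    Stats get(String term) {
        return cache.computeIfAbsent(term, loader);
    }

    /** Number of distinct terms currently cached. */
    int size() {
        return cache.size();
    }
}
```

One instance would be created per opened index reader and shared by all scorers built against that reader, so repeated queries on the same terms would hit the map instead of the disk. The map here is unbounded, so a real version would probably need a size limit.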

Best regards,
Tanapol