I have 80,000 question-and-answer pairs indexed in Solr, and a feature file. I'm trying to extract the feature values for each Q&A pair so that I can use them to train a learning-to-rank algorithm (such as LambdaMART from the RankLib library).
The training algorithm expects its input in this format:
<label> qid:<qid> <feature>:<value> ... <feature>:<value> # <info>
For example:
The current feature extraction implementation in Solr is oriented towards the
Learning To Rank re-ranking capability; it is not built for extracting
features to then train your model.
I am afraid you will need to implement your own system that runs multiple
queries against Solr with feature extraction enabled and then parses the
results to build your training set.
Do you have query-level or query-dependent features?
In case you are lucky enough to just have document level features, you may
end up in a slightly simplified scenario.
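
As a rough illustration of the "query Solr, then parse the results" approach, here is a minimal Python sketch. It rests on several assumptions that are not from this thread: that the `[features]` document transformer returns a comma-separated string of name=value pairs under the "[features]" key, that you already have a relevance label for each (query, document) pair, and that the Solr URL and feature store name are whatever your setup uses. The helper names (parse_features, to_ranklib_line, fetch_features) are made up for this example, and query-dependent features would typically also need efi.* parameters, which are left out here.

```python
import json
import urllib.parse
import urllib.request


def parse_features(raw):
    """Parse a feature string such as 'f1=0.5,f2=1.0' into {name: value}.

    Assumes a comma-separated list of name=value pairs.
    """
    features = {}
    for item in raw.split(","):
        if not item:
            continue
        name, _, value = item.rpartition("=")
        features[name] = float(value)
    return features


def to_ranklib_line(label, qid, features, feature_order, info=""):
    """Format one row as '<label> qid:<qid> 1:<v1> 2:<v2> ... # <info>'.

    feature_order fixes which feature name maps to which integer id,
    so every row uses the same numbering; missing features become 0.0.
    """
    cols = " ".join(
        f"{i}:{features.get(name, 0.0)}"
        for i, name in enumerate(feature_order, start=1)
    )
    line = f"{label} qid:{qid} {cols}"
    return f"{line} # {info}" if info else line


def fetch_features(solr_url, query, store, rows=10):
    """Run one query with feature extraction on; returns {doc_id: {name: value}}.

    solr_url is a hypothetical collection URL, e.g.
    'http://localhost:8983/solr/mycollection'. Not called in this sketch.
    """
    params = urllib.parse.urlencode({
        "q": query,
        "rows": rows,
        "wt": "json",
        "fl": f"id,[features store={store}]",
    })
    with urllib.request.urlopen(f"{solr_url}/select?{params}") as resp:
        docs = json.load(resp)["response"]["docs"]
    return {doc["id"]: parse_features(doc["[features]"]) for doc in docs}


# Offline usage example with a hand-written feature string.
feats = parse_features("titleMatch=0.5,answerLength=1.0")
print(to_ranklib_line(2, 7, feats, ["titleMatch", "answerLength"], "doc42"))
```

You would call fetch_features once per training query, look up the label for each returned document, and write one to_ranklib_line row per (query, document) pair into the training file.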