Ranking Question.

Ranking Question.

shai deljo
Hi,
Maybe a trivial/stupid question, but:
I have a fairly simple schema with a title, tags and description.
I have my own ranking/scoring system that takes into account the
similarity of each tag to a term in the query, but now that I also want
to include the title and description (the description is somewhere
between short and moderate in length) I am not sure how to handle this.
For example, would parsing the description and title before indexing
in Solr and adding them as tags make sense? It sounds like that
would replicate the stop word, stemming, etc. mechanisms already built
into Lucene.
My goal in the end is to change as little as possible in the retrieval
process, but still be able to rank based on the keywords extracted from
the entire document.
Any ideas / directions?
Thanks
Shai

Re: Ranking Question.

Chris Hostetter-3

you need to elaborate a little more on what you are currently doing, and
what you want to be doing... you mention "my own ranking/scoring system"
... is this something you've implemented in code already? Is this a custom
Similarity class or Query class, or something basic that you've done with a
custom request handler?

how do you want matches on the title/description to affect things? should
they contribute to the score (ie: influence ordering) or just affect
whether or not a document is included in the result set?
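
to make the two options concrete, here is a rough sketch of how the request
parameters might differ. It assumes the field names title, tags and
description from the schema described in the question, a query term that is
a single plain word needing no escaping, and the standard q / fq request
parameters; the helper names and boost values are made up for illustration
and are not part of Solr.

```python
from urllib.parse import urlencode


def scoring_params(term, boosts):
    """Option 1: matches in every listed field contribute to the score.

    `boosts` maps a field name to an illustrative boost; the values are
    placeholders, not recommendations.
    """
    clauses = [f"{field}:{term}^{boost}" for field, boost in boosts.items()]
    return {"q": " OR ".join(clauses)}


def filter_only_params(term):
    """Option 2: only tags are scored.

    The title/description clause goes into a filter query (fq), which
    restricts the result set without influencing the score.
    """
    return {
        "q": f"tags:{term}",
        "fq": f"title:{term} OR description:{term}",
    }


if __name__ == "__main__":
    print(urlencode(scoring_params("solr", {"tags": 2.0, "title": 1.5, "description": 1.0})))
    print(urlencode(filter_only_params("solr")))
```

in the second variant a document has to match both the tags query and the
filter, so it is only one possible reading of "affects inclusion"; which
variant fits depends on the answers to the questions in this message.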

when you say "change as little as possible in the retrieval process" are
you referring to some existing process you've implemented, or the default
logic of the StandardRequestHandler?


-Hoss