how to find terms on a page?

how to find terms on a page?

ristretto
Hello,

I haven't heard of or found a way to find the number of times a term
is found on a page.
Lucene uses it in scoring, I believe (Solr scoring: http://tinyurl.com/4tb55r).

Basically, for a given page, I would like
a list of the terms on the page and the number of times each term appears.

thanks
gene

Re: how to find terms on a page?

hossman

: I haven't heard of or found a way to find the number of times a term
: is found on a page.
: Lucene uses it in scoring, I believe, (solr scoring:  http://tinyurl.com/4tb55r)

Assuming that by "page" you mean "document", the "term frequency" (tf) is
factored into the score, but at a low enough level that it's not carried
along with the score during a normal search.
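
To make "term frequency" concrete, here is a minimal sketch that counts how often each term occurs in one document's text. It runs outside Solr entirely; the function name term_frequencies and the regex tokenizer are assumptions made for illustration, and a real Lucene analyzer (stemming, stop words, and so on) may split the text into different terms.

```python
import re
from collections import Counter

def term_frequencies(text):
    """Return a Counter mapping each lowercased term to its count in text."""
    # Hypothetical tokenizer: lowercase, then take runs of letters and digits.
    # This is a stand-in, not what a Lucene analyzer does.
    terms = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(terms)

doc = "The quick brown fox jumps over the lazy dog. The dog sleeps."
tf = term_frequencies(doc)

# Print every term with its count, most frequent first.
for term, count in tf.most_common():
    print(term, count)
```

This only helps if you still have the original text of the document at hand; it does not read anything back out of the index.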

: Basically, for a given page, I would like
: a list of terms on the page and number of times the terms appear on the page?

Work is currently being done, however, to make it possible for people to
fetch some of the raw tf/idf info directly:

https://issues.apache.org/jira/browse/SOLR-651
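
For a rough idea of what the raw tf and idf factors look like, the sketch below assumes the formulas used by Lucene's classic TF-IDF similarity: tf is the square root of the in-document frequency, and idf is 1 + ln(numDocs / (docFreq + 1)). The function names are made up for illustration, and a custom Similarity may compute these differently.

```python
import math

def classic_tf(freq):
    """tf factor, assuming the classic formula sqrt(freq)."""
    return math.sqrt(freq)

def classic_idf(doc_freq, num_docs):
    """idf factor, assuming the classic formula 1 + ln(num_docs / (doc_freq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))

# Hypothetical numbers: a term occurring 4 times in a document,
# and found in 9 of the 100 documents in the index.
tf_factor = classic_tf(4)
idf_factor = classic_idf(9, 100)
print(tf_factor, idf_factor)
```

These are only two of the factors in the final score; norms, boosts and query normalization are left out here.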


-Hoss


Re: how to find terms on a page?

Gene Campbell-4
That's excellent.  Thanks for the reply.

gene


On Tue, Sep 23, 2008 at 6:39 AM, Chris Hostetter
<[hidden email]> wrote:

>
> : I haven't heard of or found a way to find the number of times a term
> : is found on a page.
> : Lucene uses it in scoring, I believe, (solr scoring:  http://tinyurl.com/4tb55r)
>
> Assuming by "page" you mean "document" then the "term frequency" (tf) is
> factored into the score, but at a low enough level that it's not carried
> along with the score during a normal search.
>
> : Basically, for a given page, I would like
> : a list of terms on the page and number of times the terms appear on the page?
>
> work is currently being done however to make it possible for people to
> fetch some of the raw tf/idf info directly...
>
> https://issues.apache.org/jira/browse/SOLR-651
>
>
> -Hoss
>
>