understand the queryNorm and the fieldNorm.


understand the queryNorm and the fieldNorm.

jason-51
Hi,

I am having trouble understanding queryNorm and fieldNorm.

Below is an example. I tried to follow what the Javadoc says:
"Computes the normalization value for a query given the sum of the squared
weights of each of the query terms". But my result is different.

ID:0 C:/PDF2Text/SearchEngine/File/SIG/sigkdd/p374-zhang.pdf|initial rank: 0
0.31900567 = sum of:
  0.03968133 = weight(contents:associ in 920), product of:
    0.60161763 = queryWeight(contents:associ), product of:
      1.326625 = idf(docFreq=830)
      0.45349488 = queryNorm
    0.065957725 = fieldWeight(contents:associ in 920), product of:
      4.2426405 = tf(termFreq(contents:associ)=18)
      1.326625 = idf(docFreq=830)
      0.01171875 = fieldNorm(field=contents, doc=920)
  0.27932435 = weight(contents:rule in 920), product of:
    0.7987842 = queryWeight(contents:rule), product of:
      1.7613963 = idf(docFreq=537)
      0.45349488 = queryNorm
    0.34968686 = fieldWeight(contents:rule in 920), product of:
      16.941074 = tf(termFreq(contents:rule)=287)
      1.7613963 = idf(docFreq=537)
      0.01171875 = fieldNorm(field=contents, doc=920)

regards
jiang xing

Re: understand the queryNorm and the fieldNorm.

Yonik Seeley
Hi Jason,
I get the same thing for the queryNorm when I calculate it by hand:
1/((1.7613963**2 + 1.326625**2)**.5)  = 0.45349488111693986

-Yonik
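
For anyone following along, the hand calculation above can be written out as a short Python sketch. It assumes the query has exactly the two terms shown in the explain output and that each term's weight is just its idf (no boosts); the variable names are only illustrative.

```python
import math

# idf values copied from the explain output in the original post
idf_by_term = {"associ": 1.326625, "rule": 1.7613963}

# Sum of the squared weights of the query terms
# (assumption: no boosts, so each term's weight is just its idf).
sum_of_squared_weights = sum(idf ** 2 for idf in idf_by_term.values())

# The normalization value is the reciprocal of the square root of that sum.
query_norm = 1.0 / math.sqrt(sum_of_squared_weights)

print(query_norm)  # roughly 0.4535, in line with queryNorm in the explain output
```

The small difference in the last digits compared with the explain output is presumably down to single-precision floats on the Lucene side.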

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Search time for an index with a very large .fdt file

ntnbrn80
Hi,
  I have an index with 2.5 million documents.
Each document consists of:
- 15 indexed fields
- 1 field that is stored but not indexed, whose value is a single 500-byte string.
A search returns about 3000 documents on average. The 3000 document IDs come back very quickly, but retrieving the 3000 documents themselves takes at least 20 seconds!
  I tried a 5-byte string instead of the 500-byte one, and the 3000 documents came back in only 1 second!
  The problem seems to be related to the size of the .fdt file.
  How can I reduce the retrieval time in the first case?




             
   
  Ing. Antonio Bruno
  Software Analyst
  http://xoomer.virgilio.it/lnb
  cell: (+39) 3402347684
  T&S S.r.l - Technologies And Solutions
  email T&S: [hidden email]
  email Yahoo: [hidden email]







               

Re: Search time for an index with a very large .fdt file

Yonik Seeley
20 seconds does seem like a long time to retrieve the stored fields of
the 3000 documents.  However, you should also step back and determine
if you really need to do that, or if there is another way to narrow
the number of documents that need to be read from disk.

-Yonik
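
One way to narrow what gets read from disk, sketched in Python purely as an illustration: load the stored fields only for the page of hits actually being shown, not for all 3000. Everything here is hypothetical, including `load_page`, `fetch_stored`, and the in-memory `fake_store` that stands in for the index; it is not Lucene API.

```python
def load_page(hit_ids, fetch_stored, page, page_size=50):
    """Fetch stored fields only for the hits on the requested page."""
    start = page * page_size
    return [fetch_stored(doc_id) for doc_id in hit_ids[start:start + page_size]]

# Hypothetical stand-in for the index: 3000 documents with a 500-byte stored value.
fake_store = {doc_id: {"payload": "x" * 500} for doc_id in range(3000)}
reads = []  # records which documents were actually loaded

def fetch_stored(doc_id):
    # In a real system this would be the expensive stored-field read.
    reads.append(doc_id)
    return fake_store[doc_id]

hit_ids = list(range(3000))  # the cheap part: the matching document ids
first_page = load_page(hit_ids, fetch_stored, page=0)
print(len(first_page), "documents loaded out of", len(hit_ids), "hits")
```

Whether this helps depends on whether the application really needs all 3000 stored values at once.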




Re: understand the queryNorm and the fieldNorm.

jason-51
In reply to this post by Yonik Seeley
Hi, thanks.

I think I forgot the ^0.5.

cheers
Jason
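
As a cross-check, the whole explain tree from the original post can be recomputed with a small Python sketch. It takes the idf, queryNorm, and fieldNorm values as given from the output and assumes tf is the square root of termFreq, which agrees with both tf lines (sqrt(18) is about 4.2426 and sqrt(287) is about 16.9411). The names are illustrative only.

```python
import math

# Values copied from the explain output
QUERY_NORM = 0.45349488
FIELD_NORM = 0.01171875
terms = {"associ": (18, 1.326625), "rule": (287, 1.7613963)}  # term: (termFreq, idf)

def term_weight(term_freq, idf):
    query_weight = idf * QUERY_NORM                          # queryWeight = idf * queryNorm
    field_weight = math.sqrt(term_freq) * idf * FIELD_NORM   # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight

# The document score is the sum of the per-term weights.
score = sum(term_weight(freq, idf) for freq, idf in terms.values())

# Effect of leaving out the ^0.5 when computing the norm
sum_sq = sum(idf ** 2 for _, idf in terms.values())
norm_without_sqrt = 1.0 / sum_sq
norm_with_sqrt = 1.0 / math.sqrt(sum_sq)

print(score, norm_without_sqrt, norm_with_sqrt)
```

The score comes out close to the 0.31900567 at the top of the explain output. Without the square root the norm is about 0.2057 instead of about 0.4535, which would explain the mismatch.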

