readVInt, what is it for?


readVInt, what is it for?

blazingwolf7
Hi,

I am fairly new to Lucene and am currently going through its source code. I am trying to determine how Lucene calculates the frequency of a term in each document in which it occurs.

I came across a method named readVInt() in the IndexInput class. It seems that each call to this method yields the document number and the frequency of the term in that document.

I am wondering how it works, and I have failed to find any information on it on the Internet. Could anyone explain it to me? Thanks

RE: readVInt, what is it for?

Uwe Schindler
A VInt is how integers are stored in the index files: in a compressed,
variable-length form.

Read here: http://lucene.apache.org/java/2_3_2/fileformats.html#VInt

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [hidden email]
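
For readers who want to see the encoding in code, here is a minimal sketch of the VInt scheme described on the linked file-formats page: each byte carries seven bits of the value, least-significant group first, and the high bit of a byte says whether another byte follows. The class name VIntSketch and the use of ByteBuffer and ByteArrayOutputStream are assumptions made to keep the example self-contained; Lucene itself reads through IndexInput.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Illustrative stand-in for the VInt read/write logic; not Lucene source code.
class VIntSketch {

    /** Writes value as 1 to 5 bytes, least-significant 7-bit group first. */
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write(value); // final byte has the high bit clear
    }

    /** Reads one VInt starting at the buffer's current position. */
    static int readVInt(ByteBuffer in) {
        byte b = in.get();
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.get();
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, 300);
        byte[] bytes = out.toByteArray();
        // prints "2 bytes, decoded: 300"
        System.out.println(bytes.length + " bytes, decoded: " + readVInt(ByteBuffer.wrap(bytes)));
    }
}
```

Small values such as typical document gaps and frequencies fit in a single byte, which is why this encoding suits postings data.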




---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


RE: readVInt, what is it for?

blazingwolf7
Thanks, I am clear on that now. But does anyone know where the frequency of the term in each document is calculated? I mean, which class and which method is it in?
Thanks


Re: readVInt, what is it for?

Yonik Seeley-2
The frequency is tracked at index time.  It's simply a read at query
time.  See TermDocs.
If you really want to understand more about the code internals of
Lucene, I'd suggest stepping through more example queries with a
debugger.

-Yonik
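
To make the index-time versus query-time distinction concrete, here is a toy in-memory inverted index. It is not Lucene code: the class ToyInvertedIndex, its methods, and the simple tokenizer are all hypothetical. It only illustrates that the counting happens once, when a document is added, and that a later lookup merely reads the stored number.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy inverted index; all names here are hypothetical and for illustration only.
class ToyInvertedIndex {
    // term -> (document number -> frequency of the term in that document)
    private final Map<String, TreeMap<Integer, Integer>> postings = new HashMap<>();
    private int nextDoc = 0;

    /** Index time: tokenize the text and count every term occurrence. */
    int addDocument(String text) {
        int doc = nextDoc++;
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            postings.computeIfAbsent(token, t -> new TreeMap<>())
                    .merge(doc, 1, Integer::sum);
        }
        return doc;
    }

    /** Query time: no counting, just a read of the stored frequency. */
    int freq(String term, int doc) {
        TreeMap<Integer, Integer> docs = postings.get(term);
        return docs == null ? 0 : docs.getOrDefault(doc, 0);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument("the quick fox");         // document 0
        index.addDocument("The fox and the hound"); // document 1
        System.out.println(index.freq("the", 1));   // prints 2
    }
}
```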


Re: readVInt, what is it for?

blazingwolf7
Hmmm, I don't think I get it. How is it tracked at index time? I indexed my files earlier. Later I open the index and perform a search. Shouldn't the frequency of each term in each matching document be calculated during the search?


Re: readVInt, what is it for?

Yonik Seeley-2
Lucene creates an inverted index and uses it to search.
Frequency is encoded in the .frq files:
http://lucene.apache.org/java/docs/fileformats.html

-Yonik
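
As a rough sketch of how the .frq entries for one term can be decoded, based on the TermFreqs description in the file-formats page for the 2.x index format: each posting starts with a VInt whose upper bits hold the gap from the previous document number and whose lowest bit, when set, means the frequency is exactly 1; otherwise the frequency follows as a second VInt. The class FrqSketch and the ByteBuffer input are assumptions of this example, the number of postings is taken as a parameter (in a real index it comes from the term dictionary), and skip data is ignored.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative decoder for one term's postings; not Lucene source code.
class FrqSketch {

    /** Reads one VInt: 7 data bits per byte, high bit means "more bytes follow". */
    static int readVInt(ByteBuffer in) {
        byte b = in.get();
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.get();
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    /** Decodes docFreq postings; each returned int[] is {docNumber, freq}. */
    static List<int[]> readTermFreqs(ByteBuffer in, int docFreq) {
        List<int[]> result = new ArrayList<>();
        int doc = 0;
        for (int i = 0; i < docFreq; i++) {
            int docCode = readVInt(in);
            doc += docCode >>> 1;          // upper bits: gap from the previous document
            int freq = (docCode & 1) != 0
                    ? 1                    // low bit set: term occurs exactly once
                    : readVInt(in);        // otherwise the frequency is the next VInt
            result.add(new int[] {doc, freq});
        }
        return result;
    }

    public static void main(String[] args) {
        // The sequence 15, 8, 3 is the worked example from the file-formats page.
        ByteBuffer in = ByteBuffer.wrap(new byte[] {15, 8, 3});
        for (int[] posting : readTermFreqs(in, 2)) {
            System.out.println("doc " + posting[0] + " freq " + posting[1]);
        }
        // prints "doc 7 freq 1" then "doc 11 freq 3"
    }
}
```

This is why readVInt() appears to "generate" both the document number and the frequency: both are simply stored as VInts and read back in sequence.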


Re: readVInt, what is it for?

Grant Ingersoll-2
In reply to this post by blazingwolf7
I'd suggest starting with a couple of places:
http://lucene.apache.org/java/2_3_2/fileformats.html

and

http://lucene.apache.org/java/2_3_2/scoring.html

and then do as Yonik said and step through the internals, starting  
with a simple TermQuery which leads to the TermScorer.

-Grant
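
A simplified sketch of how the stored frequency might feed into a term's score, assuming the tf and idf formulas documented for DefaultSimilarity in the scoring page (tf is the square root of the frequency, idf is log(numDocs / (docFreq + 1)) + 1). The class ScoreSketch is hypothetical, and field norms, boosts, and query normalization are deliberately left out, so the numbers will not match what TermScorer actually returns.

```java
// Simplified tf-idf arithmetic; hypothetical names, not the TermScorer implementation.
class ScoreSketch {

    /** Term-frequency factor: grows with the square root of the stored frequency. */
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    /** Inverse document frequency: rarer terms get a larger factor. */
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    /** Assumed simplification: tf * idf^2, ignoring norms, boosts, and queryNorm. */
    static double score(int freq, int docFreq, int numDocs) {
        double idf = idf(docFreq, numDocs);
        return tf(freq) * idf * idf;
    }

    public static void main(String[] args) {
        // A term occurring 4 times in a document, present in 9 of 10 documents.
        System.out.println(score(4, 9, 10)); // prints 2.0
    }
}
```

The point for this thread is that freq is an input read from the index, not something recomputed from the document text while searching.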


--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ









RE: readVInt, what is it for?

Mukherjee, Prasenjit
Slide 16 in the following presentation might be of some help. Let me know if
it helps.

http://docs.google.com/Presentation?docid=dmsxgtg_98dbh529dn

-Prasen


RE: readVInt, what is it for?

blazingwolf7
Thanks for all the help. I understand how it works now. Next I need to know how to modify the .frq file. Can anyone help me with this?
