Stemming terms in SpanQuery


Stemming terms in SpanQuery

Michael Chan-5
Hi,

I'm trying to build a SpanQuery using word stems. Is parsing each term
with a QueryParser constructed with an Analyzer that produces a stemmed
TokenStream the right approach? It just seems to me that QueryParser is
designed to parse queries, and so my hunch is that there might be a
better way.

Any help will be much appreciated.

Michael



OutOfMemoryError while enumerating through reader.terms(fieldName)

Ramana Jelda
Hi,
I am getting an OutOfMemoryError while enumerating through a TermEnum after
invoking reader.terms(fieldName).

Just to give you more information: I have almost 10,000 unique terms in
field A. I can successfully enumerate around 5,000 terms, but after that I
get the OutOfMemoryError.

I set the JVM max memory to 512 MB; of course my index, at around 1-2 GB, is
bigger than that.
How can I ask Lucene to release the data it has loaded and traverse the
remaining terms? It seems that while enumerating, memory always grows in
steps of a few MB.

Any help would be really appreciated.

Thanks in advance,
Jelda



RE: OutOfMemoryError while enumerating through reader.terms(fieldName)

Ramana Jelda
 
Hi,
I just debugged it more closely. Sorry, I am getting the OutOfMemoryError not
because of reader.terms(), but because of invoking the QueryFilter.bits()
method for each unique term.
I will try to explain with pseudocode.

TermEnum te = ciaoReader.getIndexReader().terms(new Term(fieldName, ""));
Term term = te.term();

// collect all unique term texts for the field
while (term != null && term.field().equals(fieldName)) {
    keys.addElement(term.text());
    if (!te.next()) {
        break;
    }
    term = te.term();
}
te.close();

// build and cache one BitSet per term -- this loop is where the
// OutOfMemoryError occurs
for (Iterator iter = keys.iterator(); iter.hasNext();) {
    String termText = (String) iter.next();
    TermQuery termQuery = new TermQuery(new Term(fieldName, termText));
    QueryFilter filter = new QueryFilter(termQuery);
    BitSet bits = filter.bits(ciaoReader.getIndexReader());
    cache.put(termText, bits);
}

It is the second for loop, which gets a BitSet via QueryFilter, that throws
the OutOfMemoryError.

Any advice is really welcome.

Thx,
Jelda



> -----Original Message-----
> From: Ramana Jelda [mailto:[hidden email]]
> Sent: Tuesday, May 02, 2006 12:55 PM
> To: [hidden email]
> Subject: OutOfMemoryError while enumerating through
> reader.terms(fieldName)
>
> Hi,
> I am getting OutOfMemoryError , while enumerating through  
> TermEnum  after invoking reader.terms(fieldName).
>
> Just to provide you more information, I have almost 10000
> unique terms in field A. I can successfully enumerate around
> 5000terms but later I am gettting OutOfMemoryError.
>
> I set jvm max memory as 512MB , Ofcourse my index is bigger
> than this memory around 1GB-2GB..
> How can I ask lucene to cleanup loaded index and traverse
> through remaining terms.. It seems while enumerating memory
> always grows in steps of some MBs.
>
> Any help would be really appreciable.
>
> Thanks in advance,
> Jelda
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>



creating indexReader object

trupti mulajkar
In reply to this post by Ramana Jelda
I am trying to create an IndexReader object that reads my index. I need this
to further generate the document and term frequency vectors.
However, when I try to print the contents of the documents (doc.get("contents")),
it shows null.
Any suggestions? If I can't read the contents, then I cannot create the other
vectors.

Any help will be appreciated.

cheers,
trupti mulajkar
MSc Advanced Computer Science




Re: creating indexReader object

Hannes Carl Meyer
Hi,

IndexReader has some static methods, e.g.

IndexReader reader = IndexReader.open(new File("/index"));
http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexReader.html#open(java.lang.String)

Hannes

trupti mulajkar schrieb:

> i am trying to create an object of index reader class that reads my index. i
> need this to further generate the document and term frequency vectors.
> however when i try to print the contents of the documents (doc.get("contents"))
> it shows -null .
> any suggestions,
> if i cant read the contents then i cannot create the other vectors.
>
> any help will be apprecisted
>
> cheers,
> trupti mulajkar
> MSc Advanced Computer Science
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>  



Re: creating indexReader object

trupti mulajkar
Thanks Hannes,

I don't think I made my query clear enough.
I have created the IndexReader object just the way you mentioned it, but after
that, when I try to create the vectors like term frequency and document
frequency using

doc(i).get("Contents");

I get only null.

Any ideas?

cheers,
trupti mulajkar
MSc Advanced Computer Science


Quoting Hannes Carl Meyer <[hidden email]>:

> Hi,
>
> IndexReader has some static methods, e.g.
>
> IndexReader reader = IndexReader.open(new File("/index"));
>
http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexReader.html#open(java.lang.String)

>
> Hannes
>
> trupti mulajkar schrieb:
> > i am trying to create an object of index reader class that reads my index.
> i
> > need this to further generate the document and term frequency vectors.
> > however when i try to print the contents of the documents
> (doc.get("contents"))
> > it shows -null .
> > any suggestions,
> > if i cant read the contents then i cannot create the other vectors.
> >
> > any help will be apprecisted
> >
> > cheers,
> > trupti mulajkar
> > MSc Advanced Computer Science
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [hidden email]
> > For additional commands, e-mail: [hidden email]
> >
> >  
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>


RE: creating indexReader object

Satuluri, Venu_Madhav
In reply to this post by trupti mulajkar
Try using Luke to see how the document actually looks in the index.
http://www.getopt.org/luke/

-Venu

-----Original Message-----
From: trupti mulajkar [mailto:[hidden email]]
Sent: Tuesday, May 02, 2006 7:41 PM
To: [hidden email]
Subject: Re: creating indexReader object


thanx hannes,

but i dont think i made my query clear enough.
i have created the index reader object just the way you mentioned it,
but after
that when i try to do create the vectors like term frequency and
document
frequency using

doc(i).get("Contents");

i get an only NULL

any ideas ?

cheers,
trupti mulajkar
MSc Advanced Computer Science


Quoting Hannes Carl Meyer <[hidden email]>:

> Hi,
>
> IndexReader has some static methods, e.g.
>
> IndexReader reader = IndexReader.open(new File("/index"));
>
http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexReader.html#open(java.lang.String)
>
> Hannes
>
> trupti mulajkar schrieb:
> > i am trying to create an object of index reader class that reads my
index.
> i
> > need this to further generate the document and term frequency
vectors.

> > however when i try to print the contents of the documents
> (doc.get("contents"))
> > it shows -null .
> > any suggestions,
> > if i cant read the contents then i cannot create the other vectors.
> >
> > any help will be apprecisted
> >
> > cheers,
> > trupti mulajkar
> > MSc Advanced Computer Science
> >
> >
> >
> >
---------------------------------------------------------------------

> > To unsubscribe, e-mail: [hidden email]
> > For additional commands, e-mail: [hidden email]
> >
> >  
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>


Re: creating indexReader object

Karl Wettin-3
In reply to this post by trupti mulajkar

On 2 May 2006, at 16:11, trupti mulajkar wrote:
>
> doc(i).get("Contents");
>
> i get an only NULL
>
> any ideas ?

Did you index the field with a term vector when you added it to the
document?



RE: OutOfMemoryError while enumerating through reader.terms(fieldName)

mark harwood
In reply to this post by Ramana Jelda
>>Any advise is relly welcome.

Don't cache all that data.
You need a minimum of (numUniqueTerms*numDocs)/8 bytes
to hold that info.
Assuming 10,000 unique terms and 1 million docs you'd
need over 1 Gig of RAM.

I suppose the question is what are you trying to
achieve and why can't you use the existing Lucene APIs
instead of caching all those bitsets?

Cheers
Mark
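
(For reference, the arithmetic behind that estimate: 10,000 terms x 1,000,000
docs = 10,000,000,000 bits, and 10,000,000,000 / 8 = 1,250,000,000 bytes, or
roughly 1.2 GB of BitSets before any per-object overhead.)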


               

RE: creating indexReader object

Frank Kunemann
In reply to this post by trupti mulajkar

Lucene's fields are case sensitive and I think "contents" is written in
lower case by default.

Cheers,
Frank


-----Original Message-----
From: trupti mulajkar [mailto:[hidden email]]
Sent: Tuesday, May 02, 2006 4:11 PM
To: [hidden email]
Subject: Re: creating indexReader object

thanx hannes,

but i dont think i made my query clear enough.
i have created the index reader object just the way you mentioned it, but
after that when i try to do create the vectors like term frequency and
document frequency using

doc(i).get("Contents");

i get an only NULL

any ideas ?

cheers,
trupti mulajkar
MSc Advanced Computer Science


Quoting Hannes Carl Meyer <[hidden email]>:

> Hi,
>
> IndexReader has some static methods, e.g.
>
> IndexReader reader = IndexReader.open(new File("/index"));
>
http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexReader.html#open(java.lang.String)
>
> Hannes
>
> trupti mulajkar schrieb:
> > i am trying to create an object of index reader class that reads my
index.

> i
> > need this to further generate the document and term frequency vectors.
> > however when i try to print the contents of the documents
> (doc.get("contents"))
> > it shows -null .
> > any suggestions,
> > if i cant read the contents then i cannot create the other vectors.
> >
> > any help will be apprecisted
> >
> > cheers,
> > trupti mulajkar
> > MSc Advanced Computer Science
> >
> >
> >
> > --------------------------------------------------------------------
> > - To unsubscribe, e-mail: [hidden email]
> > For additional commands, e-mail: [hidden email]
> >
> >  
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>


RE: OutOfMemoryError while enumerating through reader.terms(fieldName)

Ramana Jelda
In reply to this post by mark harwood
I am trying to implement category counts, almost similar to the CNET approach.
At initialization time I create all these BitSets, and then AND them with the
user query (with a BitSet obtained from a QueryFilter containing the user
query).

That way my application is performant, don't you think? I actually need all
those BitSets every time a user queries, so I can't use the existing Lucene
filter approach, can I?

Thanks in advance,
Jelda




> -----Original Message-----
> From: mark harwood [mailto:[hidden email]]
> Sent: Tuesday, May 02, 2006 4:19 PM
> To: [hidden email]
> Subject: RE: OutOfMemoryError while enumerating through
> reader.terms(fieldName)
>
> >>Any advise is relly welcome.
>
> Don't cache all that data.
> You need a minimum of (numUniqueTerms*numDocs)/8 bytes to
> hold that info.
> Assuming 10,000 unique terms and 1 million docs you'd need
> over 1 Gig of RAM.
>
> I suppose the question is what are you trying to achieve and
> why can't you use the existing Lucene APIs instead of caching
> all those bitsets?
>
> Cheers
> Mark
>
>
>
> ___________________________________________________________
> Switch an email account to Yahoo! Mail, you could win FIFA
> World Cup tickets. http://uk.mail.yahoo.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>



RE: OutOfMemoryError while enumerating through reader.terms(fieldName)

Ramana Jelda
I just got an idea for category counting instead of following this BitSet
approach.

I will maintain an array mapping docIds to category_ids as values,

i.e. documents[docId] = category_id

which, for 1 million docs at about 4 bytes per docId and 4 bytes per
category_id, comes to roughly 8 MB.

Then, using the docIds a HitCollector sees for the user query, I will
calculate each category count. I think it is self-explanatory.

What do you think?
Any advice is really welcome.

Note: I actually have around 20,000 unique category ids.

Thanks,
Jelda
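
For illustration, here is a rough sketch of that documents[docId] = category_id
idea using the HitCollector API of that Lucene version. buildDocToCategoryMap is
a hypothetical helper that reads a stored category_id field once at startup, the
field names and index path are placeholders, and the sketch assumes category ids
fall in the range 0..19999:

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.HitCollector;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

IndexReader reader = IndexReader.open("/index");
IndexSearcher searcher = new IndexSearcher(reader);

// documents[docId] = category_id, filled once at initialization time
final int[] documents = buildDocToCategoryMap(reader);   // hypothetical helper
final int[] counts = new int[20000];                      // one counter per category id

Query userQuery = new TermQuery(new Term("contents", "lucene"));  // stands in for the real user query
searcher.search(userQuery, new HitCollector() {
    public void collect(int doc, float score) {
        counts[documents[doc]]++;    // one array lookup per hit, no per-term BitSets
    }
});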

> -----Original Message-----
> From: Ramana Jelda [mailto:[hidden email]]
> Sent: Tuesday, May 02, 2006 4:41 PM
> To: [hidden email]
> Subject: RE: OutOfMemoryError while enumerating through
> reader.terms(fieldName)
>
> I am trying to implement category count almost similar to
> CNET approach.
> At the initialization time , I am trying to create all these
> BitSets and then trying to and them with user query(with a
> bitset obtained from queryfilter containing user query)..
>
> This way my application is performant..Don't u think so?
> Actually I need all those bitsets everytime user queries. I
> can not use exisiting Lucene filter approach.. Is n't it??
>
> Thx in advance,
> Jelda
>
>
>
>
> > -----Original Message-----
> > From: mark harwood [mailto:[hidden email]]
> > Sent: Tuesday, May 02, 2006 4:19 PM
> > To: [hidden email]
> > Subject: RE: OutOfMemoryError while enumerating through
> > reader.terms(fieldName)
> >
> > >>Any advise is relly welcome.
> >
> > Don't cache all that data.
> > You need a minimum of (numUniqueTerms*numDocs)/8 bytes to hold that
> > info.
> > Assuming 10,000 unique terms and 1 million docs you'd need
> over 1 Gig
> > of RAM.
> >
> > I suppose the question is what are you trying to achieve
> and why can't
> > you use the existing Lucene APIs instead of caching all
> those bitsets?
> >
> > Cheers
> > Mark
> >
> >
> >
> > ___________________________________________________________
> > Switch an email account to Yahoo! Mail, you could win FIFA
> World Cup
> > tickets. http://uk.mail.yahoo.com
> >
> >
> ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [hidden email]
> > For additional commands, e-mail: [hidden email]
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>



RE: OutOfMemoryError while enumerating through reader.terms(fieldName)

mark harwood
"Category counts" should really be a FAQ entry.

There is no one right solution to prescribe because it
depends on the shape of your data.


For previous discussions/code samples see here:
http://www.mail-archive.com/java-user@.../msg05123.html

and here for more space-efficient representations for
sparse sets
http://www.mail-archive.com/java-user@.../msg02929.html


Cheers
Mark


               

RE: OutOfMemoryError while enumerating through reader.terms(fieldName)

Ramana Jelda
Thanks for your quick reply.
I will go through it.

Regards,
Jelda

> -----Original Message-----
> From: mark harwood [mailto:[hidden email]]
> Sent: Tuesday, May 02, 2006 5:03 PM
> To: [hidden email]
> Subject: RE: OutOfMemoryError while enumerating through
> reader.terms(fieldName)
>
> "Category counts" should really be a FAQ entry.
>
> There is no one right solution to prescribe because it
> depends on the shape of your data.
>
>
> For previous discussions/code samples see here:
> http://www.mail-archive.com/java-user@.../msg05123.html
>
> and here for more space-efficient representations for sparse
> sets
> http://www.mail-archive.com/java-user@.../msg02929.html
>
>
> Cheers
> Mark
>
>
>
> ___________________________________________________________
> Switch an email account to Yahoo! Mail, you could win FIFA
> World Cup tickets. http://uk.mail.yahoo.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>



Re: creating indexReader object

trupti mulajkar
In reply to this post by Karl Wettin-3
I have indexed files using IndexFiles.
How can I add the field to the document using this?

cheers,
trupti mulajkar
MSc Advanced Computer Science
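
As a minimal sketch of what that could look like when adding the document
yourself, assuming the Lucene 1.9-style Field constructor: the field has to be
stored for doc.get("contents") to return text, and indexed with a term vector
for the frequency vectors. The index path and fileText are placeholders:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

IndexWriter writer = new IndexWriter("/index", new StandardAnalyzer(), true);

Document doc = new Document();
doc.add(new Field("contents", fileText,      // fileText: the file's text, read elsewhere
        Field.Store.YES,                     // stored, so doc.get("contents") is not null
        Field.Index.TOKENIZED,               // analyzed for searching
        Field.TermVector.YES));              // enables IndexReader.getTermFreqVector()
writer.addDocument(doc);

writer.optimize();
writer.close();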


Quoting karl wettin <[hidden email]>:

>
> On 2 May 2006, at 16:11, trupti mulajkar wrote:
> >
> > doc(i).get("contents");
> >
> > i get an only NULL
> >
> > any ideas ?
>
> Did you index the field with term vector when you added it to the  
> document?
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>


Re: Stemming terms in SpanQuery

Jason Calabrese
In reply to this post by Michael Chan-5
I think the best way to tokenize/stem is to use the analyzer directly. For
example:

TokenStream ts = analyzer.tokenStream(field, new StringReader(text));

Token token = null;
while ((token = ts.next()) != null) {
    Term newTerm = new Term(field, token.termText());
    // do something with newTerm
}

If you're just stemming a single word, there will be only a single token, so
you probably won't need the while loop.
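
Building on that snippet, a rough sketch of feeding the stemmed tokens into a
SpanQuery; analyzer and field are assumed to be the same stemming analyzer and
field name as above, and the words array is a placeholder for your query terms:

import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

// Wrap each stem in a SpanTermQuery, then combine the clauses with SpanNearQuery.
String[] words = {"running", "quickly"};   // placeholder input terms
List clauses = new ArrayList();
for (int i = 0; i < words.length; i++) {
    TokenStream ts = analyzer.tokenStream(field, new StringReader(words[i]));
    Token token = null;
    while ((token = ts.next()) != null) {
        clauses.add(new SpanTermQuery(new Term(field, token.termText())));
    }
    ts.close();
}
SpanQuery spanQuery = new SpanNearQuery(
        (SpanQuery[]) clauses.toArray(new SpanQuery[clauses.size()]),
        0,       // slop
        true);   // require the stems in order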

On Monday 01 May 2006 12:37, Michael Chan wrote:

> Hi,
>
> I'm trying to build a SpanQuery using word stems. Is parsing each term
> with a QueryParser, constructed with an Analyzer giving stemmed
> tokenStream, the right approach? It just seems to me that QueryParser is
> designed to parse queries, and so my hunch is that there might be a
> better way.
>
> Any help will be much appreciated.
>
> Michael
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]


Updating index if there is a database changes

Kiran Joisher
Hi all,

I am working on a project where I will use Lucene to build a search engine on
a database. I am new to Lucene. I wrote a test program which indexes a table
and searches it, but now I am stuck on how to update the index when a
database change occurs. I need some help on this topic: how do I update the
index at run time? Can it be done then and there, or do I have to write some
kind of scheduler program which rebuilds the entire index, say, once a day?
Which would be more efficient?

The data will be huge, around 4 million records.

Thanks in advance,
--Kiran





Re: Updating index if there is a database changes

chrislusf
My approach is to select documents ordered by updated_date desc,
and only process documents newer than the ones already in the index.

Chris Lu
------------------------------------
Full-Text Lucene Search for Any Databases/Applications
http://www.dbsight.net
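
For illustration, a sketch of one way the re-indexing pass could look with the
IndexReader/IndexWriter API of that era. The delete-then-add pattern, the
changedRows structure with its id and text members, and the field names are all
assumptions, not something spelled out above:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

// 1. Delete any stale copies of the changed rows, matching on an untokenized "id" field.
IndexReader reader = IndexReader.open("/index");
for (int i = 0; i < changedRows.length; i++) {
    reader.deleteDocuments(new Term("id", changedRows[i].id));
}
reader.close();

// 2. Re-add the changed rows (create = false keeps the existing index).
IndexWriter writer = new IndexWriter("/index", new StandardAnalyzer(), false);
for (int i = 0; i < changedRows.length; i++) {
    Document doc = new Document();
    doc.add(new Field("id", changedRows[i].id, Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.add(new Field("contents", changedRows[i].text, Field.Store.YES, Field.Index.TOKENIZED));
    writer.addDocument(doc);
}
writer.close();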


On 5/3/06, Kiran Joisher <[hidden email]> wrote:

> Hi all,
>
> I m working on a project where I will use lucene to make a search engine on
> a database. I am new to lucene. I wrote a test program which indexes a table
> and searches the same.. but now I m stuck on how to update the index in case
> a database change occurs.. I need some help on this topic...like how do I
> update the index at run time... can it be done then and there...or do I have
> to write some kind of schedular program which "re-builds" the entire index
> say once in a day ... which will be more efficient ?
>
> the data will be huge... 4 million records something..
>
> Thanks in advance,
> --Kiran
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>


Re: Updating index if there is a database changes

Stephen Gray-2
In reply to this post by Kiran Joisher
Hi Kiran,

Once you've updated an index using IndexWriter or IndexReader you just need
to close and re-open your IndexSearcher so that searching includes the
changes. There is a great library called LuceneIndexAccessor at the link
below that manages this for you. It creates an IndexReader/Writer/Searcher
and hands out references to classes that need to use them. When you want to
read/write you just get a reference to the appropriate object from the
index accessor, then release it when you've finished. If you make changes
using an IndexWriter then release the reference, it waits until all
IndexSearcher refs have been released, then recreates the searcher
automatically.

The link is:
http://www.nabble.com/Fwd%3A-Contribution%3A-LuceneIndexAccessor-t17416.html#a47049

You might also like to look at DelayCloseIndexSearcher at
http://issues.apache.org/jira/browse/LUCENE-445

Regards,
Steve

At 05:31 PM 3/05/2006, you wrote:

>Hi all,
>
>I m working on a project where I will use lucene to make a search engine on
>a database. I am new to lucene. I wrote a test program which indexes a table
>and searches the same.. but now I m stuck on how to update the index in case
>a database change occurs.. I need some help on this topic...like how do I
>update the index at run time... can it be done then and there...or do I have
>to write some kind of schedular program which "re-builds" the entire index
>say once in a day ... which will be more efficient ?
>
>the data will be huge... 4 million records something..
>
>Thanks in advance,
>--Kiran
>
>
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: [hidden email]
>For additional commands, e-mail: [hidden email]

Stephen Gray
Archive Research Officer
Australian Social Science Data Archive
18 Balmain Crescent (Building #66)
The Australian National University
Canberra ACT 0200

Phone +61 2 6125 2185
Fax +61 2 6125 0627
Web http://assda.anu.edu.au/
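
The reopen step Stephen describes boils down to something like the following
minimal sketch; the searcher variable and index path are placeholders, and the
LuceneIndexAccessor library he links handles the waiting-for-outstanding-searches
part that this leaves out:

import org.apache.lucene.search.IndexSearcher;

// After the IndexWriter has been closed (committing the changes), swap in a
// fresh searcher so the new documents become visible, then close the old one.
IndexSearcher oldSearcher = searcher;
searcher = new IndexSearcher("/index");
oldSearcher.close();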

RE: Updating index if there is a database changes

Kiran Joisher
Thanks Stephen,

This was really helpful.

Cheers,
--Kiran


 
-----Original Message-----
From: Stephen Gray [mailto:[hidden email]]
Sent: Thursday, May 04, 2006 4:11 AM
To: [hidden email]
Subject: Re: Updating index if there is a database changes

Hi Kirin,

Once you've updated an index using IndexWriter or IndexReader you just need
to close and re-open your IndexSearcher so that searching includes the
changes. There is a great library callled LuceIndexAccessor at the link
below that manages this for you. It creates an IndexReader/Writer/Searcher
and hands out references to classes that need to use them. When you want to
read/write you just get a reference to the appropriate object from the
index accessor, then release it when you've finished. If you make changes
using an IndexWriter then release the reference, it waits until all
IndexSearcher refs have been released, then recreates the searcher
automatically.

The link is:
http://www.nabble.com/Fwd%3A-Contribution%3A-LuceneIndexAccessor-t17416.html#a47049

You might also like to look at DelayCloseIndexSearcher at
http://issues.apache.org/jira/browse/LUCENE-445

Regards,
Steve

At 05:31 PM 3/05/2006, you wrote:
>Hi all,
>
>I m working on a project where I will use lucene to make a search engine on
>a database. I am new to lucene. I wrote a test program which indexes a
table
>and searches the same.. but now I m stuck on how to update the index in
case
>a database change occurs.. I need some help on this topic...like how do I
>update the index at run time... can it be done then and there...or do I
have

>to write some kind of schedular program which "re-builds" the entire index
>say once in a day ... which will be more efficient ?
>
>the data will be huge... 4 million records something..
>
>Thanks in advance,
>--Kiran
>
>
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: [hidden email]
>For additional commands, e-mail: [hidden email]

Stephen Gray
Archive Research Officer
Australian Social Science Data Archive
18 Balmain Crescent (Building #66)
The Australian National University
Canberra ACT 0200

Phone +61 2 6125 2185
Fax +61 2 6125 0627
Web http://assda.anu.edu.au/
