Rollback

Rollback

gsingers
I know about commit. Is it possible to roll back? Or does this just involve deleting and/or re-adding the necessary documents before committing?

Thanks,
Grant

--
Grant Ingersoll
http://lucene.grantingersoll.com

Re: Rollback

Yonik Seeley
On 6/9/06, Grant Ingersoll <[hidden email]> wrote:
> I know about commit. Is it possible to roll back? Or does this just involve deleting and/or re-adding the necessary documents
> before committing?

It's not currently possible with DirectUpdateHandler2, as documents are
added directly into the main index.

A long time ago, my first idea at how to do the update handler was to
index to a separate index (FS or RAM) and then merge it with the main
index on a commit.  Then I looked at the code for Lucene's
IndexWriter.addIndexes(), saw that it did an optimize() at the start
and end of the merge, and discounted using it.

An IndexWriter.addIndexes() that doesn't do optimize() is certainly
doable in Lucene.  If we had that, we could perhaps have an alternate
UpdateHandler that drops the new inserts/deletes.  deleteByQuery still
presents problems though...

It could also be done in DirectUpdateHandler2 with more careful
tracking of what is added vs deleted, and if rollback is called, drop
the deletions, and delete the additions.

-Yonik
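
A minimal sketch of the last idea above (track what is added versus deleted and, on rollback, drop the deletions and delete the additions). Everything here is illustrative: RollbackTracker and its methods are made-up names, and a HashMap stands in for the main index. It is not DirectUpdateHandler2 code, and it ignores deleteByQuery.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: an in-memory "index" with an undo log.
class RollbackTracker {
    // Stand-in for the main index: unique id -> document (documents assumed non-null).
    private final Map<String, String> index = new HashMap<>();
    // Undo log: id -> document as of the last commit (null means "was absent").
    private final Map<String, String> undo = new LinkedHashMap<>();

    // Record the pre-change state of an id the first time it is touched after a commit.
    private void remember(String id) {
        if (!undo.containsKey(id)) {
            undo.put(id, index.get(id));
        }
    }

    void add(String id, String doc) {
        remember(id);
        index.put(id, doc);
    }

    void delete(String id) {
        remember(id);
        index.remove(id);
    }

    // Changes become permanent; nothing is left to undo.
    void commit() {
        undo.clear();
    }

    // Delete the additions and drop the deletions by restoring each recorded pre-change state.
    void rollback() {
        for (Map.Entry<String, String> e : undo.entrySet()) {
            if (e.getValue() == null) {
                index.remove(e.getKey());
            } else {
                index.put(e.getKey(), e.getValue());
            }
        }
        undo.clear();
    }

    String get(String id) {
        return index.get(id);
    }

    int size() {
        return index.size();
    }
}
```

If document 1 was committed earlier, then add("2", "b") followed by delete("1") and rollback() removes 2 and restores 1. Only the first pre-change state per id is remembered, so repeated updates to the same document cost nothing extra to undo.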

Re: Rollback

jason rutherglen-2
I am implementing the BDBUpdateHandler such that it would save updates into BDB, and then periodically add them into the Lucene index.  This would make rollback possible.  

----- Original Message ----
From: Yonik Seeley <[hidden email]>
To: [hidden email]
Sent: Friday, June 9, 2006 5:55:46 PM
Subject: Re: Rollback

On 6/9/06, Grant Ingersoll <[hidden email]> wrote:
> I know about commit. Is it possible to roll back? Or does this just involve deleting and/or re-adding the necessary documents
> before committing?

It's not currently possible with DirectUpdateHandler2, as documents are
added directly into the main index.

A long time ago, my first idea at how to do the update handler was to
index to a separate index (FS or RAM) and then merge it with the main
index on a commit.  Then I looked at the code for Lucene's
IndexWriter.addIndexes(), saw that it did an optimize() at the start
and end of the merge, and discounted using it.

An IndexWriter.addIndexes() that doesn't do optimize() is certainly
doable in Lucene.  If we had that, we could perhaps have an alternate
UpdateHandler that drops the new inserts/deletes.  deleteByQuery still
presents problems though...

It could also be done in DirectUpdateHandler2 with more careful
tracking of what is added vs deleted, and if rollback is called, drop
the deletions, and delete the additions.

-Yonik

Re: Rollback

Yonik Seeley
On 6/10/06, jason rutherglen <[hidden email]> wrote:
> I am implementing the BDBUpdateHandler such that it would save updates into BDB, and then periodically add them into the Lucene index.  This would make rollback possible.

Interesting!  How do they get added to the Lucene index?
What do you think the advantages and disadvantages of BDBUpdateHandler
will be (in what situations will it be desirable)?

-Yonik

Re: Rollback

jason rutherglen-2
Mainly for using Solr as a transactional search engine. This will allow direct, transactional updates to documents. I think it's best to keep the Lucene index in the FSDirectory and keep a copy of the documents in BDB. The Lucene index need not have all of the fields stored. The commit call syncs the BDB changes into the Lucene index. If deleteByQuery is called and some of the documents it wants to delete are not actually in the BDB, then it should fail. An update call would use optimistic concurrency. It may be slightly slower; however, I am assuming this would be used in (I don't want to use the word cluster) a setup partitioned over several collections of master-slave servers, making any performance degradation negligible.

I am having trouble deciding on the best way to serialize the documents on the server side. It would be nice to have a unified set of classes that handle the client and server XML document parsing. I have assembled a fast one for the client side, and it would be great to use that on the server and then serialize the resulting NamedList data structure into the BDB.

It's going to take some time.
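
The flow described above might look roughly like the following. This is a sketch under assumptions, not the BDBUpdateHandler itself: StagedUpdateHandler is a hypothetical name, plain HashMaps stand in for BDB and for the Lucene index, and the per-document version counter is just one simple way to do optimistic concurrency. deleteByQuery is left out.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: changes are staged, and only commit() makes them searchable.
class StagedUpdateHandler {
    // A document plus the version used for the optimistic concurrency check.
    private static final class Versioned {
        final String doc;    // null marks a pending delete
        final long version;

        Versioned(String doc, long version) {
            this.doc = doc;
            this.version = version;
        }
    }

    private final Map<String, Versioned> store = new HashMap<>();    // stand-in for committed BDB data
    private final Map<String, Versioned> pending = new HashMap<>();  // uncommitted changes
    private final Map<String, String> searchIndex = new HashMap<>(); // stand-in for the Lucene index

    // Latest state as seen by the writer, or null if the document is absent or deleted.
    private Versioned current(String id) {
        Versioned v = pending.containsKey(id) ? pending.get(id) : store.get(id);
        return (v == null || v.doc == null) ? null : v;
    }

    // Version a caller must present to update the document; 0 means "does not exist".
    long versionOf(String id) {
        Versioned v = current(id);
        return v == null ? 0 : v.version;
    }

    // Stage an update (doc assumed non-null); fail if the caller's version is stale.
    void update(String id, String doc, long expectedVersion) {
        long actual = versionOf(id);
        if (actual != expectedVersion) {
            throw new ConcurrentModificationException(
                "expected version " + expectedVersion + " but found " + actual);
        }
        pending.put(id, new Versioned(doc, actual + 1));
    }

    // Stage a delete.
    void delete(String id) {
        pending.put(id, new Versioned(null, 0));
    }

    // Sync staged changes into both the document store and the search index.
    void commit() {
        for (Map.Entry<String, Versioned> e : pending.entrySet()) {
            if (e.getValue().doc == null) {
                store.remove(e.getKey());
                searchIndex.remove(e.getKey());
            } else {
                store.put(e.getKey(), e.getValue());
                searchIndex.put(e.getKey(), e.getValue().doc);
            }
        }
        pending.clear();
    }

    // Discard staged changes; the search index was never touched.
    void rollback() {
        pending.clear();
    }

    String search(String id) {
        return searchIndex.get(id);
    }
}
```

Because nothing reaches the search index until commit(), rollback is just discarding the staged changes. A caller reads versionOf(id) first and passes it back to update(), so a writer holding a stale version gets an exception instead of silently overwriting.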


----- Original Message ----
From: Yonik Seeley <[hidden email]>
To: [hidden email]
Sent: Saturday, June 10, 2006 1:22:23 PM
Subject: Re: Rollback

On 6/10/06, jason rutherglen <[hidden email]> wrote:
> I am implementing the BDBUpdateHandler such that it would save updates into BDB, and then periodically add them into the Lucene index.  This would make rollback possible.

Interesting!  How do they get added to the Lucene index?
What do you think the advantages and disadvantages of BDBUpdateHandler
will be (in what situations will it be desirable)?

-Yonik

Re: Rollback

Otis Gospodnetic-2
In reply to this post by jason rutherglen-2
Would using Lucene contrib's DbDirectory not provide the needed rollback functionality? (see: http://www.sleepycat.com/jedocs/GettingStartedGuide/applicationoverview.html#transactionIntro )

Otis

----- Original Message ----
From: jason rutherglen <[hidden email]>
To: [hidden email]
Sent: Saturday, June 10, 2006 4:06:29 PM
Subject: Re: Rollback

I am implementing the BDBUpdateHandler such that it would save updates into BDB, and then periodically add them into the Lucene index.  This would make rollback possible.  


Re: Rollback

jason rutherglen-2
In reply to this post by gsingers
I thought about using this; however, from what I've read, DbDirectory is much slower than FSDirectory. Also, it seems like the internals to map the DocID to the actual segment would take some work.

----- Original Message ----
From: Otis Gospodnetic <[hidden email]>
To: [hidden email]
Sent: Sunday, June 11, 2006 8:24:14 PM
Subject: Re: Rollback

Would using Lucene contrib's DbDirectory not provide the needed rollback functionality? (see: http://www.sleepycat.com/jedocs/GettingStartedGuide/applicationoverview.html#transactionIntro )

Otis
