Re: Deadlock with DirectUpdateHandler2

Mike Klaas

On 18-Nov-08, at 8:54 AM, Mark Miller wrote:

> Mark Miller wrote:
>> Toby Cole wrote:
>>> Has anyone else experienced a deadlock when the  
>>> DirectUpdateHandler2 does an autocommit?
>>> I'm using a recent snapshot from hudson
>>> (apache-solr-2008-11-12_08-06-21), and quite often when I'm loading
>>> data the server (tomcat 6) gets stuck at line 469 of
>>> DirectUpdateHandler2:
>>>
>>>      // Check if there is a commit already scheduled for longer then this time
>>>      if( pending != null &&
>>>          pending.getDelay(TimeUnit.MILLISECONDS) >= commitMaxTime )
>>>
>>> Anyone got any enlightening tips?
>>>

>> There is some inconsistent synchronization I think. Especially  
>> involving pending. Yuck <g>
> I would say there are problems with pending, autoCommitCount, and
> lastAddedTime. That alone could probably cause a deadlock (who
> knows), but it also seems somewhat possible that there is an issue
> with the heavy intermingling of locks (there are a bunch of locks to
> be had in that class). I haven't looked for evidence of that though -
> it probably makes sense to fix those 3 guys and see if you get
> reports from there.


autoCommitCount is written only inside a CommitTracker.synchronized
block.  It is read without synchronization to print stats, which
perhaps could be fixed, though I can't see how it could cause a problem.

lastAddedTime is only written in a call path within a
DirectUpdateHandler2.synchronized block.  It is only read in a
CommitTracker.synchronized block.  Since the two blocks use different
monitors, the reader could see a stale value, but I also don't see
this causing a problem (a commit might fail to be scheduled).  This
could probably also be improved, but doesn't seem important.
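
For illustration, here is one way those two fields could be tightened
up.  This is only a sketch under assumptions: the class and method
names are hypothetical stand-ins, not the actual Solr code, and it
assumes nothing else depends on the existing locks for these two
fields.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the two CommitTracker fields discussed above;
// this is not the actual Solr class.
final class TrackerFieldsSketch {
    // Incremented on the commit path, read by the stats code without a lock.
    private final AtomicInteger autoCommitCount = new AtomicInteger();

    // Written under one monitor and read under another: volatile makes the
    // latest write visible to the reader without requiring a shared lock.
    private volatile long lastAddedTime = -1;

    void recordAutoCommit() {
        autoCommitCount.incrementAndGet();
    }

    int getAutoCommitCount() {
        return autoCommitCount.get();
    }

    void recordAdd(long nowMillis) {
        lastAddedTime = nowMillis;
    }

    long getLastAddedTime() {
        return lastAddedTime;
    }
}
```

Neither change touches the locking around pending, so it would not by
itself address the hang.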

pending seems to be the issue.  As long as commits are only triggered
by autocommit, there is no issue, as manipulation of pending is always
performed inside CommitTracker.synchronized.  But
didCommit()/didRollback() could be called via a manual commit, and
pending is directly manipulated during DUH2.close().  I'm having
trouble coming up with a plausible deadlock scenario, but this needs
to be fixed.  It isn't as easy as synchronizing
didCommit()/didRollback(), though--this would introduce definite
deadlock scenarios.
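
To make that last point concrete, here is a minimal sketch of the
kind of lock-order inversion that simply synchronizing those methods
could create.  It rests on an assumption that is not confirmed above:
that a manual commit would hold a handler lock and then want the
tracker lock, while the autocommit thread holds the tracker lock and
then wants a handler lock.  The lock names and the LockOrderSketch
class are hypothetical.  Timed tryLock calls are used so the demo
reports the inversion instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo of a lock-order inversion; not the actual Solr code.
final class LockOrderSketch {

    // Take `first`, wait until the other thread holds its own first lock,
    // then try to take `second` with a timeout. Returns true if `second`
    // was acquired.
    static boolean tryNested(ReentrantLock first, ReentrantLock second,
                             CountDownLatch bothHoldFirst,
                             CountDownLatch bothTried) throws InterruptedException {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            boolean got = second.tryLock(100, TimeUnit.MILLISECONDS);
            if (got) {
                second.unlock();
            }
            // Keep holding `first` until the other thread has also tried,
            // as a plain synchronized block would while it waits.
            bothTried.countDown();
            bothTried.await();
            return got;
        } finally {
            first.unlock();
        }
    }

    // Returns true if neither thread could get its second lock, i.e. the
    // same code with untimed locking would have deadlocked.
    static boolean wouldDeadlock() {
        ReentrantLock handlerLock = new ReentrantLock();
        ReentrantLock trackerLock = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicBoolean manualGot = new AtomicBoolean();
        AtomicBoolean autoGot = new AtomicBoolean();

        // Manual commit path (assumed order): handler lock, then tracker lock.
        Thread manual = new Thread(() -> {
            try {
                manualGot.set(tryNested(handlerLock, trackerLock, bothHoldFirst, bothTried));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Autocommit path (assumed order): tracker lock, then handler lock.
        Thread auto = new Thread(() -> {
            try {
                autoGot.set(tryNested(trackerLock, handlerLock, bothHoldFirst, bothTried));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        manual.start();
        auto.start();
        try {
            manual.join();
            auto.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !manualGot.get() && !autoGot.get();
    }

    public static void main(String[] args) {
        System.out.println("would deadlock: " + wouldDeadlock());
    }
}
```

If the two paths really do take the locks in opposite orders, the
usual remedy is to pick one order and use it everywhere, or to avoid
calling into the other object while holding a lock.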

Mark, is there any chance you could post the thread dump for the  
deadlocked process?  Do you issue manual commits during insertion?

-Mike