[jira] Created: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop


[jira] Created: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop

JIRA jira@apache.org
CommandRunner can hang after the main thread exec is finished and has inefficient busy loop
-------------------------------------------------------------------------------------------

         Key: NUTCH-151
         URL: http://issues.apache.org/jira/browse/NUTCH-151
     Project: Nutch
        Type: Bug
  Components: indexer  
    Versions: 0.8-dev    
 Environment: all
    Reporter: Paul Baclace


I encountered a case where the JVM of a Tasktracker child did not exit after the main thread returned; a thread dump showed only the threads named STDOUT and STDERR from CommandRunner as non-daemon threads, and both were doing a read().

CommandRunner usually works correctly when the subprocess is expected to finish before the timeout or when no timeout is used. By _usually_, I mean in the absence of external thread interrupts. The busy loop that waits for the process to finish has a sleep that is skipped over by an exception; this causes the waiting main thread to compete with the subprocess in a tight loop and effectively reduces the available CPU by 50%.



--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Commented: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop

JIRA jira@apache.org
    [ http://issues.apache.org/jira/browse/NUTCH-151?page=comments#action_12361242 ]

Paul Baclace commented on NUTCH-151:
------------------------------------

Analysis:

CommandRunner uses a CyclicBarrier to synchronize the thread that does the
exec (let's call it the main thread) with the io pipe threads.
Until the barrier timeout occurs, the barrier
makes the main thread wait for all the io pipes to finish, which
they do only when EOF occurs after the subprocess finishes.  Each pipe
sees the EOF and then uses the barrier synchronization.  After all the pipes
await the barrier, they and the main thread continue and everything is fine.
If and only if the barrier timeout occurs, the main thread uses
Thread.interrupt() to tell each io pipe that it is time to finish up, even
if EOF was not seen and bytes were still being pumped (this was the intention
in the original code).
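
The synchronization described above can be sketched roughly as follows. This is an illustration only, not the CommandRunner source: it uses java.util.concurrent.CyclicBarrier rather than the EDU.oswego class, and the names BarrierSketch and drainAll are hypothetical. The pipe threads are marked as daemons here purely so the sketch cannot keep a JVM alive.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BarrierSketch {
  private BarrierSketch() {}

  /**
   * Drains every stream on its own thread and waits for all of them at a barrier.
   * Returns true if every pipe reached EOF before the timeout, false otherwise.
   */
  static boolean drainAll(long timeoutMillis, InputStream... streams)
      throws InterruptedException {
    // One party per pipe thread, plus one for the calling ("main") thread.
    final CyclicBarrier barrier = new CyclicBarrier(streams.length + 1);
    final Thread[] pipes = new Thread[streams.length];
    for (int i = 0; i < streams.length; i++) {
      final InputStream in = streams[i];
      pipes[i] = new Thread(() -> {
        try {
          byte[] buf = new byte[4096];
          while (in.read(buf) != -1) {
            // Discard the bytes; a real pipe thread would copy them somewhere.
          }
          barrier.await(); // EOF seen: meet the other threads at the barrier
        } catch (IOException | InterruptedException | BrokenBarrierException e) {
          // Interrupted, or the barrier was broken by a timeout: just finish.
        }
      }, "pipe-" + i);
      pipes[i].setDaemon(true); // assumption: keeps this sketch from pinning the JVM
      pipes[i].start();
    }
    try {
      barrier.await(timeoutMillis, TimeUnit.MILLISECONDS);
      return true;
    } catch (TimeoutException | BrokenBarrierException e) {
      // Timeout path only: tell each pipe thread it is time to finish up.
      for (Thread t : pipes) {
        t.interrupt();
      }
      return false;
    }
  }
}
```

Note that the interrupt happens only on the timeout path, which matches the behaviour described in the analysis.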

After the barrier synchronization has finished without timing out, if
_waitForExit is true, the main thread
gets the exit value (return code) from the subprocess by calling
Process.exitValue() in a busy loop that throws an exception each time
through the loop, which makes the somewhat expensive busy loop
(test, sleep, repeat) much, much more expensive (test, throw, catch, repeat,
and no sleep!).  For a quick-running subprocess, no exception is thrown
because it is finished; the exit value is retrieved and the caller can
get it with getExitValue().  If a timeout occurred, no attempt is made to
obtain the exit value (presumably because the subprocess has not exited).
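
For comparison, here is a minimal sketch of a polling loop in which the exception cannot skip the sleep. It is not the CommandRunner code; the class ExitPoller, the method pollExitValue, and the -1 return for "still running at the deadline" are assumptions made for illustration. The only JDK behaviour relied on is that Process.exitValue() throws IllegalThreadStateException while the subprocess is still alive.

```java
final class ExitPoller {
  private ExitPoller() {}

  /**
   * Polls proc until it exits or maxWaitMillis elapses.
   * Returns the exit code, or -1 if the process was still running at the deadline.
   */
  static int pollExitValue(Process proc, long pollMillis, long maxWaitMillis)
      throws InterruptedException {
    final long deadline = System.nanoTime() + maxWaitMillis * 1_000_000L;
    do {
      // Sleep first: nothing thrown below can skip the pause.
      Thread.sleep(pollMillis);
      try {
        return proc.exitValue();
      } catch (IllegalThreadStateException stillRunning) {
        // exitValue() throws while the subprocess is alive; loop and sleep again.
      }
    } while (System.nanoTime() - deadline < 0);
    return -1;
  }
}
```

Each pass still pays for one thrown exception, but the loop now runs at most once per pollMillis instead of spinning.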




[jira] Updated: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/NUTCH-151?page=all ]

Paul Baclace updated NUTCH-151:
-------------------------------

    Attachment: CommandRunner.java

Minimal required changes to fix bug NUTCH-151:
1. The pipe io threads should be daemons.
2. The main thread should always interrupt() the pipe io threads when finishing up, not just when a timeout occurs.
3. Sleep before testing whether the process has finished with Process.exitValue().
4. Increase the sleep time to 1000 msec.

Obvious cleanup hitchhiking along:
5. Remove the unused _kaput.
6. Add comments indicating the changes needed in order to use JDK 1.5 instead of the EDU.oswego.cs.dl.util.concurrent package.
7. Change void evaluate() into a convenience method that uses int exec(), which returns the exit code (or -1 if timed out).
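
Changes 1 and 2 might look roughly like the sketch below. It is not taken from the attached file; PumpSketch, startPump and finish are hypothetical names. Whether interrupt() actually unblocks a pending read() depends on the kind of stream, so this should be read as an outline of the idea rather than a guaranteed fix.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class PumpSketch {
  private PumpSketch() {}

  /** Starts a daemon thread that copies in to out until EOF or interruption. */
  static Thread startPump(String name, InputStream in, OutputStream out) {
    Thread t = new Thread(() -> {
      byte[] buf = new byte[4096];
      try {
        int n;
        while (!Thread.currentThread().isInterrupted() && (n = in.read(buf)) != -1) {
          out.write(buf, 0, n);
        }
      } catch (IOException e) {
        // Stream closed or read interrupted: stop pumping.
      }
    }, name);
    t.setDaemon(true); // change 1: a pump stuck in read() cannot keep the JVM alive
    t.start();
    return t;
  }

  /** Change 2: always interrupt the pumps on the way out, timeout or not. */
  static void finish(Thread... pumps) throws InterruptedException {
    for (Thread t : pumps) {
      t.interrupt();
    }
    for (Thread t : pumps) {
      t.join(1000); // bounded wait; daemon threads may be abandoned after this
    }
  }
}
```

With daemon pumps, a JVM whose main thread has returned can exit even if a pump is still blocked, which addresses the hang seen in the thread dump.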

An alternative to the busy loop is to use Process.waitFor() together with a separate alarm thread that interrupts the main thread to effect a timeout. The main thread can then interrupt() the io pipe threads, and they will receive an InterruptedIOException. If necessary, the main thread can also close the streams the io pipe threads are reading from in order to force them out of read(). (Oddly, the JavaDoc for Thread.interrupt() does not mention InterruptedIOException.)
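
A minimal sketch of the waitFor()-plus-alarm idea, under stated assumptions: WaitWithAlarm and waitOrTimeout are hypothetical names, -1 is used to signal a timeout, and any interrupt of the waiting thread is treated as the alarm firing. It also ignores the small race in which the alarm fires just as waitFor() returns, which would leave the caller's interrupt flag set.

```java
final class WaitWithAlarm {
  private WaitWithAlarm() {}

  /**
   * Blocks in Process.waitFor() until the process exits or timeoutMillis passes.
   * Returns the exit code, or -1 if the alarm fired first.
   */
  static int waitOrTimeout(Process proc, long timeoutMillis) {
    final Thread waiter = Thread.currentThread();
    Thread alarm = new Thread(() -> {
      try {
        Thread.sleep(timeoutMillis);
        waiter.interrupt(); // timeout: knock the waiter out of waitFor()
      } catch (InterruptedException cancelled) {
        // The process finished first and the waiter cancelled the alarm.
      }
    }, "alarm");
    alarm.setDaemon(true);
    alarm.start();
    try {
      return proc.waitFor(); // no polling, no exceptions while the process runs
    } catch (InterruptedException timedOut) {
      // Simplification: any interrupt is treated as the alarm firing.
      proc.destroy();
      return -1;
    } finally {
      alarm.interrupt(); // cancel the alarm if it is still sleeping
    }
  }
}
```

On Java 8 and later, Process.waitFor(long, TimeUnit) provides a timed wait directly, so the alarm thread is mainly of interest for the JDK versions discussed in this thread.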





[jira] Updated: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/NUTCH-151?page=all ]

Paul Baclace updated NUTCH-151:
-------------------------------

    Attachment: CommandRunner.java.patch

Here is the patch for CommandRunner (previously, I attached the actual file).




[jira] Resolved: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/NUTCH-151?page=all ]
     
Doug Cutting resolved NUTCH-151:
--------------------------------

    Fix Version: 0.8-dev
     Resolution: Fixed

I just committed this.  Thanks, Paul!



[jira] Reopened: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/NUTCH-151?page=all ]
     
Jerome Charron reopened NUTCH-151:
----------------------------------


Because the call to the barrier was removed from PumperThread, the process always times out (for instance, the unit tests of parse-ext fail): only the main thread is assumed to be finished.



[jira] Updated: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/NUTCH-151?page=all ]

Jerome Charron updated NUTCH-151:
---------------------------------

    Attachment: CommandRunner.060110.patch

Here is a very small patch that solves this issue.
If Paul is ok with this, I will commit.



[jira] Commented: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
    [ http://issues.apache.org/jira/browse/NUTCH-151?page=comments#action_12362383 ]

Paul Baclace commented on NUTCH-151:
------------------------------------

The number of threads that invoke _barrier.barrier() or .attemptBarrier() should match the count passed to the constructor of CyclicBarrier(int), so yes, enter the barrier before returning from PumperThread.run().
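
The effect of a mismatched party count can be reproduced with a small sketch. It uses java.util.concurrent.CyclicBarrier and its await() methods in place of the EDU.oswego barrier()/attemptBarrier() calls named above, and PartyCountDemo and mainPasses are hypothetical names.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class PartyCountDemo {
  private PartyCountDemo() {}

  /**
   * Creates a barrier for the given number of parties, lets arrivingPumps helper
   * threads enter it, and reports whether the calling thread gets through in time.
   */
  static boolean mainPasses(int parties, int arrivingPumps, long timeoutMillis)
      throws InterruptedException {
    final CyclicBarrier barrier = new CyclicBarrier(parties);
    for (int i = 0; i < arrivingPumps; i++) {
      Thread pump = new Thread(() -> {
        try {
          barrier.await(); // the pump enters the barrier before returning
        } catch (InterruptedException | BrokenBarrierException e) {
          // Barrier broken by the main thread's timeout: nothing left to do.
        }
      }, "pump-" + i);
      pump.setDaemon(true);
      pump.start();
    }
    try {
      barrier.await(timeoutMillis, TimeUnit.MILLISECONDS);
      return true;  // every party arrived
    } catch (TimeoutException | BrokenBarrierException e) {
      return false; // fewer arrivals than parties: the wait can only time out
    }
  }
}
```

With a count of 3 and two pumps entering the barrier, the main thread passes; with a count of 3 and no pumps entering, it can only time out, which is consistent with the parse-ext test failures reported when the issue was reopened.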

After looking at the latest Bug Parade info on avoiding resource leakage with exec(), I think (Process).destroy() should always be called on the way out, even if the timeout did not occur.
That would mean:

    } catch (TimeoutException ex) {
      _timedout = true;
      if (_destroyOnTimeout) {
        proc.destroy();
      }
...
    if (_timedout) {
      if (_destroyOnTimeout) {
        proc.destroy();
      }
    }

becomes:
    } catch (TimeoutException ex) {
      _timedout = true;
...
      if (_waitForExit) {
        proc.destroy();
      }

and field _destroyOnTimeout is removed because it is always true and/or is subsumed by _waitForExit.
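
One way to guarantee that destroy() runs on every path out is a try/finally block. The sketch below is an assumption about how that could look, not the committed change: DestroyOnExit and waitThenDestroy are hypothetical names, and it calls destroy() unconditionally instead of guarding it with _waitForExit as in the fragment above.

```java
final class DestroyOnExit {
  private DestroyOnExit() {}

  /** Waits for proc and returns its exit code; destroy() runs on every path out. */
  static int waitThenDestroy(Process proc) throws InterruptedException {
    try {
      return proc.waitFor();
    } finally {
      // Runs after a normal exit, an interrupt, or any other exception.
      proc.destroy();
    }
  }
}
```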





[jira] Resolved: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/NUTCH-151?page=all ]
     
Jerome Charron resolved NUTCH-151:
----------------------------------

    Resolution: Fixed

Changes committed: http://svn.apache.org/viewcvs.cgi?rev=368060&view=rev
Thanks Paul.

