[jira] Created: (NUTCH-143) Improper error numbers returned on exit

[jira] Created: (NUTCH-143) Improper error numbers returned on exit

JIRA jira@apache.org
Improper error numbers returned on exit
---------------------------------------

         Key: NUTCH-143
         URL: http://issues.apache.org/jira/browse/NUTCH-143
     Project: Nutch
        Type: Bug
    Versions: 0.8-dev    
    Reporter: Rod Taylor


Nutch does not follow the standard command-line exit status convention (0 for success, non-zero for failure), which makes it difficult to script around its commands.

Both of the commands below should have exited with a status greater than 0, causing the shell to take the 'Failed' branch.

bash-3.00$ /opt/nutch/bin/nutch updatedb && echo "==>Success" || echo "==>Failed"
Usage: <crawldb> <segment>
==>Success

bash-3.00$ /opt/nutch/bin/nutch readdb && echo "==>Success" || echo "==>Failed"
Usage: CrawlDbReader <crawldb> (-stats | -dump <out_dir> | -url <url>)
        <crawldb>       directory name where crawldb is located
        -stats  print overall statistics to System.out
        -dump <out_dir> dump the whole db to a text file in <out_dir>
        -url <url>      print information on <url> to System.out
==>Success


Note that the nutch shell script itself behaves as expected when run without a command:

bash-3.00$ /opt/nutch/bin/nutch  && echo "==>Success" || echo "==>Failed"
Usage: nutch COMMAND
where COMMAND is one of:
  crawl             one-step crawler for intranets
  readdb            read / dump crawl db
  readlinkdb        read / dump link db
  admin             database administration, including creation
  inject            inject new urls into the database
  generate          generate new segments to fetch
  fetch             fetch a segment's pages
  parse             parse a segment's pages
  updatedb          update crawl db from segments after fetching
  invertlinks       create a linkdb from parsed segments
  index             run the indexer on parsed segments and linkdb
  merge             merge several segment indexes
  dedup             remove duplicates from a set of segment indexes
  server            run a search server
  namenode          run the NDFS namenode
  datanode          run an NDFS datanode
  ndfs              run an NDFS admin client
  jobtracker        run the MapReduce job Tracker node
  tasktracker       run a MapReduce task Tracker node
  job               manipulate MapReduce jobs
 or
  CLASSNAME         run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
==>Failed
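
The `&& ... ||` idiom used above branches purely on the exit status of the command. As a rough sketch of the same check made from Java rather than from a shell, the hypothetical class below (the name `ExitStatusCheck` and its `run` helper are illustrative and not part of Nutch) launches a command and reports success only when the status is 0:

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class ExitStatusCheck {

  /** Runs the command, forwards its output to this process, and returns its exit status. */
  static int run(List<String> command) throws IOException, InterruptedException {
    Process process = new ProcessBuilder(command).inheritIO().start();
    return process.waitFor();
  }

  public static void main(String[] args) throws Exception {
    if (args.length == 0) {
      System.err.println("Usage: ExitStatusCheck <command> [args...]");
      System.exit(2); // a usage error is itself reported with a non-zero status
    }
    int status = run(Arrays.asList(args));
    // Same decision the shell makes for "cmd && echo Success || echo Failed".
    System.out.println(status == 0 ? "==>Success" : "==>Failed");
    System.exit(status);
  }
}
```

A caller like this can only be as accurate as the status the child process reports, which is why a usage error that exits with 0 is indistinguishable from a successful run.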


--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

[jira] Commented: (NUTCH-143) Improper error numbers returned on exit

JIRA jira@apache.org
    [ http://issues.apache.org/jira/browse/NUTCH-143?page=comments#action_12360571 ]

Stefan Groschupf commented on NUTCH-143:
----------------------------------------

Would be great in case you can provide a patch.



[jira] Commented: (NUTCH-143) Improper error numbers returned on exit

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
    [ http://issues.apache.org/jira/browse/NUTCH-143?page=comments#action_12360689 ]

Matt Kangas commented on NUTCH-143:
-----------------------------------

I'd like to see this fixed too. It would make error-checking in wrapper scripts much simpler to implement.

A fix would have to touch every .java file that has a main() method. The problem is that the JVM exits with status 0 whenever main() returns normally, since main() has a _void_ return type and cannot report a status itself.

To solve this, I recommend renaming all existing "main()" methods to "doMain()", having them return a boolean success flag, and adding the following to each affected file:

  /**
   * main() wrapper that returns proper exit status.
   * Assumes doMain() returns true on success and that the class
   * already defines LOG and LOGPREFIX.
   */
  public static void main(String[] args) {
    Runtime rt = Runtime.getRuntime();
    try {
      boolean status = doMain(args);
      rt.exit(status ? 0 : 1);  // 0 = success, 1 = failure
    }
    catch (Exception e) {
      LOG.log(Level.SEVERE, LOGPREFIX + "error, caught Exception in main()", e);
      rt.exit(1);
    }
  }
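
For illustration only, here is a self-contained sketch of how the wrapper above might sit in a single class. The class name `ExitStatusDemo`, the two-argument usage check, and the `exitStatus` helper are assumptions made for this example and are not taken from the Nutch sources; `System.err` stands in for the `LOG`/`LOGPREFIX` fields, which the fragment above expects the surrounding class to provide.

```java
public class ExitStatusDemo {

  /** The former main(): returns true on success, false on failure. */
  static boolean doMain(String[] args) throws Exception {
    if (args.length < 2) {
      // Hypothetical usage message, modelled on the updatedb output in the report.
      System.err.println("Usage: ExitStatusDemo <crawldb> <segment>");
      return false;
    }
    // The real work of the command would go here.
    return true;
  }

  /** Maps the outcome of doMain() to an exit status without terminating the JVM. */
  static int exitStatus(String[] args) {
    try {
      return doMain(args) ? 0 : 1;
    } catch (Exception e) {
      e.printStackTrace();
      return 1;
    }
  }

  /** main() wrapper: the only place that actually exits the JVM. */
  public static void main(String[] args) {
    System.exit(exitStatus(args));
  }
}
```

Keeping the status computation in a separate method from the call that exits is a design choice for this sketch: it lets the mapping be exercised from a test without killing the test runner.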




[jira] Updated: (NUTCH-143) Improper error numbers returned on exit

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/NUTCH-143?page=all ]

Rod Taylor updated NUTCH-143:
-----------------------------

    Attachment: errorhandling.patch

The attached patch returns proper exit codes from the commands reachable through the bin/nutch shortcuts, using the method suggested in the previous comment.



[jira] Closed: (NUTCH-143) Improper error numbers returned on exit

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/NUTCH-143?page=all ]

Andrzej Bialecki  closed NUTCH-143.
-----------------------------------

    Resolution: Fixed

Fixed in rev. 438670, with modifications.

> Improper error numbers returned on exit
> ---------------------------------------
>
>                 Key: NUTCH-143
>                 URL: http://issues.apache.org/jira/browse/NUTCH-143
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Rod Taylor
>         Attachments: errorhandling.patch
>
