[jira] Created: (HADOOP-1521) Hadoop mapreduce should always ship the jar file(s) specified by the user

[jira] Created: (HADOOP-1521) Hadoop mapreduce should always ship the jar file(s) specified by the user

Tim Allison (Jira)
Hadoop mapreduce should always ship the jar file(s) specified by the user
-------------------------------------------------------------------------

                 Key: HADOOP-1521
                 URL: https://issues.apache.org/jira/browse/HADOOP-1521
             Project: Hadoop
          Issue Type: Bug
          Components: mapred
            Reporter: Runping Qi


When I run a Hadoop job like:

    bin/hadoop jar myjar org.apache.hadoop.mapred.lib.aggregate.ValueAggregatorJob other_args

myjar is not shipped. The job fails because the class loader cannot find the classes contained in myjar.



--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-1521) Hadoop mapreduce should always ship the jar file(s) specified by the user

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507566 ]

Owen O'Malley commented on HADOOP-1521:
---------------------------------------

I don't understand what went wrong. Did the jar not exist? Was the representative class found in another jar? Applications can explicitly specify a jar file and that jar file WILL be used. Most applications use the method that sets the jar file based on a representative class. That can fail if the representative class is both in the "hadoop system" jars and the user's jar file.
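
To illustrate how a representative class can resolve to the wrong jar, here is a minimal sketch in plain Java. It is not Hadoop's own lookup code: it uses the JDK's `ProtectionDomain` and `CodeSource` API as a stand-in, and `ContainingJarSketch` and `findCodeLocation` are hypothetical names chosen for this example. The point is that the lookup reports where the class itself was loaded from, regardless of which jar was named on the command line.

```java
import java.net.URL;
import java.security.CodeSource;

public class ContainingJarSketch {

    /**
     * Returns the location (jar file or directory) that the given class was
     * loaded from, or null when the JVM does not record one.
     * Hypothetical helper; an analogy for a "representative class" lookup.
     */
    static String findCodeLocation(Class<?> clazz) {
        CodeSource source = clazz.getProtectionDomain().getCodeSource();
        if (source == null) {
            return null; // typical for JDK core classes such as String
        }
        URL location = source.getLocation();
        return location == null ? null : location.toString();
    }

    public static void main(String[] args) {
        // Location of this sketch class itself (a directory or a jar).
        System.out.println(findCodeLocation(ContainingJarSketch.class));
        // A core class is not associated with any user jar.
        System.out.println(findCodeLocation(String.class));
    }
}
```

If the representative class lives in a system jar such as hadoop-core.jar, a lookup of this kind would return that jar rather than myjar.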



[jira] Commented: (HADOOP-1521) Hadoop mapreduce should always ship the jar file(s) specified by the user

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507568 ]

Runping Qi commented on HADOOP-1521:
------------------------------------


The job was an Abacus (Aggregate) job.
Aggregate is in hadoop-core now, so the main class is in hadoop-core.jar.
When the main class creates the JobConf, it has no idea what the user's plugin classes will be, nor which jar will hold the user's code. Using the representative class to decide
which jar to ship is therefore flawed.
I explicitly specified a jar containing my classes, but the jar file was ignored and not shipped.
That is wrong.





[jira] Assigned: (HADOOP-1521) Hadoop mapreduce should always ship the jar file(s) specified by the user

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/HADOOP-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enis Soztutar reassigned HADOOP-1521:
-------------------------------------

    Assignee: Enis Soztutar



[jira] Commented: (HADOOP-1521) Hadoop mapreduce should always ship the jar file(s) specified by the user

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507867 ]

Enis Soztutar commented on HADOOP-1521:
---------------------------------------

Well, I had the same error. This is definitely caused by
{{JobConf theJob = new JobConf(ValueAggregatorJob.class);}} in {{ValueAggregatorJob#createValueAggregatorJob}}.
I will attach a patch to {{ValueAggregatorJob}} so that it takes the representative class name as an argument.



[jira] Updated: (HADOOP-1521) Hadoop mapreduce should always ship the jar file(s) specified by the user

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/HADOOP-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enis Soztutar updated HADOOP-1521:
----------------------------------

    Attachment: valueAggregator_v1.0.patch

Runping, could you try this patch, please?



[jira] Commented: (HADOOP-1521) Hadoop mapreduce should always ship the jar file(s) specified by the user

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507899 ]

Runping Qi commented on HADOOP-1521:
------------------------------------

Well, this patch may work for this case (I have not tried it yet, but I believe the approach will work), but I think it only addresses this specific case. Plus, it does not sound logically right
to make ValueAggregatorJob take a representative class name
(think about it: how do you explain it?). The right solution is that the user should be able to
specify any jar(s), and Hadoop should ship the jar(s) and put them on the class path in the
executing environment.
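
As a rough illustration of "put them on the class path", the sketch below builds a class loader over a list of user-supplied jar paths using the JDK's `URLClassLoader`. It is only a local, single-JVM analogy under stated assumptions: it does not ship anything to a cluster, it is not how Hadoop implements job submission, and `UserJarClassPathSketch`, `toUrls`, and `loaderFor` are hypothetical names.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class UserJarClassPathSketch {

    /** Converts user-supplied jar paths into URLs that a class loader can search. */
    static URL[] toUrls(List<String> jarPaths) {
        URL[] urls = new URL[jarPaths.size()];
        for (int i = 0; i < urls.length; i++) {
            try {
                urls[i] = Paths.get(jarPaths.get(i)).toAbsolutePath().toUri().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Bad jar path: " + jarPaths.get(i), e);
            }
        }
        return urls;
    }

    /** Builds a loader that consults the parent first, then the user's jars in order. */
    static URLClassLoader loaderFor(List<String> jarPaths, ClassLoader parent) {
        return new URLClassLoader(toUrls(jarPaths), parent);
    }

    public static void main(String[] args) throws Exception {
        // Every command-line argument is treated as a jar the job depends on.
        List<String> jars = Arrays.asList(args);
        try (URLClassLoader loader =
                 loaderFor(jars, UserJarClassPathSketch.class.getClassLoader())) {
            for (URL url : loader.getURLs()) {
                System.out.println("on class path: " + url);
            }
        }
    }
}
```

With this shape, the set of jars comes from what the user names explicitly, not from where some representative class happens to live.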




[jira] Commented: (HADOOP-1521) Hadoop mapreduce should always ship the jar file(s) specified by the user

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507968 ]

Enis Soztutar commented on HADOOP-1521:
---------------------------------------

>think about how do you explain it
The javadoc of JobConf explains this as:
{code}
  /**
   * Construct a map/reduce job configuration.
   * @param exampleClass a class whose containing jar is used as the job's jar.
   */
  public JobConf(Class exampleClass) {
    initialize();
    setJarByClass(exampleClass);
  }
{code}


>The right solution is that the user should be able to specify any jar(s) and Hadoop should ship the jar(s) and put them on the class path in the
>executing environment.

We could set a system property from {{JobRunner}} to the jar file argument, and then initialize {{JobConf}} instances with this jar in the empty constructor. However, I am not sure if this is what we want. Are there any other votes on this issue?
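
A minimal sketch of that system-property idea, assuming stand-in classes in place of the real launcher and the real {{JobConf}}: the property name `example.job.jar` and the names `JarPropertySketch`, `recordUserJar`, and `SketchJobConf` are all hypothetical and are not part of Hadoop.

```java
public class JarPropertySketch {

    // Hypothetical property name, chosen only for this illustration.
    static final String JOB_JAR_PROPERTY = "example.job.jar";

    /** Stand-in for the launcher: remembers the jar named on the command line. */
    static void recordUserJar(String jarPath) {
        System.setProperty(JOB_JAR_PROPERTY, jarPath);
    }

    /** Stand-in for a job configuration whose empty constructor picks up that jar. */
    static final class SketchJobConf {
        private String jar;

        SketchJobConf() {
            // Default to the jar recorded by the launcher, if any.
            this.jar = System.getProperty(JOB_JAR_PROPERTY);
        }

        /** An explicit setting still overrides the default. */
        void setJar(String jar) {
            this.jar = jar;
        }

        String getJar() {
            return jar;
        }
    }

    public static void main(String[] args) {
        recordUserJar("myjar.jar");
        SketchJobConf conf = new SketchJobConf();
        System.out.println(conf.getJar()); // prints "myjar.jar"
    }
}
```

One trade-off of this shape is that a system property is JVM-wide state, so every configuration created in the same process inherits the same default jar unless it is overridden.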




[jira] Commented: (HADOOP-1521) Hadoop mapreduce should always ship the jar file(s) specified by the user

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511104 ]

Enis Soztutar commented on HADOOP-1521:
---------------------------------------

Currently we ship only one job file with the application, and this job file is located with the help of a class contained in it. Unless we decide to take the above approach (which I personally don't like), I think it is the aggregator's responsibility to obtain the client class (or jar file). In order to proceed, I suggest we open a new issue, apply the patch there, and resolve this issue as Won't Fix. Any thoughts?




[jira] Commented: (HADOOP-1521) Hadoop mapreduce should always ship the jar file(s) specified by the user

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/HADOOP-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511255 ]

Runping Qi commented on HADOOP-1521:
------------------------------------


The patch for HADOOP-1547 makes Aggregate work.
Use this issue for the general feature of allowing the user to specify the jar files
a job depends on.


