Specifying external jars in the classpath for Hadoop

Specifying external jars in the classpath for Hadoop

alakshman
Hi

I have a map/reduce job that uses external jar files. How do I specify those
jars in the classpath when submitting the job with ./hadoop jar ....? Suppose
my map job relies on an API in some external jar; how do I pass that jar file
as part of my job submission?

Thanks
A
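
For context, a minimal sketch of the submission command the question refers to; the jar name, driver class, and HDFS paths below are hypothetical:

    # Submit a job packaged in myjob.jar; the driver class and the
    # input/output paths are placeholders for this example.
    bin/hadoop jar myjob.jar com.example.MyJob /user/alice/input /user/alice/output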

Re: Specifying external jars in the classpath for Hadoop

Eyal Oren
On 08/13/07 16:49 -0700, Phantom wrote:
>Hi
>
>I have a map/reduce job that uses external jar files. How do I specify those
>jars in the classpath when submitting the job with ./hadoop jar ....? Suppose
>my map job relies on an API in some external jar; how do I pass that jar file
>as part of my job submission?
As far as I understand (that's what we do, anyway), you have to submit one
jar that contains all your dependencies (except for dependencies on the
hadoop libs), including external jars. The easiest approach is probably to
use maven/ant to build such a "big" jar externally, with all the dependency
jars unpacked and added into it, and then submit that to hadoop.

  -eyal
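
As an illustration of this "big" jar approach, a rough sketch using the plain jar tool instead of ant/maven; every file, directory, and class name here is hypothetical:

    # Unpack the third-party jars and your own classes into one staging
    # directory, then repackage everything as a single job jar.
    mkdir -p build
    (cd build && jar xf ../dep1.jar && jar xf ../dep2.jar)
    cp -r classes/* build/          # compiled job classes
    jar cf myjob.jar -C build .
    bin/hadoop jar myjob.jar com.example.MyJob input output

An ant or maven build can produce the same layout automatically (for example with ant's unjar and jar tasks), which is what Eyal is suggesting.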

Re: Specifying external jars in the classpath for Hadoop

Dennis Kubes-2
HADOOP-1622 fixes this to allow multiple resources, including jars, to
be submitted for a single mapreduce job.  There is currently a patch
that "works" but still needs a little fixing.  I should be able to get
it finished in the next couple of days.

Dennis Kubes

Eyal Oren wrote:

> On 08/13/07 16:49 -0700, Phantom wrote:
>> Hi
>>
>> I have a map/reduce job that uses external jar files. How do I specify
>> those jars in the classpath when submitting the job with ./hadoop jar ....?
>> Suppose my map job relies on an API in some external jar; how do I pass
>> that jar file as part of my job submission?
> As far as I understand (that's what we do, anyway), you have to submit one
> jar that contains all your dependencies (except for dependencies on the
> hadoop libs), including external jars. The easiest approach is probably to
> use maven/ant to build such a "big" jar externally, with all the dependency
> jars unpacked and added into it, and then submit that to hadoop.
>
>  -eyal
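
For readers coming to this thread later: Hadoop eventually grew a generic -libjars option for exactly this use case (whether or not it came from this particular patch). A hedged sketch, assuming the driver class goes through ToolRunner so that generic options are parsed; jar and class names are hypothetical:

    # Ship extra jars with the job at submission time. The driver class must
    # use ToolRunner/GenericOptionsParser for -libjars to be recognized.
    bin/hadoop jar myjob.jar com.example.MyDriver \
        -libjars /local/path/dep1.jar,/local/path/dep2.jar \
        input output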

RE: Specifying external jars in the classpath for Hadoop

Joydeep Sen Sarma
In reply to this post by Eyal Oren
I found that depositing the required jars into the lib directory works just
great (all those jars are prepended to the classpath by the hadoop script).

Are there any flaws in doing it this way?
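
A sketch of this cluster-side approach, assuming a tarball install whose path and jar name are hypothetical; the jar has to land in the lib directory on every node, and running daemons may need a restart to pick it up:

    # Copy the dependency jar into the Hadoop lib directory on each worker
    # node listed in conf/slaves (run from the Hadoop install directory).
    for host in $(cat conf/slaves); do
        scp mylib.jar "$host:/usr/local/hadoop/lib/"
    done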

-----Original Message-----
From: Eyal Oren [mailto:[hidden email]]
Sent: Tuesday, August 14, 2007 12:45 AM
To: [hidden email]
Subject: Re: Specifying external jars in the classpath for Hadoop

On 08/13/07 16:49 -0700, Phantom wrote:
>Hi
>
>I have a map/reduce job that uses external jar files. How do I specify those
>jars in the classpath when submitting the job with ./hadoop jar ....? Suppose
>my map job relies on an API in some external jar; how do I pass that jar file
>as part of my job submission?
As far as I understand (that's what we do, anyway), you have to submit one
jar that contains all your dependencies (except for dependencies on the
hadoop libs), including external jars. The easiest approach is probably to
use maven/ant to build such a "big" jar externally, with all the dependency
jars unpacked and added into it, and then submit that to hadoop.

  -eyal

Re: Specifying external jars in the classpath for Hadoop

Dennis Kubes-2
No, other than that you have to have the jars on all machines, and it only
supports jar files.

Dennis Kubes

Joydeep Sen Sarma wrote:

> I found that depositing the required jars into the lib directory works just
> great (all those jars are prepended to the classpath by the hadoop script).
>
> Are there any flaws in doing it this way?
>
> -----Original Message-----
> From: Eyal Oren [mailto:[hidden email]]
> Sent: Tuesday, August 14, 2007 12:45 AM
> To: [hidden email]
> Subject: Re: Specifying external jars in the classpath for Hadoop
>
> On 08/13/07 16:49 -0700, Phantom wrote:
>> Hi
>>
>> I have a map/reduce job that uses external jar files. How do I specify
>> those jars in the classpath when submitting the job with ./hadoop jar ....?
>> Suppose my map job relies on an API in some external jar; how do I pass
>> that jar file as part of my job submission?
> As far as I understand (that's what we do, anyway), you have to submit one
> jar that contains all your dependencies (except for dependencies on the
> hadoop libs), including external jars. The easiest approach is probably to
> use maven/ant to build such a "big" jar externally, with all the dependency
> jars unpacked and added into it, and then submit that to hadoop.
>
>   -eyal

Re: Specifying external jars in the classpath for Hadoop

Ted Dunning-3
In reply to this post by Joydeep Sen Sarma

That was what Doug recommended last time we talked.


On 8/14/07 8:33 AM, "Joydeep Sen Sarma" <[hidden email]> wrote:

> I found that depositing the required jars into the lib directory works just
> great (all those jars are prepended to the classpath by the hadoop script).
>
> Are there any flaws in doing it this way?
>


Re: Specifying external jars in the classpath for Hadoop

Doug Cutting
In reply to this post by Eyal Oren
Eyal Oren wrote:
> As far as I understand (that's what we do, anyway), you have to submit
> one jar that contains all your dependencies (except for dependencies on
> the hadoop libs), including external jars. The easiest approach is probably
> to use maven/ant to build such a "big" jar externally, with all the
> dependency jars unpacked and added into it, and then submit that to hadoop.

Jar files can also be placed in a 'lib/' directory of a job jar file.
Such jars will be included in the classpath of task JVMs.

Doug
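
A minimal sketch of the layout Doug describes; all file and class names here are hypothetical:

    # Bundle third-party jars under lib/ inside the job jar instead of
    # unpacking them. Jars under lib/ are added to the task classpath.
    mkdir -p build/lib
    cp dep1.jar dep2.jar build/lib/
    cp -r classes/* build/              # your compiled job classes
    jar cf myjob.jar -C build .
    bin/hadoop jar myjob.jar com.example.MyJob input output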

RE: Specifying external jars in the classpath for Hadoop

Avinash Lakshman-2
In reply to this post by Ted Dunning-3
For now I have resorted to the hack of unjarring the third-party jars and
packaging them all into one big jar.

Thanks
A

-----Original Message-----
From: Ted Dunning [mailto:[hidden email]]
Sent: Tuesday, August 14, 2007 8:43 AM
To: [hidden email]
Subject: Re: Specifying external jars in the classpath for Hadoop


That was what Doug recommended last time we talked.


On 8/14/07 8:33 AM, "Joydeep Sen Sarma" <[hidden email]> wrote:

> I found that depositing the required jars into the lib directory works just
> great (all those jars are prepended to the classpath by the hadoop script).
>
> Are there any flaws in doing it this way?
>