newbie: Trying to get an overview of the code..

Peter Thygesen
Maybe this question should be moved to the developer list, but it is
not that I currently intend to contribute to the project. I just wanted
to see if I could get the code to compile.

This is what I did:

I opened the Hadoop project in Eclipse using the "Java project from
existing ant buildfile" wizard, pointing it at the build.xml file.

Fine... but it resulted in thousands of errors and warnings.

I even tried checking out the code from svn and did approximately the
same as above, with the same result.

What am I missing, or doing wrong?

Are there some "secret" JARs I need to include?

Perhaps I'm just plain stupid or something.. ;) ..and yes, I'm also
fairly new to Java (though I have several years of C# experience).

Kind regards,

Peter

 


Re: newbie: Trying to get an overview of the code..

Albert Strasheim
Import the Hadoop project as an existing Java project, without using the Ant buildfile wizard.

Then stick the attached .classpath in your Hadoop directory and
refresh the project in Eclipse.

Maybe someone wants to check some sensible Eclipse and NetBeans
project files into the SVN repository...

Hope this helps.

Cheers,

Albert

On 10/15/07, Peter Thygesen <[hidden email]> wrote:
> Maybe this question should be move to the developer list, but .. it is
> not that I currently intent to contribute to the project.. I just wanted
> to see if I could get the code to compile.

Re: newbie: Trying to get an overview of the code..

Mark Butler
Peter,

You are in luck. I just posted some guidelines to the Wiki here

http://wiki.apache.org/lucene-hadoop/EclipseEnvironment

on how to get around that; see the section on configuring Eclipse to build Hadoop.

Please have a go at following these instructions, and let me know whether
they work.

kind regards

Mark

On Mon, 2007-10-15 at 15:53 +0200, Peter Thygesen wrote:

> Maybe this question should be move to the developer list, but .. it is
> not that I currently intent to contribute to the project.. I just wanted
> to see if I could get the code to compile.
>
>  
>
> This is what I did:
>
> I opened the hadoop project in eclipse by using the "Java project from
> existing ant buildfile" wizard. Using the build.xml file.
>
> Fine... but I resulted in 1000's of errors and warnings.
>
> I even tried to check-out the code from svn, and did approximately the
> same as above. And having the same result.
>
>  
>
> What am I missing, or doing wrong?
>
> Are there some "secret" JAR's I need to include.
>
>  
>
> Perhaps I'm just plain stupid or something.. ;) ..and yes I'm also
> fairly new to java. (having several years of c# experience though..)
>
> Kind regards,
>
> Peter

Re: newbie: Trying to get an overview of the code..

Mark Butler
On Mon, 2007-10-15 at 16:20 +0200, Albert Strasheim wrote:
> Import Hadoop project as an existing Java project without the Ant build thing.
>
> Then stick the attached .classpath in your Hadoop directory and
> refresh the project in Eclipse.
>
> Maybe someone wants to check some sensible Eclipse and NetBeans
> project files into the SVN repository...

This is a good idea. I almost have such a project file for Eclipse
*except* for one problem:

hadoop/src/java/org/apache/hadoop/record/compiler/ant

requires ant.jar. This is not in the distribution, so we don't know
where it is. Any suggestions on how to overcome this, apart from
including ant.jar?

Mark




Re: newbie: Trying to get an overview of the code..

Enis Soztutar
You should add ant.jar to the classpath. ant.jar was intentionally
removed from the distribution; see the relevant JIRA issue (I do not
remember the URL) for further details.
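
A quick way to confirm whether ant.jar is really visible is a small check like the one below. This is only a sketch: ClasspathCheck and isOnClasspath are made-up names, not part of Hadoop, and it assumes that org.apache.tools.ant.Task (the base class for Ant tasks, shipped in ant.jar) is the dependency the record compiler's Ant code is missing.

```java
/** Hypothetical helper (not part of Hadoop): reports whether a class can be loaded. */
class ClasspathCheck {

    /** Returns true if the named class is visible on the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: this Ant base class is what the code under record/compiler/ant needs
        String antTask = "org.apache.tools.ant.Task";
        System.out.println(antTask + " on classpath: " + isOnClasspath(antTask));
    }
}
```

Run it with the same classpath entries the Eclipse project uses; if it reports false, ant.jar still has to be added to the project's build path.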

Mark Butler wrote:

> On Mon, 2007-10-15 at 16:20 +0200, Albert Strasheim wrote:
>  
>> Import Hadoop project as an existing Java project without the Ant build thing.
>>
>> Then stick the attached .classpath in your Hadoop directory and
>> refresh the project in Eclipse.
>>
>> Maybe someone wants to check some sensible Eclipse and NetBeans
>> project files into the SVN repository...
>>    
>
> This is a good idea. I almost have such a project file for Eclipse
> *except* for one problem:
>
> hadoop/src/java/org/apache/record/compiler/ant
>
> requires ant.jar. This is not in the distribution, so we don't know
> where it is. Any suggestions on how to overcome this, apart from
> including ant.jar?
>
> Mark

RE: newbie: Trying to get an overview of the code..

Peter Thygesen
I can't get the Subversive plug-in to work. It will not access
http://svn.apache.org/repos/asf/lucene/hadoop/; it keeps saying "an error
occurred while accessing the repository entry".

However, using plain old svn works fine.

Besides that, I found your description "Configuring Eclipse to build
Hadoop" rather confusing, but I finally got it to work.

Thank you.

Kind regards.
Peter


-----Original Message-----
From: Mark Butler [mailto:[hidden email]]
Sent: 15 October 2007 16:25
To: [hidden email]
Subject: Re: newbie: Trying to get an overview of the code..

Peter,

You are in luck. I just posted some guidelines to the Wiki here

http://wiki.apache.org/lucene-hadoop/EclipseEnvironment

on how to get around that - see configuring Eclipse to build Hadoop.

Please have a go at following these instructions, and let me know if
they work or not?

kind regards

Mark

On Mon, 2007-10-15 at 15:53 +0200, Peter Thygesen wrote:
> Maybe this question should be move to the developer list, but .. it is
> not that I currently intent to contribute to the project.. I just wanted
> to see if I could get the code to compile.
>
>  
>
> This is what I did:
>
> I opened the hadoop project in eclipse by using the "Java project from
> existing ant buildfile" wizard. Using the build.xml file.
>
> Fine... but I resulted in 1000's of errors and warnings.
>
> I even tried to check-out the code from svn, and did approximately the
> same as above. And having the same result.
>
>  
>
> What am I missing, or doing wrong?
>
> Are there some "secret" JAR's I need to include.
>
>  
>
> Perhaps I'm just plain stupid or something.. ;) ..and yes I'm also
> fairly new to java. (having several years of c# experience though..)
>
> Kind regards,
>
> Peter

MapReduce Job on XML input

Peter Thygesen
I would like to run some MapReduce jobs on some XML files I have (approx.
100,000 compressed files).
The XML files are not that big, about 1 MB compressed, each containing
about 1000 records.

Do I have to write my own InputSplitter? Should I use
MultiFileInputFormat or StreamInputFormat? Can I use the
StreamXmlRecordReader, and how? By sub-classing some input class?

The tutorials and examples I've read are all very straightforward,
reading simple text files, but I miss a more complex example, especially
one that reads XML files ;)

thx.
Peter



Re: MapReduce Job on XML input

Ted Dunning

That isn't all that many files.  At 1MB, you shouldn't be seeing much
performance hit due to reading many files.

You will need a special input format, but it can be very simple.  Just extend
something like TextInputFormat, replace the record reader, and report the
file as unsplittable.
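
The record-extraction part of such a reader could look roughly like the sketch below, written in plain Java with no Hadoop dependency. It assumes records are delimited by literal begin and end tags (`<record>` and `</record>` are placeholders, since the thread does not show the actual schema) and that a whole file fits in memory, which is reasonable if each file of about 1 MB is treated as unsplittable. XmlRecordScanner and readRecords are hypothetical names. In a real job this logic would live inside the record reader returned by the input format subclass, with that subclass overriding `isSplitable` to return false.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: pulls begin/end-delimited records out of one XML stream. */
class XmlRecordScanner {

    /**
     * Reads the whole stream and returns every substring that starts with
     * beginTag and ends with endTag, in document order.
     */
    static List<String> readRecords(InputStream in, String beginTag, String endTag) {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            char[] buf = new char[8192];
            int n;
            while ((n = reader.read(buf)) != -1) {
                text.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        List<String> records = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = text.indexOf(beginTag, from);
            if (start < 0) {
                break; // no further records
            }
            int end = text.indexOf(endTag, start + beginTag.length());
            if (end < 0) {
                break; // unterminated record: ignore the tail
            }
            end += endTag.length();
            records.add(text.substring(start, end));
            from = end;
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<records><record><id>1</id></record>"
                   + "<record><id>2</id></record></records>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        for (String record : readRecords(in, "<record>", "</record>")) {
            System.out.println(record);
        }
    }
}
```

If the files are gzip-compressed (an assumption; the thread does not name the compression format), the input stream can be wrapped in java.util.zip.GZIPInputStream before it is passed to readRecords. Each returned string would then be handed to the mapper as one value.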


On 11/26/07 8:49 AM, "Peter Thygesen" <[hidden email]> wrote:

> I would like to run some mapReduce jobs on some xml files I got (aprox.
> 100000 compressed files).
> The XML files are not that big about 1 Mb compressed, each containing
> about 1000 records.
>
> Do I have to write my own InputSplitter? Should I use
> MultiFileInputFormat or StreamInputFormat? Can I use the
> StreamXmlRecordReader, and how? By sub-classing some input class?
>
> The tutorials and examples I've read are all very straight forward
> reading simple text files, but I miss a more complex example, especially
> one that reads xml files ;)
>
> thx.
> Peter