Mapper Out of Memory


Mapper Out of Memory

Rui Shi

Hi,

I run Hadoop on a BSD 4 cluster, and each map task processes a gzip file (about 10 MB). Some tasks finished, but many of them failed due to running out of Java heap space. I got the following syslog output:

2007-12-06 12:16:50,277 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2007-12-06 12:16:53,128 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 256
2007-12-06 12:16:53,638 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2007-12-06 12:18:19,079 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.lang.OutOfMemoryError: Java heap space
Does anyone know what the reason is and how we should avoid it?

Thanks,

Rui






RE: Mapper Out of Memory

Joydeep Sen Sarma
You can control the heap size of the task JVMs using the 'mapred.child.java.opts' option.

Check your program logic, though. In my experience, running out of heap
space in a map task usually suggests some runaway logic somewhere.
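
To make that concrete: a minimal sketch of setting the option per job with the classic JobConf API (the class name and the -Xmx value are only examples):

import org.apache.hadoop.mapred.JobConf;

public class ChildHeapSettings {
  // The option string is passed verbatim to each map/reduce child JVM.
  public static JobConf withLargerChildHeap(JobConf conf) {
    conf.set("mapred.child.java.opts", "-Xmx512m");
    return conf;
  }
}

The same property can also be set cluster-wide in the site configuration.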


Re: Mapper Out of Memory

Peter W.
In reply to this post by Rui Shi
Hello,

There is a setting in hadoop-0.15.0/bin/rcc

default:

JAVA_HEAP_MAX=-Xmx1000m

For 2 GB of memory you can set this to about:

JAVA_HEAP_MAX=-Xmx1700m

2048m is the highest allowed setting on a Mac, Linux, non-Solaris Unix,
or Windows box.

Peter W.


Re: Mapper Out of Memory

Rui Shi
In reply to this post by Rui Shi

Hi,

It is hard to believe that you need to enlarge the heap size given that the input is only 10 MB. In particular, you don't load all of the input at the same time. As for the program logic, there is not much fancy stuff, mostly cutting fields and sorting, so GC should be able to handle it...

Thanks,

Rui



Re: Mapper Out of Memory

Doug Cutting
Rui Shi wrote:
> It is hard to believe that you need to enlarge heap size given the input size is only 10MB. In particular, you don't load all input at the same time. As for the program logic, no much fancy stuff, mostly cut and sorting. So GC should be able to handle...

Out-of-memory exceptions can also be caused by having too many files
open at once.  What does 'ulimit -n' show?

You presented an excerpt from a jobtracker log, right?  What do the
tasktracker logs show?

Can you monitor a node while it is running to see whether the JVM's heap
is growing, or whether the number of open files (lsof -p) is large?

Also, can you please provide more details about your application? I.e.,
what is your InputFormat, map function, etc.?

Doug

Re: Mapper Out of Memory

Ted Dunning
In reply to this post by Rui Shi

There is a bug in GZIPInputStream on Java 1.5 that can cause an
out-of-memory error on malformed gzip input.

It is possible that you are trying to treat this input as a splittable file
which is causing your maps to be fed from chunks of the gzip file.  Those
chunks would be ill-formed, of course, and it is possible that this is
causing an out-of-memory condition.

I am just speculating, however.  To confirm or discard this possibility, you
should examine the stack traces for the maps that are falling over.
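
As an illustration, one way to rule this out is an input format that refuses to split its files, so each gzip file is decompressed start to finish by a single map. A minimal sketch (the class name is made up, and whether TextInputFormat already declines to split compressed input depends on the release):

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.TextInputFormat;

public class UnsplittableTextInputFormat extends TextInputFormat {
  // Never split input files, so no map ever sees a partial gzip stream.
  protected boolean isSplitable(FileSystem fs, Path file) {
    return false;
  }
}

You would select it with conf.setInputFormat(UnsplittableTextInputFormat.class).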


Re: Mapper Out of Memory

Michael Bieniosek
The JDK also provides "jmap -histo PID", which will give you some crude information about where the memory is going.

-Michael


Re: Mapper Out of Memory

Rui Shi
In reply to this post by Rui Shi
Hi,

> Out-of-memory exceptions can also be caused by having too many files
> open at once.  What does 'ulimit -n' show?

29491

> You presented an excerpt from a jobtracker log, right?  What do the
> tasktracker logs show?

I saw some warnings in the tasktracker log:

2007-12-06 12:23:41,604 WARN org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50050, call progress(task_200712031900_0014_m_000058_0, 9.126612E-12, hdfs:///usr/ruish/400.gz:0+9528361, MAP, org.apache.hadoop.mapred.Counters@11c135c) from: output error
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:125)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:294)
        at org.apache.hadoop.ipc.SocketChannelOutputStream.flushBuffer(SocketChannelOutputStream.java:108)
        at org.apache.hadoop.ipc.SocketChannelOutputStream.write(SocketChannelOutputStream.java:89)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at java.io.DataOutputStream.flush(DataOutputStream.java:106)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:585)

And in the datanode logs:

2007-12-06 14:42:20,831 ERROR org.apache.hadoop.dfs.DataNode: DataXceiver: java.io.IOException: Block blk_-8176614602638949879 is valid, and cannot be written to.
        at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:515)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:822)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:727)
        at java.lang.Thread.run(Thread.java:595)

> Also, can you please provide more details about your application? I.e.,
> what is your InputFormat, map function, etc.?

Very simple stuff: projecting certain fields as the key and sorting. The input is gzipped files in which each line has some fields separated by a delimiter.
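
A rough sketch of the kind of mapper being described, against the classic mapred API of that era (the delimiter, field choice, and class name are made up):

import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class FieldProjectionMapper extends MapReduceBase implements Mapper {
  // Each input line holds delimiter-separated fields; emit one field as the
  // key and the whole line as the value, so the framework sorts on that field.
  public void map(WritableComparable key, Writable value,
                  OutputCollector output, Reporter reporter) throws IOException {
    String line = value.toString();
    String[] fields = line.split("\t", -1);
    output.collect(new Text(fields[0]), new Text(line));
  }
}

A mapper like this holds no state across records, so most of the task's memory use comes from the framework's own buffers rather than from user code.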







Re: Mapper Out of Memory

Owen O'Malley
In reply to this post by Rui Shi

On Dec 6, 2007, at 12:30 PM, Rui Shi wrote:

> Does anyone know what is the reason and how should we avoid it?

Java 6 gives a little better information in the form of a stack trace.
My patch HADOOP-2367 will also help after it is finished and committed.
It will allow you to get CPU and heap summaries from representative tasks.

-- Owen

Re: Mapper Out of Memory

Rui Shi
In reply to this post by Rui Shi
Hi,

I did some experiments on a single Linux machine. I generated some data using the 'random writer' example and used the 'sort' example from hadoop-examples to sort it. I still got some out-of-memory exceptions, as follows:

java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Unknown Source)
        at java.io.ByteArrayOutputStream.write(Unknown Source)
        at java.io.DataOutputStream.write(Unknown Source)
        at org.apache.hadoop.io.BytesWritable.write(BytesWritable.java:137)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:340)
        at org.apache.hadoop.mapred.lib.IdentityMapper.map(IdentityMapper.java:39)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:189)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1777)
Any ideas?

Thanks,

Rui


RE: Mapper Out of Memory

Devaraj Das
Was the value of mapred.child.java.opts set to something like 512 MB? What is
io.sort.mb set to?


Re: Mapper Out of Memory

Rui Shi
In reply to this post by Rui Shi
Hi,

I didn't change those values; I'm basically using the defaults.

Thanks,

Rui


RE: Mapper Out of Memory

Devaraj Das
Rui, please set mapred.child.java.opts to 512m (i.e., -Xmx512m). That should
take care of the OOM problem.
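
To make the suggestion concrete, a minimal sketch with JobConf; the io.sort.mb buffer (100 MB by default) lives inside the child JVM's heap, so it has to fit comfortably under whatever -Xmx the child gets (class and method names are only illustrative):

import org.apache.hadoop.mapred.JobConf;

public class SortJobMemorySettings {
  public static void apply(JobConf conf) {
    // Heap for each map/reduce child JVM, as suggested above.
    conf.set("mapred.child.java.opts", "-Xmx512m");
    // Map-side sort buffer in MB; keep it well below the child heap.
    conf.setInt("io.sort.mb", 100);
  }
}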
