Node Manager crashes with OutOfMemory error

Node Manager crashes with OutOfMemory error

Rahul Chhiber

Hi All,

 

I am running a Hadoop cluster with the following configuration:

 

Master (Resource Manager) - 16GB RAM + 8 vCPU

Slave 1 (Node manager 1) - 8GB RAM + 4 vCPU

Slave 2 (Node manager 2) - 8GB RAM + 4 vCPU

 

Memory allocated for container use per slave, i.e. yarn.nodemanager.resource.memory-mb, is 6144 MB.
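
For reference, this is set in yarn-site.xml on each slave along these lines (a minimal sketch; other properties omitted):

    <!-- yarn-site.xml: total memory the NodeManager may allocate to containers on this node -->
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>6144</value>
    </property>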

 

When I launch an application, container allocation and execution succeed, but after 1 or 2 jobs have run on the cluster, one or both of the NodeManager daemons crash with the following error in the logs:

 

java.lang.OutOfMemoryError: Java heap space

        at java.util.Arrays.copyOf(Arrays.java:2367)

        at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)

        at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)

        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)

        at java.lang.StringBuffer.append(StringBuffer.java:237)

        at org.apache.hadoop.util.Shell$1.run(Shell.java:511)

2016-07-22 06:54:54,326 INFO org.apache.hadoop.util.ExitUtil: Halt with status -1 Message: HaltException

 

We have allocated 1 GB of heap space to each NodeManager daemon. On average about 3 containers run on each slave node. We have been running Hadoop clusters for a while now, but had not faced this issue until recently. What are the memory sizing recommendations for the NodeManager? As I understand it, the memory used by the containers or by the ApplicationMaster should have no bearing on the NodeManager's memory consumption, since they all run in separate JVMs. What could be the possible reasons for high memory consumption in the NodeManager?

 

NOTE: I tried allocating more heap memory to the NodeManager (2 GB), but the issue still occurs intermittently. Containers getting killed for excess memory consumption is understandable, but if the NodeManager itself crashes in this manner, it is a serious scalability problem.
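
For clarity, the NodeManager heap is set via yarn-env.sh on each slave, roughly as below (Hadoop 2.x variable name; value in MB):

    # yarn-env.sh: heap for the NodeManager daemon itself, in MB
    # (separate from yarn.nodemanager.resource.memory-mb, which only caps container memory)
    export YARN_NODEMANAGER_HEAPSIZE=1024   # also tried 2048; the OOM still occurs intermittently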

 

Thanks,

Rahul Chhiber

 


Re: Node Manager crashes with OutOfMemory error

Ravi Prakash-3
Hi Rahul!

Which version of Hadoop are you using? Which non-default configuration values have you set?
 
You can add -XX:+HeapDumpOnOutOfMemoryError to the JVM options when starting up your NodeManagers, and then open the resulting heap dump in Eclipse MAT, jvisualvm, or YourKit to see where the memory is being used. There is likely some configuration value you have set far beyond what you need. We regularly run NMs with a 1000 MB heap and it works fine.
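
Something along these lines in yarn-env.sh should do it (the dump path is just an example):

    # yarn-env.sh: dump the NodeManager heap on OutOfMemoryError for offline analysis
    export YARN_NODEMANAGER_OPTS="$YARN_NODEMANAGER_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/nm-heapdump.hprof"

Restart the NodeManager, and after the next crash open the .hprof file in one of those tools to see which objects dominate the heap.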

HTH
Ravi
