ACCEPTED: waiting for AM container to be allocated, launched and register with RM

ACCEPTED: waiting for AM container to be allocated, launched and register with RM

rammohan ganapavarapu
Hi,

When I submit an MR job, I see this message in the AM UI, but the job never finishes. What am I missing?

Thanks,
Ram

Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

Sunil Govind
Hi

There could be many reasons for this. I am also not sure which scheduler you are using; please share more details, such as the RM log.

A few possible causes:
 - There are not enough resources in the cluster.
 - If you are using the Capacity Scheduler, the queue capacity may be maxed out.
 - Similarly, if max-am-resource-percent is exceeded at the queue level, the AM container may not be launched.

You could check the RM log to find out whether the AM container was launched.

Thanks
Sunil
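
To make the last cause concrete, the arithmetic behind a per-queue AM limit can be sketched in Python. This is only a rough illustration, not the scheduler's actual code: the function name and the numbers are hypothetical, and a real scheduler applies further rules (user limits, for example) that are ignored here.

```python
def max_concurrent_ams(queue_capacity_mb: int,
                       max_am_resource_percent: float,
                       am_container_mb: int) -> int:
    """Estimate how many AM containers fit in a queue's AM budget.

    queue_capacity_mb       -- memory the queue may use (hypothetical value)
    max_am_resource_percent -- fraction of the queue reserved for AMs
    am_container_mb         -- size of one AM container after any rounding
    """
    am_budget_mb = int(queue_capacity_mb * max_am_resource_percent)
    return am_budget_mb // am_container_mb


# Hypothetical numbers: a 60 GB queue and 10 GB AM containers.
print(max_concurrent_ams(61440, 0.5, 10240))
print(max_concurrent_ams(61440, 0.1, 10240))
```

When the estimate comes out as zero, the AM budget is smaller than a single AM container, which is the situation the third bullet describes.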

Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

rammohan ganapavarapu
Sunil,

Thank you for your input. Below are my server metrics for the RM. I have also attached a screenshot of the RM UI showing the Capacity Scheduler resources. How else can I find the cause?

{
      "name": "Hadoop:service=ResourceManager,name=QueueMetrics,q0=root",
      "modelerType": "QueueMetrics,q0=root",
      "tag.Queue": "root",
      "tag.Context": "yarn",
      "tag.Hostname": "hadoop001",
      "running_0": 0,
      "running_60": 0,
      "running_300": 0,
      "running_1440": 0,
      "AppsSubmitted": 1,
      "AppsRunning": 0,
      "AppsPending": 0,
      "AppsCompleted": 0,
      "AppsKilled": 0,
      "AppsFailed": 1,
      "AllocatedMB": 0,
      "AllocatedVCores": 0,
      "AllocatedContainers": 0,
      "AggregateContainersAllocated": 2,
      "AggregateContainersReleased": 2,
      "AvailableMB": 64512,
      "AvailableVCores": 24,
      "PendingMB": 0,
      "PendingVCores": 0,
      "PendingContainers": 0,
      "ReservedMB": 0,
      "ReservedVCores": 0,
      "ReservedContainers": 0,
      "ActiveUsers": 0,
      "ActiveApplications": 0
    },
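
One way to read a metrics record like the one above is sketched below in Python. The helper name and the wording of the notes are hypothetical; only the field names come from the posted record, and the interpretation is a heuristic rather than an official diagnosis.

```python
def summarize_queue_metrics(metrics: dict) -> list:
    """Return plain-language notes about a QueueMetrics record (heuristic only)."""
    notes = []
    if metrics.get("AppsFailed", 0) > 0:
        notes.append("at least one application has failed")
    if metrics.get("AppsPending", 0) == 0 and metrics.get("PendingContainers", 0) == 0:
        notes.append("nothing is waiting on the scheduler")
    if metrics.get("AllocatedMB", 0) == 0 and metrics.get("AvailableMB", 0) > 0:
        notes.append("memory is available and unallocated")
    return notes


# Subset of the fields posted above.
sample = {
    "AppsSubmitted": 1,
    "AppsPending": 0,
    "AppsFailed": 1,
    "AllocatedMB": 0,
    "AvailableMB": 64512,
    "PendingContainers": 0,
}
for note in summarize_queue_metrics(sample):
    print(note)
```

Read this way, the record does not look like a cluster that is short of memory, which suggests looking at the application's own logs next.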

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Screen Shot 2016-08-18 at 8.11.43 PM.png (210K) Download Attachment

Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

tkg_cangkul
Maybe you can check the logs from the RM UI, which runs on port 8088. Open it in your browser, choose your job ID, and then check the logs.
 
Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

tkg_cangkul
I think that's because you don't have enough resources. You can tune your cluster configuration to make the most of your resources.

On 19/08/16 11:03, rammohan ganapavarapu wrote:
I don't see anything odd except this; I'm not sure whether I need to worry about it.

2016-08-19 03:29:26,621 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
2016-08-19 03:29:27,646 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-08-19 03:29:28,647 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)


It keeps printing this in the app container logs.
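
A quick way to spot this pattern in a large container log is sketched below in Python. The regular expression is written against the two log formats quoted above and is an assumption about their shape; the function name is hypothetical.

```python
import re

# Matches the two message shapes quoted above and captures "ip:port".
RM_TARGET = re.compile(
    r"(?:Connecting to ResourceManager at|Retrying connect to server:)"
    r"\s+\S*?/([\d.]+):(\d+)"
)


def wildcard_rm_targets(lines):
    """Return the sorted set of 0.0.0.0 targets found in the given log lines."""
    found = set()
    for line in lines:
        match = RM_TARGET.search(line)
        if match and match.group(1) == "0.0.0.0":
            found.add(f"{match.group(1)}:{match.group(2)}")
    return sorted(found)


log_lines = [
    "2016-08-19 03:29:26,621 INFO [main] org.apache.hadoop.yarn.client.RMProxy: "
    "Connecting to ResourceManager at /0.0.0.0:8030",
    "2016-08-19 03:29:27,646 INFO [main] org.apache.hadoop.ipc.Client: "
    "Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s)",
]
print(wildcard_rm_targets(log_lines))
```

Any hit means the process is using the wildcard address instead of a real ResourceManager host.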

Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

rammohan ganapavarapu
Do you know what properties to tune?

Thanks,
Ram

Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

rammohan ganapavarapu
When I submit a job directly with yarn it seems to work; I guess it fails only when submitted through Oozie. I'm not sure what is missing.

yarn jar /uap/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar pi 20 1000
Number of Maps  = 20
Samples per Map = 1000
.
.
.
Job Finished in 19.622 seconds
Estimated value of Pi is 3.14280000000000000000

Ram

On Fri, Aug 19, 2016 at 11:46 AM, rammohan ganapavarapu <[hidden email]> wrote:
OK, I used yarn-utils.py to get the correct values for my cluster, updated those properties, and restarted the RM and NM, but still no luck. I'm not sure what I am missing; any other insights would help.

Below is the yarn-utils.py output, followed by my properties from mapred-site.xml and yarn-site.xml.

python yarn-utils.py -c 24 -m 63 -d 3 -k False
 Using cores=24 memory=63GB disks=3 hbase=False
 Profile: cores=24 memory=63488MB reserved=1GB usableMem=62GB disks=3
 Num Container=6
 Container Ram=10240MB
 Used Ram=60GB
 Unused Ram=1GB
 yarn.scheduler.minimum-allocation-mb=10240
 yarn.scheduler.maximum-allocation-mb=61440
 yarn.nodemanager.resource.memory-mb=61440
 mapreduce.map.memory.mb=5120
 mapreduce.map.java.opts=-Xmx4096m
 mapreduce.reduce.memory.mb=10240
 mapreduce.reduce.java.opts=-Xmx8192m
 yarn.app.mapreduce.am.resource.mb=5120
 yarn.app.mapreduce.am.command-opts=-Xmx4096m
 mapreduce.task.io.sort.mb=1024


    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>5120</value>
    </property>
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx4096m</value>
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>10240</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx8192m</value>
    </property>
    <property>
      <name>yarn.app.mapreduce.am.resource.mb</name>
      <value>5120</value>
    </property>
    <property>
      <name>yarn.app.mapreduce.am.command-opts</name>
      <value>-Xmx4096m</value>
    </property>
    <property>
      <name>mapreduce.task.io.sort.mb</name>
      <value>1024</value>
    </property>



    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>10240</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>61440</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>61440</value>
    </property>


Ram
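
The values above can be sanity-checked with a short Python sketch. It assumes that the scheduler rounds each request up to a multiple of yarn.scheduler.minimum-allocation-mb, which is my understanding of the default behaviour and may not hold for every scheduler or version. The helper names are hypothetical; the numbers are the ones posted above.

```python
import re

NODE_MB = 61440        # yarn.nodemanager.resource.memory-mb
MIN_ALLOC_MB = 10240   # yarn.scheduler.minimum-allocation-mb

# role -> (requested container MB, JVM options), as posted above
REQUESTS = {
    "map": (5120, "-Xmx4096m"),
    "reduce": (10240, "-Xmx8192m"),
    "am": (5120, "-Xmx4096m"),
}


def granted_mb(request_mb: int, min_alloc_mb: int) -> int:
    """Round a request up to the next multiple of the minimum allocation (assumed rule)."""
    return -(-request_mb // min_alloc_mb) * min_alloc_mb


def heap_mb(java_opts: str):
    """Extract the -Xmx value in MB, or None if it is absent."""
    match = re.search(r"-Xmx(\d+)m", java_opts)
    return int(match.group(1)) if match else None


def report() -> dict:
    rows = {}
    for role, (request_mb, opts) in REQUESTS.items():
        granted = granted_mb(request_mb, MIN_ALLOC_MB)
        rows[role] = {
            "granted_mb": granted,
            "per_node": NODE_MB // granted,
            "heap_fits": heap_mb(opts) <= request_mb,
        }
    return rows


for role, row in report().items():
    print(role, row)
```

Under that assumption, a 5120 MB request still occupies a full 10240 MB container, so the node holds six containers regardless of role. That is consistent with the "Num Container=6" line in the yarn-utils.py output.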

On Thu, Aug 18, 2016 at 11:14 PM, tkg_cangkul <[hidden email]> wrote:
Maybe this link can serve as a reference for tuning the cluster:

http://jason4zhu.blogspot.co.id/2014/10/memory-configuration-in-hadoop.html


Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

rammohan ganapavarapu
Even if the cluster doesn't have enough resources, it shouldn't be connecting to "/0.0.0.0:8030", right? It should connect to my <RM_HOST:8030>; I'm not sure why it's trying to connect to 0.0.0.0:8030.
I have verified the config and removed all traces of 0.0.0.0, but still no luck.
org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030

If anyone has any clue, please share.

Thanks,
Ram


Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

Rohith Sharma K S-3
Hi

From the discussion and AM logs below, I see that the AM container has launched but is not able to connect to the RM.

This looks like a configuration issue. Could you check in the job.xml of your job whether yarn.resourcemanager.scheduler.address has been configured?

Essentially, this address is required by the MRAppMaster to connect to the RM for heartbeats. If you do not configure it, the default value is used, i.e. 0.0.0.0:8030.


Thanks & Regards
Rohith Sharma K S
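
The check suggested above can be sketched in Python as follows. It parses a Hadoop-style configuration document and reports which scheduler address a client would end up with. The function names and the sample host are hypothetical. The fallback values mirror the defaults in yarn-default.xml (hostname 0.0.0.0, scheduler port 8030), and `${...}` variable expansion is deliberately not handled.

```python
import xml.etree.ElementTree as ET

DEFAULT_RM_HOSTNAME = "0.0.0.0"   # default of yarn.resourcemanager.hostname
DEFAULT_SCHEDULER_PORT = 8030     # default scheduler port


def load_properties(xml_text: str) -> dict:
    """Read <property><name>/<value> pairs from a Hadoop configuration document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name:
            props[name] = (prop.findtext("value") or "").strip()
    return props


def scheduler_address(props: dict) -> str:
    """Resolve the scheduler address, falling back to the defaults above."""
    explicit = props.get("yarn.resourcemanager.scheduler.address")
    if explicit:
        return explicit
    host = props.get("yarn.resourcemanager.hostname") or DEFAULT_RM_HOSTNAME
    return f"{host}:{DEFAULT_SCHEDULER_PORT}"


# Hypothetical job.xml content with no ResourceManager settings at all.
empty_conf = "<configuration></configuration>"
print(scheduler_address(load_properties(empty_conf)))
```

If the configuration that actually reaches the AM (for example, the job.xml generated by a workflow launcher) lacks both properties, the resolved address is the wildcard one seen in the logs, even when yarn-site.xml on the cluster nodes is correct.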

Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

rammohan ganapavarapu

Yes, I did configure it.


On Aug 19, 2016 7:22 PM, "Rohith Sharma K S" <[hidden email]> wrote:
Hi

From the discussion below and the AM logs, I see that the AM container has launched but is not able to connect to the RM.

This looks like a configuration issue. Could you check whether yarn.resourcemanager.scheduler.address has been configured in your job.xml?

Essentially, this address is required by MRAppMaster for connecting to the RM for heartbeats. If you do not configure it, the default value will be taken, i.e. port 8030.
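
The check suggested above can be sketched as a small script. It is only an illustration: it reads a Hadoop-style configuration file such as job.xml and reports which scheduler address the AM would end up with. The fallback values (hostname 0.0.0.0, port 8030) are the documented YARN defaults; the sample XML and the host name rm.example.com are made up for the example, and the sketch does not expand ${...} references inside values.

```python
import xml.etree.ElementTree as ET

# Documented YARN defaults: yarn.resourcemanager.hostname is 0.0.0.0 and the
# scheduler address is ${yarn.resourcemanager.hostname}:8030.
DEFAULT_RM_HOSTNAME = "0.0.0.0"
DEFAULT_SCHEDULER_PORT = 8030


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def effective_scheduler_address(props):
    """Resolve the address an AM would use to reach the RM scheduler.

    Simplification: ${...} variable references in values are not expanded.
    """
    explicit = props.get("yarn.resourcemanager.scheduler.address")
    if explicit:
        return explicit
    host = props.get("yarn.resourcemanager.hostname", DEFAULT_RM_HOSTNAME)
    return f"{host}:{DEFAULT_SCHEDULER_PORT}"


# Hypothetical configs: one with no RM settings, one naming a made-up RM host.
SAMPLE_WITHOUT_RM = "<configuration></configuration>"
SAMPLE_WITH_RM = """
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>rm.example.com</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    print(effective_scheduler_address(read_properties(SAMPLE_WITHOUT_RM)))
    print(effective_scheduler_address(read_properties(SAMPLE_WITH_RM)))
```

If the first form is what the job's configuration resolves to, the AM would try 0.0.0.0:8030, which matches the log line quoted in this thread.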


Thanks & Regards
Rohith Sharma K S

On Aug 20, 2016, at 7:02 AM, rammohan ganapavarapu <[hidden email]> wrote:

Even if the cluster doesn't have enough resources, it shouldn't be connecting to "/0.0.0.0:8030", right? It should connect to my <RM_HOST:8030>; I am not sure why it is trying to connect to 0.0.0.0:8030.
I have verified the config and removed all traces of 0.0.0.0, but still no luck.
org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030

If anyone has any clue, please share.

Thanks,
Ram


On Fri, Aug 19, 2016 at 2:32 PM, rammohan ganapavarapu <[hidden email]> wrote:
When I submit a job using yarn directly it seems to work; I guess it fails only with Oozie. Not sure what is missing.

yarn jar /uap/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar pi 20 1000
Number of Maps  = 20
Samples per Map = 1000
.
.
.
Job Finished in 19.622 seconds
Estimated value of Pi is 3.14280000000000000000

Ram

On Fri, Aug 19, 2016 at 11:46 AM, rammohan ganapavarapu <[hidden email]> wrote:
OK, I have used yarn-utils.py to get the correct values for my cluster, updated those properties and restarted the RM and NM, but still no luck. I am not sure what I am missing; any other insights will help me.

Below are my properties from yarn-site.xml and mapred-site.xml.

python yarn-utils.py -c 24 -m 63 -d 3 -k False
 Using cores=24 memory=63GB disks=3 hbase=False
 Profile: cores=24 memory=63488MB reserved=1GB usableMem=62GB disks=3
 Num Container=6
 Container Ram=10240MB
 Used Ram=60GB
 Unused Ram=1GB
 yarn.scheduler.minimum-allocation-mb=10240
 yarn.scheduler.maximum-allocation-mb=61440
 yarn.nodemanager.resource.memory-mb=61440
 mapreduce.map.memory.mb=5120
 mapreduce.map.java.opts=-Xmx4096m
 mapreduce.reduce.memory.mb=10240
 mapreduce.reduce.java.opts=-Xmx8192m
 yarn.app.mapreduce.am.resource.mb=5120
 yarn.app.mapreduce.am.command-opts=-Xmx4096m
 mapreduce.task.io.sort.mb=1024
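
The arithmetic behind these numbers can be sketched as follows. This is a rough illustration, assuming the scheduler rounds every request up to a multiple of yarn.scheduler.minimum-allocation-mb and caps it at yarn.scheduler.maximum-allocation-mb (my understanding of the Capacity Scheduler's default behaviour). The function names are hypothetical; the values are the ones printed above.

```python
def normalize_request_mb(requested_mb, min_alloc_mb, max_alloc_mb):
    """Round a request up to a multiple of the minimum allocation, capped at the maximum."""
    multiples = -(-requested_mb // min_alloc_mb)  # ceiling division
    return min(max(multiples, 1) * min_alloc_mb, max_alloc_mb)


def containers_per_node(node_mb, container_mb):
    """How many containers of the given size fit into one NodeManager."""
    return node_mb // container_mb


# Values taken from the yarn-utils.py output above.
MIN_ALLOC_MB = 10240    # yarn.scheduler.minimum-allocation-mb
MAX_ALLOC_MB = 61440    # yarn.scheduler.maximum-allocation-mb
NODE_MB = 61440         # yarn.nodemanager.resource.memory-mb
AM_REQUEST_MB = 5120    # yarn.app.mapreduce.am.resource.mb
MAP_REQUEST_MB = 5120   # mapreduce.map.memory.mb

if __name__ == "__main__":
    am_mb = normalize_request_mb(AM_REQUEST_MB, MIN_ALLOC_MB, MAX_ALLOC_MB)
    map_mb = normalize_request_mb(MAP_REQUEST_MB, MIN_ALLOC_MB, MAX_ALLOC_MB)
    print("AM container size after rounding:", am_mb, "MB")
    print("Map container size after rounding:", map_mb, "MB")
    print("Containers per node:", containers_per_node(NODE_MB, MIN_ALLOC_MB))
```

Under that assumption the 5120 MB AM and map requests would each occupy a full 10240 MB container, and a node would hold six containers, which agrees with "Num Container=6".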


    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>5120</value>
    </property>
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx4096m</value>
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>10240</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx8192m</value>
    </property>
    <property>
      <name>yarn.app.mapreduce.am.resource.mb</name>
      <value>5120</value>
    </property>
    <property>
      <name>yarn.app.mapreduce.am.command-opts</name>
      <value>-Xmx4096m</value>
    </property>
    <property>
      <name>mapreduce.task.io.sort.mb</name>
      <value>1024</value>
    </property>



    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>10240</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>61440</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>61440</value>
    </property>


Ram

On Thu, Aug 18, 2016 at 11:14 PM, tkg_cangkul <[hidden email]> wrote:
Maybe this link can serve as a reference for tuning the cluster:

http://jason4zhu.blogspot.co.id/2014/10/memory-configuration-in-hadoop.html


On 19/08/16 11:13, rammohan ganapavarapu wrote:
Do you know what properties to tune?

Thanks,
Ram

On Thu, Aug 18, 2016 at 9:11 PM, tkg_cangkul <[hidden email]> wrote:
I think that's because you don't have enough resources. You can tune your cluster config to maximize your resources.


On 19/08/16 11:03, rammohan ganapavarapu wrote:
I don't see anything odd except this; I am not sure whether I have to worry about it or not.

2016-08-19 03:29:26,621 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
2016-08-19 03:29:27,646 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-08-19 03:29:28,647 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
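
A quick way to spot this symptom in a saved AM log is sketched below. The regular expression is written against the RMProxy line shown above and is an assumption about the log format, not something taken from Hadoop itself; the function name is hypothetical.

```python
import re

# Matches e.g. "Connecting to ResourceManager at /0.0.0.0:8030" or
# "... at somehost/10.0.0.5:8030" (hostname part may be empty).
CONNECT_RE = re.compile(r"Connecting to ResourceManager at (\S*?)/([\d.]+):(\d+)")


def find_wildcard_rm_targets(log_text):
    """Return (ip, port) pairs where the AM tried to reach the RM at 0.0.0.0."""
    hits = []
    for line in log_text.splitlines():
        match = CONNECT_RE.search(line)
        if match and match.group(2) == "0.0.0.0":
            hits.append((match.group(2), int(match.group(3))))
    return hits


# The two log lines quoted in this thread (the second one shortened).
SAMPLE_LOG = """\
2016-08-19 03:29:26,621 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
2016-08-19 03:29:27,646 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s)
"""

if __name__ == "__main__":
    for ip, port in find_wildcard_rm_targets(SAMPLE_LOG):
        print(f"AM is using the unconfigured default RM address {ip}:{port}")
```

A hit here suggests that the configuration seen by the AM does not carry the RM host name, rather than that the cluster is short of resources.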


It keeps printing this in the app container logs.

On Thu, Aug 18, 2016 at 8:20 PM, tkg_cangkul <[hidden email]> wrote:
Maybe you can check the logs on port 8088 in your browser; that is the RM UI. Just choose your job ID and then check the logs.
 
On 19/08/16 10:14, rammohan ganapavarapu wrote:
Sunil,

Thank you for your input. Below are my server metrics for the RM. I have also attached the RM UI view of the capacity scheduler resources. How else can I find out?

{
      "name": "Hadoop:service=ResourceManager,name=QueueMetrics,q0=root",
      "modelerType": "QueueMetrics,q0=root",
      "tag.Queue": "root",
      "tag.Context": "yarn",
      "tag.Hostname": "hadoop001",
      "running_0": 0,
      "running_60": 0,
      "running_300": 0,
      "running_1440": 0,
      "AppsSubmitted": 1,
      "AppsRunning": 0,
      "AppsPending": 0,
      "AppsCompleted": 0,
      "AppsKilled": 0,
      "AppsFailed": 1,
      "AllocatedMB": 0,
      "AllocatedVCores": 0,
      "AllocatedContainers": 0,
      "AggregateContainersAllocated": 2,
      "AggregateContainersReleased": 2,
      "AvailableMB": 64512,
      "AvailableVCores": 24,
      "PendingMB": 0,
      "PendingVCores": 0,
      "PendingContainers": 0,
      "ReservedMB": 0,
      "ReservedVCores": 0,
      "ReservedContainers": 0,
      "ActiveUsers": 0,
      "ActiveApplications": 0
    },
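
One way to read these metrics is sketched below: compare the free resources reported for the root queue with the size of a single AM container, and check whether anything is reported as pending. The helper names are hypothetical, and the AM request size of 10240 MB is an assumption used purely for illustration; the metric values are copied from the JSON above.

```python
def am_container_fits(metrics, am_request_mb, am_vcores=1):
    """True if the queue reports enough free memory and vcores for one AM."""
    return (metrics["AvailableMB"] >= am_request_mb
            and metrics["AvailableVCores"] >= am_vcores)


def has_backlog(metrics):
    """True if the queue reports containers or applications waiting for resources."""
    return metrics["PendingContainers"] > 0 or metrics["AppsPending"] > 0


# Subset of the QueueMetrics values posted above.
ROOT_QUEUE = {
    "AppsSubmitted": 1,
    "AppsPending": 0,
    "AppsFailed": 1,
    "AvailableMB": 64512,
    "AvailableVCores": 24,
    "PendingContainers": 0,
}

AM_REQUEST_MB = 10240  # assumed AM container size, for illustration only

if __name__ == "__main__":
    print("Room for an AM container:", am_container_fits(ROOT_QUEUE, AM_REQUEST_MB))
    print("Anything pending:", has_backlog(ROOT_QUEUE))
```

With 64512 MB and 24 vcores free and nothing pending, these numbers do not look like a resource shortage, although they are only a snapshot taken after the application had already failed.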

On Thu, Aug 18, 2016 at 6:49 PM, Sunil Govind <[hidden email]> wrote:
Hi

There could be many reasons for this. Also, I am not sure which scheduler you are using, so please share more details such as the RM log.

I can point out a few possible reasons:
 - Not having enough resources in the cluster can cause this.
 - If you are using the Capacity Scheduler and the queue capacity is maxed out, this can happen.
 - Similarly, if max-am-resource-percent is exceeded at the queue level, the AM container may not be launched.
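
The third point can be made concrete with a rough calculation. The sketch assumes the limit is simply the queue's memory multiplied by yarn.scheduler.capacity.maximum-am-resource-percent (default 0.1); the scheduler's real admission logic has more cases than this, so treat the result as an estimate only. The function names are hypothetical, and 64512 MB is the AvailableMB figure reported in this thread.

```python
def am_resource_limit_mb(queue_capacity_mb, max_am_percent=0.1):
    """Approximate memory that all AMs in a queue may use together."""
    return int(queue_capacity_mb * max_am_percent)


def am_headroom_mb(queue_capacity_mb, max_am_percent, used_by_ams_mb):
    """Approximate memory still available for new AM containers."""
    return am_resource_limit_mb(queue_capacity_mb, max_am_percent) - used_by_ams_mb


QUEUE_CAPACITY_MB = 64512  # AvailableMB reported for the root queue

if __name__ == "__main__":
    for percent in (0.1, 0.5):
        limit = am_resource_limit_mb(QUEUE_CAPACITY_MB, percent)
        print(f"max-am-resource-percent={percent}: about {limit} MB for AMs")
```

Comparing this figure with the AM container size shows how much room the queue leaves for application masters; raising the percentage is the usual knob when many small jobs queue up behind it.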

You could check the RM log to find out whether the AM container is launched.

Thanks
Sunil

On Fri, Aug 19, 2016 at 5:37 AM rammohan ganapavarapu <[hidden email]> wrote:
Hi,

When I submit an MR job, I get this status in the AM UI, but the job never finishes. What am I missing?

Thanks,
Ram



---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]










Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

Sunil Govind
Hi Ram

From the console log, as Rohith said, the AM is looking for the RM at port 8030, so please confirm the RM port once.
Could you please share the AM and RM logs?

Thanks 
Sunil


Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

Sunil Govind
Hi.

It seems to be an Oozie issue. From the conf, the RM scheduler is running at port 8030,
but your job.properties is using 8032. I suggest you double-check your Oozie configuration and confirm that it is set up correctly to contact the RM. Sharing a link also

Thanks
Sunil


On Sun, Aug 21, 2016 at 8:41 AM rammohan ganapavarapu <[hidden email]> wrote:
Please find attached the config that I got from the YARN UI, and the AM and RM logs. I only see it connecting to 0.0.0.0:8030 when I submit the job using Oozie; if I submit it with yarn jar it works fine, as I posted earlier.

Here is my Oozie job.properties file; I have a Java class that just prints

nameNode=hdfs://master01:8020
jobTracker=master01:8032
workflowName=EchoJavaJob
oozie.use.system.libpath=true

queueName=default
hdfsWorkflowHome=/user/uap/oozieWorkflows

workflowPath=${nameNode}${hdfsWorkflowHome}/${workflowName}
oozie.wf.application.path=${workflowPath}

Please let me know if you find any clue as to why it is trying to connect to 0.0.0.0:8030.

Thanks,
Ram



Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

rammohan ganapavarapu

So in job.properties, what should the jobTracker property be: the RM ip:port, or the scheduler port, which is 8030? If I use 8030 I get an unknown protocol (protocol buffer) error.
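
For reference, the ResourceManager listens on several ports, and a small lookup makes it easier to see which service a jobTracker value points at. The port numbers are the YARN defaults; the parsing helpers are hypothetical, and the sample text reuses two lines from the job.properties posted in this thread. My understanding is that on YARN the jobTracker value should be the RM client address (yarn.resourcemanager.address, default port 8032), while 8030 is the endpoint that application masters use, which would explain an unknown protocol error when a client submits there.

```python
# Default ResourceManager ports and the properties that set them.
RM_DEFAULT_PORTS = {
    8030: "yarn.resourcemanager.scheduler.address",
    8031: "yarn.resourcemanager.resource-tracker.address",
    8032: "yarn.resourcemanager.address",
    8033: "yarn.resourcemanager.admin.address",
    8088: "yarn.resourcemanager.webapp.address",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def describe_job_tracker(props):
    """Return (host, port, matching RM property) for the jobTracker entry."""
    host, _, port = props["jobTracker"].rpartition(":")
    return host, int(port), RM_DEFAULT_PORTS.get(int(port), "non-default port")


SAMPLE_JOB_PROPERTIES = """\
nameNode=hdfs://master01:8020
jobTracker=master01:8032
"""

if __name__ == "__main__":
    host, port, prop = describe_job_tracker(parse_properties(SAMPLE_JOB_PROPERTIES))
    print(f"jobTracker -> {host}:{port} ({prop})")
```

If the cluster overrides these ports in yarn-site.xml, the table above no longer applies and the configured values should be used instead.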



Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

rammohan ganapavarapu

Any thoughts on the logs and config I have shared?


On Aug 21, 2016 8:32 AM, "rammohan ganapavarapu" <[hidden email]> wrote:

So in job.properties, what should the jobTracker property be: the RM IP and port, or the scheduler port, which is 8030? If I use 8030 I get an unknown protocol (proto buffer) error.


On Aug 21, 2016 7:37 AM, "Sunil Govind" <[hidden email]> wrote:
Hi.

It seems to be an Oozie issue. From the conf, the RM scheduler is running on port 8030.
But your job.properties is using 8032. I suggest you double-check your Oozie configuration and make sure it is set up correctly to contact the RM. Sharing a link also

Thanks
Sunil


On Sun, Aug 21, 2016 at 8:41 AM rammohan ganapavarapu <[hidden email]> wrote:
Please find attached the config that I got from the YARN UI, along with the AM and RM logs. I only see it connecting to 0.0.0.0:8030 when I submit the job using Oozie; if I submit it with yarn jar, it works fine, as I posted earlier.

Here is my Oozie job.properties file; I have a Java class that just prints

nameNode=hdfs://master01:8020
jobTracker=master01:8032
workflowName=EchoJavaJob
oozie.use.system.libpath=true

queueName=default
hdfsWorkflowHome=/user/uap/oozieWorkflows

workflowPath=${nameNode}${hdfsWorkflowHome}/${workflowName}
oozie.wf.application.path=${workflowPath}

Please let me know if you find any clue as to why it is trying to connect to 0.0.0.0:8030.

Thanks,
Ram


On Fri, Aug 19, 2016 at 11:54 PM, Sunil Govind <[hidden email]> wrote:
Hi Ram

From the console log, as Rohith said, the AM is looking for the RM at 8030. So please confirm the RM port once.
Could you please share the AM and RM logs?

Thanks 
Sunil

On Sat, Aug 20, 2016 at 10:36 AM rammohan ganapavarapu <[hidden email]> wrote:

Yes, I did configure it.


On Aug 19, 2016 7:22 PM, "Rohith Sharma K S" <[hidden email]> wrote:
Hi

From the discussion below and the AM logs, I see that the AM container has launched but is not able to connect to the RM.

This looks like a configuration issue on your side. Could you check in your job.xml whether yarn.resourcemanager.scheduler.address has been configured?

Essentially, this address is required by MRAppMaster to connect to the RM for heartbeats. If you do not configure it, the default value is used, i.e. 8030.


Thanks & Regards
Rohith Sharma K S
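
For anyone who wants to run the check described above, here is a minimal sketch in Python. It assumes the stock YARN defaults (yarn.resourcemanager.hostname defaults to 0.0.0.0, and the scheduler address defaults to that hostname on port 8030); the sample XML and the helper names are hypothetical, and ${...} variable substitution in values is not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical job.xml fragment; master01 is the RM host used in this thread.
SAMPLE_JOB_XML = """<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master01:8030</value>
  </property>
</configuration>"""


def get_property(xml_text, name):
    """Return the value of a Hadoop-style <property>, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


def effective_scheduler_address(xml_text):
    """Address the AM would use, assuming stock defaults when nothing is set."""
    explicit = get_property(xml_text, "yarn.resourcemanager.scheduler.address")
    if explicit:
        return explicit.strip()
    hostname = get_property(xml_text, "yarn.resourcemanager.hostname") or "0.0.0.0"
    return hostname.strip() + ":8030"


def is_wildcard(address):
    """True when the host part is 0.0.0.0, which an AM on another node cannot reach."""
    return address.rsplit(":", 1)[0] == "0.0.0.0"


if __name__ == "__main__":
    address = effective_scheduler_address(SAMPLE_JOB_XML)
    print(address, "(wildcard host)" if is_wildcard(address) else "(looks usable)")
```

If the effective address comes out as 0.0.0.0:8030 for the configuration the job actually sees, that would match the "Connecting to ResourceManager at /0.0.0.0:8030" line in the AM log.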

On Aug 20, 2016, at 7:02 AM, rammohan ganapavarapu <[hidden email]> wrote:

Even if the cluster doesn't have enough resources, it should not be connecting to "/0.0.0.0:8030", right? It should connect to my <RM_HOST:8030>; I am not sure why it is trying to connect to 0.0.0.0:8030.
I have verified the config and removed all traces of 0.0.0.0, but still no luck.
org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030

If anyone has any clue, please share.

Thanks,
Ram


On Fri, Aug 19, 2016 at 2:32 PM, rammohan ganapavarapu <[hidden email]> wrote:
When I submit a job using yarn it seems to work; it only fails with Oozie, I guess. Not sure what is missing.

yarn jar /uap/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar pi 20 1000
Number of Maps  = 20
Samples per Map = 1000
.
.
.
Job Finished in 19.622 seconds
Estimated value of Pi is 3.14280000000000000000

Ram

On Fri, Aug 19, 2016 at 11:46 AM, rammohan ganapavarapu <[hidden email]> wrote:
OK, I used yarn-utils.py to get the correct values for my cluster, updated those properties, and restarted the RM and NM, but still no luck. I am not sure what I am missing; any other insights will help me.

Below are my properties from yarn-site.xml and mapred-site.xml.

python yarn-utils.py -c 24 -m 63 -d 3 -k False
 Using cores=24 memory=63GB disks=3 hbase=False
 Profile: cores=24 memory=63488MB reserved=1GB usableMem=62GB disks=3
 Num Container=6
 Container Ram=10240MB
 Used Ram=60GB
 Unused Ram=1GB
 yarn.scheduler.minimum-allocation-mb=10240
 yarn.scheduler.maximum-allocation-mb=61440
 yarn.nodemanager.resource.memory-mb=61440
 mapreduce.map.memory.mb=5120
 mapreduce.map.java.opts=-Xmx4096m
 mapreduce.reduce.memory.mb=10240
 mapreduce.reduce.java.opts=-Xmx8192m
 yarn.app.mapreduce.am.resource.mb=5120
 yarn.app.mapreduce.am.command-opts=-Xmx4096m
 mapreduce.task.io.sort.mb=1024


    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>5120</value>
    </property>
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx4096m</value>
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>10240</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx8192m</value>
    </property>
    <property>
      <name>yarn.app.mapreduce.am.resource.mb</name>
      <value>5120</value>
    </property>
    <property>
      <name>yarn.app.mapreduce.am.command-opts</name>
      <value>-Xmx4096m</value>
    </property>
    <property>
      <name>mapreduce.task.io.sort.mb</name>
      <value>1024</value>
    </property>



    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>10240</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>61440</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>61440</value>
    </property>


Ram
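
As a sanity check on the numbers above, here is a small Python sketch of the container arithmetic. It assumes the scheduler rounds each memory request up to a multiple of yarn.scheduler.minimum-allocation-mb and caps it at the maximum, which is my understanding of the default behaviour rather than something confirmed in this thread; the function names are hypothetical.

```python
import math


def normalize_mb(requested_mb, min_alloc_mb, max_alloc_mb):
    """Round a request up to a multiple of the minimum allocation, capped at the maximum.

    Assumed rounding rule; check the scheduler in use before relying on it.
    """
    steps = math.ceil(max(requested_mb, min_alloc_mb) / min_alloc_mb)
    return min(steps * min_alloc_mb, max_alloc_mb)


def containers_per_node(node_mb, container_mb):
    """How many containers of the given size fit in one NodeManager's memory."""
    return node_mb // container_mb


if __name__ == "__main__":
    min_mb, max_mb, node_mb = 10240, 61440, 61440  # values posted above
    am_mb = normalize_mb(5120, min_mb, max_mb)     # yarn.app.mapreduce.am.resource.mb
    map_mb = normalize_mb(5120, min_mb, max_mb)    # mapreduce.map.memory.mb
    print("AM container:", am_mb, "MB; map container:", map_mb, "MB")
    print("containers per node:", containers_per_node(node_mb, min_mb))
```

Under that assumption, the 5120 MB AM and map requests would each occupy a full 10240 MB container, and a 61440 MB node would hold six such containers, which agrees with the "Num Container=6" line from yarn-utils.py.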

On Thu, Aug 18, 2016 at 11:14 PM, tkg_cangkul <[hidden email]> wrote:
Maybe this link can serve as a reference for tuning the cluster:

http://jason4zhu.blogspot.co.id/2014/10/memory-configuration-in-hadoop.html


On 19/08/16 11:13, rammohan ganapavarapu wrote:
Do you know what properties to tune?

Thanks,
Ram

On Thu, Aug 18, 2016 at 9:11 PM, tkg_cangkul <[hidden email]> wrote:
I think that's because you don't have enough resources. You can tune your cluster config to maximize your resources.


On 19/08/16 11:03, rammohan ganapavarapu wrote:
I don't see anything odd except this; I am not sure whether I have to worry about it or not.

2016-08-19 03:29:26,621 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
2016-08-19 03:29:27,646 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-08-19 03:29:28,647 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)


It keeps printing this log in the app container logs.


Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

Sunil Govind
Hi Ram

The RM logs look fine, and as per the config it looks like the RM is running on 8030 itself.
I am not very sure about the Oozie-side config that you mentioned. I suggest you check the config on that side further and debug there.
I will also let other community folks pitch in if they have some other opinion.

Thanks
Sunil


Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

rammohan ganapavarapu
Thank you all. I have updated my Oozie job.properties to use 8030 and now I am getting the error below:

2016-08-22 17:22:02,893 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: NodeManager from node slave03(cmPort: 40511 httpPort: 8042) registered with capability: <memory:8192, vCores:8>, assigned nodeId slave03:40511
2016-08-22 17:22:02,893 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: slave03:40511 Node Transitioned from NEW to RUNNING
2016-08-22 17:22:02,893 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Added node slave03:40511 clusterResource: <memory:24576, vCores:24>
2016-08-22 17:23:14,258 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8030: readAndProcess from client 10.16.3.51 threw exception [org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN]]
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN]
        at org.apache.hadoop.ipc.Server$Connection.initializeAuthContext(Server.java:1564)
        at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1520)
        at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:771)
        at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:637)
        at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:608)


So I enabled simple auth with the config below in core-site.xml and restarted the NameNode, DataNode, RM and NM, but I am still getting the same error. Do I have to do anything else to enable simple auth?


    <property>
      <name>hadoop.security.authentication</name>
      <value>simple</value>
    </property>


Ram


Re: ACCEPTED: waiting for AM container to be allocated, launched and register with RM

rammohan ganapavarapu
Guys,

I was able to fix this issue by trial and error. I am not sure which property made it work :) but it is working, and I have to use 8032 as the jobTracker port. I also restarted all the components; before, I was only restarting the NodeManager and ResourceManager after a property update.

You get this error if you use the wrong port (8030 instead of 8032):

Socket Reader #1 for port 8030: readAndProcess from client 10.16.3.51 threw exception [org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN]]
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN]

Thanks a lot for all your help,

Ram
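
A note for anyone who lands here with the same error: the sketch below, in Python, checks the jobTracker value in a job.properties against the stock ResourceManager ports. The port table reflects the YARN defaults as I know them (a cluster may override any of them), and the helper names and sample text are hypothetical. The likely reason for the SIMPLE/TOKEN message is that the scheduler port is meant for ApplicationMasters presenting a token, not for job-submitting clients, but treat that as an interpretation rather than something established in this thread.

```python
# Stock ResourceManager ports and the properties that set them (defaults only).
DEFAULT_RM_PORTS = {
    8030: "yarn.resourcemanager.scheduler.address",
    8031: "yarn.resourcemanager.resource-tracker.address",
    8032: "yarn.resourcemanager.address",
    8033: "yarn.resourcemanager.admin.address",
    8088: "yarn.resourcemanager.webapp.address",
}

# Hypothetical sample based on the job.properties posted in this thread.
SAMPLE_JOB_PROPERTIES = """
nameNode=hdfs://master01:8020
jobTracker=master01:8032
queueName=default
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_job_tracker(value):
    """Return (status, message) for a jobTracker value of the form host:port."""
    host, sep, port = value.strip().rpartition(":")
    if not sep or not host or not port.isdigit():
        return "invalid", "expected host:port, got %r" % value
    number = int(port)
    if number == 8032:
        return "ok", "matches the default client address (yarn.resourcemanager.address)"
    if number in DEFAULT_RM_PORTS:
        return "wrong-endpoint", "port %d is the default for %s" % (number, DEFAULT_RM_PORTS[number])
    return "unknown", "port %d is not a stock RM port; it may be a custom setting" % number


if __name__ == "__main__":
    props = parse_properties(SAMPLE_JOB_PROPERTIES)
    print(check_job_tracker(props["jobTracker"]))
```

With jobTracker=master01:8030 the same check reports the scheduler address, which is the situation that produced the AccessControlException quoted above.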

On Mon, Aug 22, 2016 at 10:27 AM, rammohan ganapavarapu <[hidden email]> wrote:
Thank you all, I have updated my oozie job.properties to use 8030 and now i am getting below error





    2016-08-22 17:22:02,893 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: NodeManager from node slave03(cmPort: 40511 httpPort: 8042) registered with capability: <memory:8192, vCores:8>, assigned nodeId slave03:40511
2016-08-22 17:22:02,893 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: slave03:40511 Node Transitioned from NEW to RUNNING
2016-08-22 17:22:02,893 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Added node slave03:40511 clusterResource: <memory:24576, vCores:24>
2016-08-22 17:23:14,258 INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8030: readAndProcess from client 10.16.3.51 threw exception [org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN]]
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN]
        at org.apache.hadoop.ipc.Server$Connection.initializeAuthContext(Server.java:1564)
        at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1520)
        at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:771)
        at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:637)
        at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:608)


So i did enabled simple auth by below config in core-site.xml and restarted namenode,datanode,rm and nm but still getting same error do i have to do any thing else to enable simple auth?


    <property>
      <name>hadoop.security.authentication</name>
      <value>simple</value>
    </property>
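
A note on the error above: as far as I understand, port 8030 is the RM scheduler endpoint that ApplicationMasters use, and it accepts only the token the RM hands to each AM, so changing hadoop.security.authentication would not be expected to make it accept a job client. The Python sketch below only pulls the port and the offered auth methods out of such a log line and maps the port to the stock yarn-default.xml property. The function name and the lookup table are illustrative, not part of Hadoop.

```python
import re

# Default ResourceManager ports and the yarn-site.xml property behind each
# (stock yarn-default.xml values; a cluster may override any of them).
RM_PORTS = {
    8030: "yarn.resourcemanager.scheduler.address (ApplicationMaster to RM)",
    8031: "yarn.resourcemanager.resource-tracker.address (NodeManager to RM)",
    8032: "yarn.resourcemanager.address (job clients to RM)",
    8033: "yarn.resourcemanager.admin.address (admin commands)",
    8088: "yarn.resourcemanager.webapp.address (web UI)",
}

AUTH_ERROR_RE = re.compile(
    r"for port (?P<port>\d+): readAndProcess from client (?P<client>\S+) "
    r"threw exception \[.*?(?P<method>\w+) authentication is not enabled\.\s+"
    r"Available:\[(?P<available>[^\]]*)\]"
)


def explain_auth_error(line):
    """Return port, client, rejected and available auth methods, or None."""
    m = AUTH_ERROR_RE.search(line)
    if m is None:
        return None
    port = int(m.group("port"))
    return {
        "port": port,
        "service": RM_PORTS.get(port, "unknown"),
        "client": m.group("client"),
        "rejected": m.group("method"),
        "available": [a.strip() for a in m.group("available").split(",") if a.strip()],
    }


# Log line copied from the RM log in this thread.
LINE = (
    "Socket Reader #1 for port 8030: readAndProcess from client 10.16.3.51 "
    "threw exception [org.apache.hadoop.security.AccessControlException: "
    "SIMPLE authentication is not enabled.  Available:[TOKEN]]"
)

info = explain_auth_error(LINE)
print(info["port"], info["service"])
print("rejected:", info["rejected"], "available:", info["available"])
```

If the port reported is the scheduler port, the more likely fix is to point the client back at the RM client address rather than to change the authentication mode.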


Ram

On Mon, Aug 22, 2016 at 9:43 AM, Sunil Govind <[hidden email]> wrote:
Hi Ram,

The RM logs look fine, and as per the config it looks like the RM is running on 8030 itself.
I am not very sure about the Oozie-side config which you mentioned. I suggest you check the config on that end further and debug there.
I will also let other community folks pitch in if they have a different opinion.

Thanks
Sunil

On Mon, Aug 22, 2016 at 8:57 PM rammohan ganapavarapu <[hidden email]> wrote:

Any thoughts on the logs and config I have shared?


On Aug 21, 2016 8:32 AM, "rammohan ganapavarapu" <[hidden email]> wrote:

So in job.properties, what should the jobTracker property be: the RM ip:port, or the scheduler port, which is 8030? If I use 8030, I get an unknown protocol proto buffer error.


On Aug 21, 2016 7:37 AM, "Sunil Govind" <[hidden email]> wrote:
Hi.

It seems it's an Oozie issue. From the conf, the RM scheduler is running at port 8030.
But your job.properties is using 8032. I suggest you double-check your Oozie configuration and confirm that its settings are correct for contacting the RM. Sharing a link also

Thanks
Sunil


On Sun, Aug 21, 2016 at 8:41 AM rammohan ganapavarapu <[hidden email]> wrote:
Please find attached the config that I got from the YARN UI, along with the AM and RM logs. I only see it connecting to 0.0.0.0:8030 when I submit the job using Oozie; if I submit it with yarn jar it works fine, as I posted earlier.

Here is my Oozie job.properties file; I have a Java class that just prints

nameNode=hdfs://master01:8020
jobTracker=master01:8032
workflowName=EchoJavaJob
oozie.use.system.libpath=true

queueName=default
hdfsWorkflowHome=/user/uap/oozieWorkflows

workflowPath=${nameNode}${hdfsWorkflowHome}/${workflowName}
oozie.wf.application.path=${workflowPath}

Please let me know if you find any clue as to why it is trying to connect to 0.0.0.0:8030.
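
As a cross-check on the jobTracker value, here is a minimal Python sketch that parses a job.properties file and compares jobTracker with the RM client address. It assumes, as far as I know is the usual setup on YARN, that Oozie's jobTracker should hold yarn.resourcemanager.address (default port 8032) and not the scheduler address. The RM address master01:8032 passed in below is an assumed value, and both helper functions are hypothetical.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_job_tracker(job_props, rm_client_address):
    """Compare Oozie's jobTracker with the RM client address (illustrative)."""
    job_tracker = job_props.get("jobTracker")
    if job_tracker is None:
        return "jobTracker is not set"
    if job_tracker == rm_client_address:
        return "ok: jobTracker matches yarn.resourcemanager.address"
    return f"mismatch: jobTracker={job_tracker}, RM client address={rm_client_address}"


# Subset of the job.properties posted above.
JOB_PROPERTIES = """\
nameNode=hdfs://master01:8020
jobTracker=master01:8032
queueName=default
"""

props = parse_properties(JOB_PROPERTIES)
# "master01:8032" is an assumed yarn.resourcemanager.address for this cluster.
print(check_job_tracker(props, "master01:8032"))
print(check_job_tracker({"jobTracker": "master01:8030"}, "master01:8032"))
```

Under that assumption the posted job.properties already points at the right port, which would suggest the 0.0.0.0:8030 address comes from the configuration the launched AM sees and not from jobTracker.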

Thanks,
Ram


On Fri, Aug 19, 2016 at 11:54 PM, Sunil Govind <[hidden email]> wrote:
Hi Ram

From the console log, as Rohith said, the AM is looking for the RM at 8030. So please confirm the RM port once.
Could you please share the AM and RM logs?

Thanks 
Sunil

On Sat, Aug 20, 2016 at 10:36 AM rammohan ganapavarapu <[hidden email]> wrote:

Yes, I did configure it.


On Aug 19, 2016 7:22 PM, "Rohith Sharma K S" <[hidden email]> wrote:
Hi

From the discussion below and the AM logs, I see that the AM container has launched but is not able to connect to the RM.

This looks like a configuration issue. Could you check in your job.xml whether yarn.resourcemanager.scheduler.address has been configured?

Essentially, this address is required by MRAppMaster to connect to the RM for heartbeats. If you do not configure it, the default value is used, i.e. port 8030.
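
To illustrate the fallback described above, here is a minimal Python sketch of how the effective scheduler address resolves when the property is absent. It assumes the stock yarn-default.xml values, where yarn.resourcemanager.hostname defaults to 0.0.0.0 and the scheduler port to 8030. The function name and the sample XML are hypothetical, and this is not Hadoop's actual Configuration code.

```python
import xml.etree.ElementTree as ET

# Assumed stock defaults from yarn-default.xml.
DEFAULT_RM_HOSTNAME = "0.0.0.0"
DEFAULT_SCHEDULER_PORT = 8030


def effective_scheduler_address(site_xml):
    """Return the scheduler address an AM would end up with (illustrative)."""
    props = {}
    for prop in ET.fromstring(site_xml).iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()

    # An explicit scheduler address wins over everything else.
    explicit = props.get("yarn.resourcemanager.scheduler.address")
    if explicit:
        return explicit
    # Otherwise fall back to <rm hostname>:<default scheduler port>.
    host = props.get("yarn.resourcemanager.hostname", DEFAULT_RM_HOSTNAME)
    return f"{host}:{DEFAULT_SCHEDULER_PORT}"


EMPTY = "<configuration></configuration>"
WITH_HOST = """<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master01</value>
  </property>
</configuration>"""

print(effective_scheduler_address(EMPTY))
print(effective_scheduler_address(WITH_HOST))
```

An AM that logs 0.0.0.0:8030 is therefore most likely running with a configuration in which neither property is visible, even if the RM host itself has them set.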


Thanks & Regards
Rohith Sharma K S

On Aug 20, 2016, at 7:02 AM, rammohan ganapavarapu <[hidden email]> wrote:

Even if the cluster doesn't have enough resources, it shouldn't be connecting to "/0.0.0.0:8030", right? It should connect to my <RM_HOST:8030>; I am not sure why it is trying to connect to 0.0.0.0:8030.
I have verified the config and removed all traces of 0.0.0.0, but still no luck.
org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030

If anyone has any clue, please share.

Thanks,
Ram


On Fri, Aug 19, 2016 at 2:32 PM, rammohan ganapavarapu <[hidden email]> wrote:
When I submit a job using yarn it seems to work; it fails only with Oozie, I guess. Not sure what is missing.

yarn jar /uap/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar pi 20 1000
Number of Maps  = 20
Samples per Map = 1000
.
.
.
Job Finished in 19.622 seconds
Estimated value of Pi is 3.14280000000000000000

Ram

On Fri, Aug 19, 2016 at 11:46 AM, rammohan ganapavarapu <[hidden email]> wrote:
OK, I used yarn-utils.py to get the correct values for my cluster, updated those properties and restarted the RM and NM, but still no luck. I am not sure what I am missing; any other insights would help.

Below are my properties from yarn-site.xml and mapred-site.xml.

python yarn-utils.py -c 24 -m 63 -d 3 -k False
 Using cores=24 memory=63GB disks=3 hbase=False
 Profile: cores=24 memory=63488MB reserved=1GB usableMem=62GB disks=3
 Num Container=6
 Container Ram=10240MB
 Used Ram=60GB
 Unused Ram=1GB
 yarn.scheduler.minimum-allocation-mb=10240
 yarn.scheduler.maximum-allocation-mb=61440
 yarn.nodemanager.resource.memory-mb=61440
 mapreduce.map.memory.mb=5120
 mapreduce.map.java.opts=-Xmx4096m
 mapreduce.reduce.memory.mb=10240
 mapreduce.reduce.java.opts=-Xmx8192m
 yarn.app.mapreduce.am.resource.mb=5120
 yarn.app.mapreduce.am.command-opts=-Xmx4096m
 mapreduce.task.io.sort.mb=1024


    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>5120</value>
    </property>
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx4096m</value>
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>10240</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx8192m</value>
    </property>
    <property>
      <name>yarn.app.mapreduce.am.resource.mb</name>
      <value>5120</value>
    </property>
    <property>
      <name>yarn.app.mapreduce.am.command-opts</name>
      <value>-Xmx4096m</value>
    </property>
    <property>
      <name>mapreduce.task.io.sort.mb</name>
      <value>1024</value>
    </property>



    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>10240</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>61440</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>61440</value>
    </property>
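
To see how the values above interact, here is a rough Python sketch of the rounding the scheduler applies to container requests. It assumes the Capacity Scheduler with the default resource calculator, where each request is rounded up to a multiple of yarn.scheduler.minimum-allocation-mb. normalize_mb is a hypothetical helper, not a YARN API.

```python
def normalize_mb(requested_mb, minimum_mb):
    """Round a memory request up to the next multiple of the minimum allocation."""
    rounded = ((requested_mb + minimum_mb - 1) // minimum_mb) * minimum_mb
    return max(minimum_mb, rounded)


# Values taken from the yarn-site.xml and mapred-site.xml settings above.
MIN_ALLOC_MB = 10240
NODE_MB = 61440
REQUESTS = {"map": 5120, "reduce": 10240, "am": 5120}

for name, mb in REQUESTS.items():
    print(f"{name}: requested {mb} MB, granted {normalize_mb(mb, MIN_ALLOC_MB)} MB")

print("containers per node:", NODE_MB // MIN_ALLOC_MB)
```

With a 10240 MB minimum, the 5120 MB map and AM requests would each occupy a 10240 MB container, and a 61440 MB node holds six such containers, which matches the Num Container=6 line printed by yarn-utils.py.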


Ram

On Thu, Aug 18, 2016 at 11:14 PM, tkg_cangkul <[hidden email]> wrote:
Maybe this link can serve as a reference for tuning the cluster:

http://jason4zhu.blogspot.co.id/2014/10/memory-configuration-in-hadoop.html


On 19/08/16 11:13, rammohan ganapavarapu wrote:
Do you know what properties to tune?

Thanks,
Ram

On Thu, Aug 18, 2016 at 9:11 PM, tkg_cangkul <[hidden email]> wrote:
I think that's because you don't have enough resources. You can tune your cluster config to make the most of your resources.


On 19/08/16 11:03, rammohan ganapavarapu wrote:
I don't see anything odd except this; I am not sure whether I have to worry about it or not.

2016-08-19 03:29:26,621 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
2016-08-19 03:29:27,646 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-08-19 03:29:28,647 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)


It keeps printing this in the app container logs.
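
For anyone scanning container logs for the same symptom, a small Python sketch that flags RMProxy connection attempts aimed at the wildcard address. The regular expression is based only on the log lines quoted above; the function name and the second sample address are hypothetical.

```python
import re

# Matches e.g. "Connecting to ResourceManager at /0.0.0.0:8030"
# and "Connecting to ResourceManager at host/1.2.3.4:8030".
CONNECT_RE = re.compile(
    r"Connecting to ResourceManager at (?P<host>[^/\s]*)/(?P<ip>[\d.]+):(?P<port>\d+)"
)


def wildcard_targets(log_text):
    """Return (ip, port) pairs for every connection attempt to 0.0.0.0."""
    hits = []
    for m in CONNECT_RE.finditer(log_text):
        if m.group("ip") == "0.0.0.0":
            hits.append((m.group("ip"), int(m.group("port"))))
    return hits


# First line is copied from the AM log above; the second is a made-up healthy case.
SAMPLE = (
    "2016-08-19 03:29:26,621 INFO [main] org.apache.hadoop.yarn.client.RMProxy: "
    "Connecting to ResourceManager at /0.0.0.0:8030\n"
)
HEALTHY = "INFO [main] RMProxy: Connecting to ResourceManager at master01/10.16.3.50:8030\n"

print(wildcard_targets(SAMPLE))
print(wildcard_targets(HEALTHY))
```

A hit on port 8030 means the AM did not see a usable RM scheduler address, which points at configuration and not at cluster capacity.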

On Thu, Aug 18, 2016 at 8:20 PM, tkg_cangkul <[hidden email]> wrote:
Maybe you can check the logs on port 8088 in your browser. That is the RM UI. Just choose your job ID and then check the logs.
 
On 19/08/16 10:14, rammohan ganapavarapu wrote:
Sunil,

Thank you for your input. Below are my server metrics for the RM. I have also attached the RM UI view of the Capacity Scheduler resources. How else can I find out?

{
      "name": "Hadoop:service=ResourceManager,name=QueueMetrics,q0=root",
      "modelerType": "QueueMetrics,q0=root",
      "tag.Queue": "root",
      "tag.Context": "yarn",
      "tag.Hostname": "hadoop001",
      "running_0": 0,
      "running_60": 0,
      "running_300": 0,
      "running_1440": 0,
      "AppsSubmitted": 1,
      "AppsRunning": 0,
      "AppsPending": 0,
      "AppsCompleted": 0,
      "AppsKilled": 0,
      "AppsFailed": 1,
      "AllocatedMB": 0,
      "AllocatedVCores": 0,
      "AllocatedContainers": 0,
      "AggregateContainersAllocated": 2,
      "AggregateContainersReleased": 2,
      "AvailableMB": 64512,
      "AvailableVCores": 24,
      "PendingMB": 0,
      "PendingVCores": 0,
      "PendingContainers": 0,
      "ReservedMB": 0,
      "ReservedVCores": 0,
      "ReservedContainers": 0,
      "ActiveUsers": 0,
      "ActiveApplications": 0
    },
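
A small Python sketch of how one might read the QueueMetrics bean above. The field names and values are copied from the snapshot; the interpretation rules and the function name are assumptions made for illustration only.

```python
def summarize_queue_metrics(bean):
    """Turn a QueueMetrics bean into a few plain-language observations."""
    notes = []
    failed = bean.get("AppsFailed", 0)
    if failed > 0:
        notes.append(f"{failed} application(s) failed")
    if bean.get("AppsPending", 0) == 0 and bean.get("PendingContainers", 0) == 0:
        notes.append("nothing is pending in the scheduler")
    notes.append(
        f"{bean.get('AvailableMB', 0)} MB and "
        f"{bean.get('AvailableVCores', 0)} vcores available"
    )
    return notes


# Subset of the root queue snapshot posted above.
SNAPSHOT = {
    "AppsSubmitted": 1,
    "AppsRunning": 0,
    "AppsPending": 0,
    "AppsFailed": 1,
    "AggregateContainersAllocated": 2,
    "AggregateContainersReleased": 2,
    "AvailableMB": 64512,
    "AvailableVCores": 24,
    "PendingContainers": 0,
}

for note in summarize_queue_metrics(SNAPSHOT):
    print("-", note)
```

Read this way, the snapshot shows free capacity, no pending requests, and containers that were allocated and then released, which fits an AM that started and then failed more than a cluster that is short on resources.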

On Thu, Aug 18, 2016 at 6:49 PM, Sunil Govind <[hidden email]> wrote:
Hi

It could be for any of several reasons. Also, I am not sure which scheduler you are using; please share more details, such as the RM log.

I can point out a few possible reasons:
 - Not enough resources in the cluster can cause this.
 - If you are using the Capacity Scheduler and the queue capacity is maxed out, this can happen.
 - Similarly, if max-am-resource-percent is exceeded at the queue level, the AM container may not be launched.

You could check the RM log to get more information on whether the AM container was launched.

Thanks
Sunil

On Fri, Aug 19, 2016 at 5:37 AM rammohan ganapavarapu <[hidden email]> wrote:
Hi,

When I submit an MR job, I get this in the AM UI and the job never finishes. What am I missing?

Thanks,
Ram


