Get Hadoop cluster topology

Diwakar Sharma
I understand that when the NameNode starts up, it reads the fsimage to get the state of HDFS and applies the edits file to bring that state up to date.

But what about the cluster topology? Does the NameNode read config files such as core-site.xml, slaves, etc. to determine the cluster topology, or does it use an API to build it?


Thanks
Diwakar
Re: Get Hadoop cluster topology

Shashwat

On Tue, Apr 16, 2013 at 11:34 PM, Diwakar Sharma <[hidden email]> wrote:
... cluster topology or uses an API to build it.

If you stop and start the cluster, Hadoop reads these configuration files for sure.

           

Shashwat Shriparv

Re: Get Hadoop cluster topology

Nikhil-2
From http://archive.cloudera.com/cdh/3/hadoop/hdfs_user_guide.html (assuming you are using Cloudera's Hadoop distribution, CDH3), the following command makes the NameNode re-read its host lists without a restart:

$ hadoop dfsadmin -refreshNodes

-refreshNodes : Updates the set of hosts allowed to connect to the namenode. Re-reads the config file to update the values defined by dfs.hosts and dfs.hosts.exclude and reads the entries (hostnames) in those files. Each entry not defined in dfs.hosts but in dfs.hosts.exclude is decommissioned. Each entry defined in dfs.hosts and also in dfs.hosts.exclude is stopped from decommissioning if it has already been marked for decommission. Entries not present in both lists are decommissioned.
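As a rough illustration of how -refreshNodes is usually driven, here is a minimal sketch. The helper name decommission_host and the exclude-file path in the usage line are hypothetical; the file must be whichever one your dfs.hosts.exclude property points to.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop): add a host to the exclude file
# and ask the NameNode to re-read its host lists.
# Arguments: <hostname> <path of the file named by dfs.hosts.exclude>
decommission_host() {
  local host="$1" exclude_file="$2"

  # Append the host only if it is not already listed (exact line match).
  if ! grep -qxF "$host" "$exclude_file" 2>/dev/null; then
    echo "$host" >> "$exclude_file"
  fi

  # Re-read dfs.hosts / dfs.hosts.exclude without restarting the NameNode.
  hadoop dfsadmin -refreshNodes
}

# Example call (path is an assumption, adjust to your configuration):
# decommission_host datanode07.example.com /etc/hadoop/conf/dfs.exclude
```

Afterwards, `hadoop dfsadmin -report` can be used to watch the decommission status of each datanode.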

There is also a -printTopology switch, which is useful for looking at the current topology view.

-printTopology : Print the topology of the cluster. Display a tree of racks and the datanodes attached to the racks, as viewed by the NameNode.
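To get a quick summary from that tree, something like the sketch below may help. It assumes the output consists of "Rack: <path>" lines, each followed by one indented line per datanode; the function name count_nodes_per_rack is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read -printTopology output on stdin and print
# "<rack> <number of datanodes>" for each rack, in the order seen.
# Assumes "Rack: <path>" header lines followed by one line per datanode.
count_nodes_per_rack() {
  awk '
    /^Rack:/             { rack = $2; order[++n] = rack; next }  # new rack header
    NF > 0 && rack != "" { count[rack]++ }                       # a datanode line
    END { for (i = 1; i <= n; i++) printf "%s %d\n", order[i], count[order[i]] }
  '
}

# Example call against a live cluster:
# hadoop dfsadmin -printTopology | count_nodes_per_rack
```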

In most cases, however, I have seen that updating the topology with wrong information (a bad rack number, stray tabs/spaces, and so on) leaves the master services in a bad state, and in such cases a restart is required.
I have tried looking for ways to refresh the topology cache on both the namenode and the jobtracker without bouncing them, but this can get a little tricky.
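For reference, rack information is typically supplied by a topology script that Hadoop calls with one or more hostnames or IPs and that prints one rack per argument. Below is a minimal sketch that tolerates mixed tabs and spaces in its data file. The data file path and its two-column layout are assumptions for illustration, and the script would be registered through topology.script.file.name in core-site.xml on Hadoop 1.x / CDH3.

```shell
#!/usr/bin/env bash
# Sketch of a rack-awareness topology script.
# TOPOLOGY_MAP is a hypothetical data file with lines of the form:
#   <host-or-ip> <tabs/spaces> </rack-path>
TOPOLOGY_MAP="${TOPOLOGY_MAP:-/etc/hadoop/conf/topology.data}"
DEFAULT_RACK="/default-rack"

# Print the rack for one host; fall back to the default rack when the
# host is not listed or the map file cannot be read.
resolve_rack() {
  local host="$1" rack=""
  if [ -r "$TOPOLOGY_MAP" ]; then
    # awk splits fields on any run of tabs/spaces, so mixed whitespace is fine;
    # lines without a second field are ignored.
    rack=$(awk -v h="$host" '$1 == h && NF >= 2 { print $2; exit }' "$TOPOLOGY_MAP")
  fi
  echo "${rack:-$DEFAULT_RACK}"
}

# Hadoop may pass several hosts in one call; emit one rack per argument.
for host in "$@"; do
  resolve_rack "$host"
done
```

Falling back to a default rack instead of printing nothing is a deliberate choice here, since a malformed entry then degrades to a flat topology for that host rather than producing empty output.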




On Tue, Apr 16, 2013 at 11:39 PM, shashwat shriparv <[hidden email]> wrote:

On Tue, Apr 16, 2013 at 11:34 PM, Diwakar Sharma <[hidden email]> wrote:
... cluster topology or uses an API to build it.

If you stop and start the cluster, Hadoop reads these configuration files for sure.

           

Shashwat Shriparv