How to Speed Up Decommissioning progress of a datanode.


How to Speed Up Decommissioning progress of a datanode.

sravankumar
Hi,

Does anyone know how to speed up datanode decommissioning, and which configuration settings affect it? How can the data transfer off the datanode being decommissioned be sped up?

Thanks & Regards,

Sravan kumar.
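
For reference, decommissioning is normally driven from an exclude file plus a refreshNodes call; a minimal sketch, assuming a 0.20-era layout where dfs.hosts.exclude already points at conf/excludes (the hostname dn5.example.com and the exclude-file path are illustrative):

$ echo "dn5.example.com" >> conf/excludes   # mark the node for decommissioning
$ bin/hadoop dfsadmin -refreshNodes         # namenode starts re-replicating its blocks
$ bin/hadoop dfsadmin -report               # status moves from "Decommission in progress"
                                            # to "Decommissioned" when the node is safe to stop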


Re: How to Speed Up Decommissioning progress of a datanode.

Adarsh Sharma
In reply to this post by sravankumar
Check the attachment.
--Adarsh


Balancing Data among Datanodes: HDFS will not move blocks to new nodes automatically. However, newly created files will likely have their blocks placed on the new nodes.


There are several ways to rebalance the cluster manually.


-Select a subset of files that take up a good percentage of your disk space; copy them to new locations in HDFS; remove the old copies of the files; rename the new copies to their original names.
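
A minimal sketch of that sequence (the paths are illustrative):

$ bin/hadoop fs -cp /data/big /data/big.rebalanced   # the new copy lands partly on the new nodes
$ bin/hadoop fs -rmr /data/big                       # remove the old, poorly placed copy
$ bin/hadoop fs -mv /data/big.rebalanced /data/big   # restore the original name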

-A simpler way, with no interruption of service, is to turn up the replication factor of the files, wait for the transfers to stabilize, and then turn the replication factor back down.
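
For example (the replication factors and path are illustrative; -w waits until the transfers finish):

$ bin/hadoop fs -setrep -w 5 /data/big   # raise replication; new replicas spread across nodes
$ bin/hadoop fs -setrep 3 /data/big      # then drop back to the normal factor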
-Yet another way to rebalance blocks is to shut down the datanode that is full, wait until its blocks have been replicated elsewhere, and then bring it back. The over-replicated blocks will be removed at random from different nodes, so the data is genuinely rebalanced, not merely removed from the current node.
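
A sketch of that bounce, assuming the stock daemon scripts:

$ bin/hadoop-daemon.sh stop datanode     # run on the full datanode
# wait until the namenode reports no under-replicated blocks for it
# (dfsadmin -report or the namenode web UI), then:
$ bin/hadoop-daemon.sh start datanode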

-Finally, you can use the bin/start-balancer.sh command to run a balancing process to move blocks around the cluster automatically.


bash-3.2$ bin/start-balancer.sh
or

$ bin/hadoop balancer -threshold 10

starting balancer, logging to /home/hadoop/project/hadoop-0.20.2/bin/../logs/hadoop-hadoop-balancer-ws-test.out

Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved

The cluster is balanced. Exiting...
Balancing took 350.0 milliseconds
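
The balancer is safe to interrupt at any time and rerun later:

$ bin/stop-balancer.sh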

A cluster is balanced if and only if there are no under-capacity or over-capacity datanodes in the cluster.
An under-capacity datanode is one whose %-used space is less than avg_%used_space - threshold.
An over-capacity datanode is one whose %-used space is greater than avg_%used_space + threshold.
The threshold is user-configurable (the -threshold option above); a default value could be 20% of used space.

Re: How to Speed Up Decommissioning progress of a datanode.

baggio liu
In reply to this post by sravankumar
You can use metasave to find the bottleneck in decommissioning speed.
If the bottleneck is the rate at which the namenode dispatches replication
work, you can raise dfs.max-repl-streams to a larger number (default 2).
If many block replication tasks time out from the pending-replication queue
back into the needed-replication queue, you can lower
dfs.replication.pending.timeout.sec to make block replication more aggressive.
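
A minimal sketch (the filename metasave.log is illustrative; metasave writes into the namenode's log directory, and the property values below are examples, not recommendations):

$ bin/hadoop dfsadmin -metasave metasave.log   # dumps block-replication state
$ grep -i "replication" logs/metasave.log      # look for blocks waiting or timed out
# then tune in conf/hdfs-site.xml on the namenode (restart required), e.g.:
#   <property><name>dfs.max-repl-streams</name><value>8</value></property>
#   <property><name>dfs.replication.pending.timeout.sec</name><value>60</value></property>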

Pay attention!! Please check your Hadoop version: if block transfer has no
speed limit, the re-replication traffic may saturate your bandwidth.
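
If throttling is available in your version, the knob I know of is dfs.balance.bandwidthPerSec; note that in 0.20-era releases it caps balancer traffic only, not decommission re-replication (a hedged sketch; the 10 MB/s value is illustrative):

# in conf/hdfs-site.xml on each datanode (restart needed):
#   <property>
#     <name>dfs.balance.bandwidthPerSec</name>
#     <value>10485760</value>   <!-- 10 MB/s; default is 1048576 (1 MB/s) -->
#   </property>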


Thanks & Best regards
Baggio

