
Which part of Hadoop is responsible for distributing the input file fragments to datanodes?


salma khalil


Hi,

I am trying to find the part of Hadoop that is responsible for distributing the input file fragments to the datanodes. I need to understand the source code that performs this distribution.

Can anyone help me locate this part of the code? I tried reading NameNode.java, but I could not find anything helpful there.

Thanks in advance,
Salam
Re: Which part of Hadoop is responsible for distributing the input file fragments to datanodes?

Harsh J-2
Assuming you mean the HDFS file-writing code, look at DFSClient
and its use of DFSOutputStream (see the write(…) areas).
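
To make the client-side idea concrete, here is a minimal toy sketch of cutting a byte stream into fixed-size blocks before they are handed off for storage. It is not Hadoop's DFSOutputStream code; the class and method names (BlockSplitterSketch, splitIntoBlocks) are hypothetical, and the tiny block size is only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: NOT Hadoop's DFSOutputStream. All names here are hypothetical.
final class BlockSplitterSketch {

    /** Cuts data into consecutive chunks of at most blockSize bytes. */
    static List<byte[]> splitIntoBlocks(byte[] data, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<byte[]> blocks = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset += blockSize) {
            // The last block may be shorter than blockSize.
            int end = Math.min(offset + blockSize, data.length);
            blocks.add(Arrays.copyOfRange(data, offset, end));
        }
        return blocks;
    }

    public static void main(String[] args) {
        byte[] file = new byte[10]; // stand-in for the input file contents
        List<byte[]> blocks = splitIntoBlocks(file, 4);
        for (int i = 0; i < blocks.size(); i++) {
            System.out.println("block " + i + ": " + blocks.get(i).length + " bytes");
        }
    }
}
```

In the real client, each new block additionally involves asking the NameNode where the replicas should go; the sketch leaves that step out.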

On Sun, Nov 11, 2012 at 4:36 PM, salmakhalil <[hidden email]> wrote:

> Hi,
>
> I am trying to find the part of Hadoop that is responsible for distributing
> the input file fragments to the datanodes. I need to understand the source
> code that performs this distribution.
>
> Can anyone help me locate this part of the code? I tried reading
> NameNode.java, but I could not find anything helpful there.
>
> Thanks in advance,
> Salam



--
Harsh J
Re: Which part of Hadoop is responsible for distributing the input file fragments to datanodes?

salma khalil
What I want to do exactly is redistribute the input file fragments over the nodes of the cluster according to some calculations. I need to find the part that starts distributing the input file so that I can put my own code there instead.
Re: Which part of Hadoop is responsible for distributing the input file fragments to datanodes?

Yanbo Liang
I guess you mean that you want to set your own block distribution strategy.
If so, follow this call chain in the code:
FSNamesystem.getAdditionalBlock() ---> BlockManager.chooseTarget()
 ---> BlockPlacementPolicy.chooseTarget().
You need to implement your own BlockPlacementPolicy.
Then, when a client issues an addBlock RPC, the NameNode will assign DataNodes
to store the replicas according to your rules.
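
As a rough illustration of what "your own rules" could look like, here is a self-contained toy sketch of a pluggable placement policy. The types below (NodeInfo, PlacementPolicySketch, MostFreeSpacePolicy) are hypothetical and only mimic the shape of the idea; the real BlockPlacementPolicy is an abstract class whose chooseTarget signature differs and varies between Hadoop versions, so check the source of your release. The rule used here (prefer the nodes with the most free space) is just an assumed example of "some calculations".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model only: these types are hypothetical and are NOT Hadoop classes.
final class NodeInfo {
    final String name;
    final long freeBytes;

    NodeInfo(String name, long freeBytes) {
        this.name = name;
        this.freeBytes = freeBytes;
    }
}

// Stand-in for the role BlockPlacementPolicy plays on the NameNode.
interface PlacementPolicySketch {
    List<NodeInfo> chooseTarget(int replicas, List<NodeInfo> liveNodes);
}

// Example custom rule: pick the nodes with the most free space.
final class MostFreeSpacePolicy implements PlacementPolicySketch {
    @Override
    public List<NodeInfo> chooseTarget(int replicas, List<NodeInfo> liveNodes) {
        List<NodeInfo> sorted = new ArrayList<>(liveNodes);
        sorted.sort(Comparator.comparingLong((NodeInfo n) -> n.freeBytes).reversed());
        // Never return more targets than there are live nodes.
        int count = Math.min(Math.max(replicas, 0), sorted.size());
        return new ArrayList<>(sorted.subList(0, count));
    }
}

final class PlacementDemo {
    public static void main(String[] args) {
        List<NodeInfo> nodes = Arrays.asList(
                new NodeInfo("dn1", 10L),
                new NodeInfo("dn2", 50L),
                new NodeInfo("dn3", 30L));
        PlacementPolicySketch policy = new MostFreeSpacePolicy();
        for (NodeInfo target : policy.chooseTarget(2, nodes)) {
            System.out.println(target.name);
        }
    }
}
```

Keeping the rule behind an interface mirrors the design described above: the code that handles the addBlock request stays unchanged, and only the policy implementation is swapped.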

2012/11/15 salmakhalil <[hidden email]>

> What I want to do exactly is redistribute the input file fragments over
> the nodes of the cluster according to some calculations. I need to find
> the part that starts distributing the input file so that I can put my own
> code there instead.