HDFS, mail # user - Re: Who splits the file into blocks

Re: Who splits the file into blocks
Rahul Bhattacharjee 2013-03-31, 16:46
I think what Sai was asking is this: when the client asks the namenode for
a list of datanodes, how does the namenode know how many blocks will be
required to store the entire file?

I think the way it works is that the client asks the NN to allocate one
block at a time. The client writes the first block to the datanodes that the
NN specified, then asks the NN for the next block, and so on. The NN never
needs to know the block count in advance; the client knows when EOF is
reached and simply stops asking.
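
A minimal sketch of that loop, as a toy model and not the real HDFS client
API. ToyNameNode, add_block, write_file and the 8-byte BLOCK_SIZE are all
hypothetical names and values chosen for illustration, and the round-robin
choice of datanodes is a placeholder for real placement:

```python
import io

BLOCK_SIZE = 8  # bytes; tiny on purpose (real HDFS blocks are typically 64 MB or 128 MB)


class ToyNameNode:
    """Hypothetical stand-in for the NameNode: hands out one block at a time."""

    def __init__(self, datanodes, replication=3):
        self.datanodes = datanodes
        self.replication = replication
        self.blocks = []  # (block_id, [datanode, ...]) in allocation order

    def add_block(self):
        block_id = len(self.blocks)
        # Placeholder policy: rotate through the datanodes.
        targets = [self.datanodes[(block_id + i) % len(self.datanodes)]
                   for i in range(self.replication)]
        self.blocks.append((block_id, targets))
        return block_id, targets


def write_file(stream, namenode, block_size=BLOCK_SIZE):
    """Client side: cut the stream into blocks, requesting each block only when needed."""
    written = []
    while True:
        chunk = stream.read(block_size)
        if not chunk:  # EOF: the client stops asking for blocks
            break
        block_id, targets = namenode.add_block()
        written.append((block_id, targets, chunk))
    return written


nn = ToyNameNode(["dn1", "dn2", "dn3", "dn4"])
blocks = write_file(io.BytesIO(b"x" * 20), nn)
for block_id, targets, chunk in blocks:
    print(block_id, targets, len(chunk))
```

In this model the namenode only learns the block count as a side effect:
a 20-byte stream with 8-byte blocks ends up as ceil(20 / 8) = 3 allocations,
the last one partly filled.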

Jens has described how the NN decides where to place each block, that is,
on which DNs the replicas are written.

On Sun, Mar 31, 2013 at 10:00 PM, Jens Scheidtmann wrote:

> Dear Sai Sai,
> "Hadoop: The Definitive Guide" says regarding default replica placement:
> - first replica is placed on the same node as the client (lowest bandwidth
> penalty).
> - second replica is placed off-rack, on a random node of a different rack
> (avoiding busy racks).
> - third replica is placed on a random node on the rack where the second
> replica is stored.
> - other replicas are placed on random nodes of the cluster (avoiding busy
> racks).
> If the client is not on the cluster, the first replica is placed on a
> random node (avoiding busy racks).
> Best regards,
> Jens
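
The placement rules quoted above can be sketched roughly as follows. This is
an illustration under stated assumptions, not the NameNode's actual code:
choose_replicas and the topology dict (rack name to node names) are
hypothetical, the "avoiding busy racks" part is not modelled, and the
fallbacks for very small clusters are my own guess at sensible behaviour.

```python
import random


def choose_replicas(topology, client_node=None, replication=3, rng=random):
    """Pick replica nodes in order. topology maps rack name -> list of node names."""
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    all_nodes = list(rack_of)

    def unused(nodes):
        return [n for n in nodes if n not in chosen]

    # 1st replica: the client's own node if it is in the cluster, else a random node.
    chosen = [client_node if client_node in rack_of else rng.choice(all_nodes)]

    while len(chosen) < replication:
        if len(chosen) == 1:
            # 2nd replica: random node on a different rack from the first.
            preferred = [n for n in all_nodes if rack_of[n] != rack_of[chosen[0]]]
        elif len(chosen) == 2:
            # 3rd replica: another node on the same rack as the second.
            preferred = unused(topology[rack_of[chosen[1]]])
        else:
            # Further replicas: any random node not used yet.
            preferred = []
        candidates = preferred or unused(all_nodes)  # assumed fallback
        if not candidates:  # cluster smaller than the replication factor
            break
        chosen.append(rng.choice(candidates))
    return chosen


topology = {
    "rack1": ["a1", "a2"],
    "rack2": ["b1", "b2"],
    "rack3": ["c1", "c2"],
}
print(choose_replicas(topology, client_node="a1"))
```

Passing client_node=None (or a name that is not in topology) models a client
outside the cluster, in which case the first replica lands on a random node.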