Re: Understanding harpoon - help needed
bharath vissapragada 2013-01-23, 09:44
Hi,
Link [1] partly answers your question. The Namenode chooses the "nearest" Datanode that can serve the read request. So replication definitely helps, in the sense that a replica might be placed on a node nearer to the client. I'm not sure whether the Namenode checks if a Datanode is busy serving other requests, so I'll leave that for others to answer.

[1] http://hadoop.apache.org/docs/r0.20.2/hdfs_design.html#Replica+Selection

Thanks,
Bharath

On Wed, Jan 23, 2013 at 2:54 PM, Dibyendu Karmakar <[EMAIL PROTECTED]> wrote:
> Hi,
> I am doing some performance testing in Hadoop, and while testing I ran
> into a situation where I need your help.
>
> My Hadoop cluster:
> 6 Datanodes, 1 Namenode, 2 Clients.
>
> Replication factor = 3
>
> 2 clients write a file (put operation) whose size is 2 x block size.
> DFS.DATA.DIR on every Datanode has the same size, equal to the block size.
> That means each Datanode stores a single block.
>
> Now, if the 2 clients read the file simultaneously (get operation),
> will the 2 clients read the 2 blocks from different Datanodes?
> Or will they read from the same Datanodes?
>
> Does the Namenode know which Datanode is busy and which one is idle?
>
> What I am trying to find out is:
> Is it possible to decrease the read time by increasing the replication factor?
>
> I have attached an image to help explain my question. Kindly take a
> look. Thank you. And if possible, please give references.
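
The "nearest replica" idea from the reply can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Hadoop's actual code: the `Node` class, the `distance` and `order_replicas` functions, the node names, and the rack paths are all hypothetical, and the distance values (0 for the same node, 2 for the same rack, 4 for a different rack) are loosely modeled on the usual rack-aware convention. The model deliberately ignores how busy a Datanode is, since the thread leaves that question open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A machine in the cluster (hypothetical model, not a Hadoop class)."""
    name: str
    rack: str


def distance(a: Node, b: Node) -> int:
    """Toy network distance: 0 same node, 2 same rack, 4 different rack."""
    if a == b:
        return 0
    if a.rack == b.rack:
        return 2
    return 4


def order_replicas(client: Node, replicas: list[Node]) -> list[Node]:
    """Return the replica holders sorted from nearest to farthest.

    sorted() is stable, so replicas at equal distance keep their input order.
    Load on each Datanode is NOT considered in this sketch.
    """
    return sorted(replicas, key=lambda dn: distance(client, dn))


# Three replicas of one block (replication factor = 3), spread over two racks.
replicas = [
    Node("dn1", "/rack2"),
    Node("dn2", "/rack1"),
    Node("dn3", "/rack2"),
]

client_a = Node("client-a", "/rack1")
client_b = Node("client-b", "/rack2")

# Each client is pointed first at a replica on its own rack.
first_for_a = order_replicas(client_a, replicas)[0]
first_for_b = order_replicas(client_b, replicas)[0]
print(first_for_a.name, first_for_b.name)
```

In this model, two clients on different racks are sent to different Datanodes for the same block, which is the sense in which more replicas can help reads. Two clients on the same rack would get the same ordering, so any load balancing between them would have to come from something this sketch does not model.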