Understanding Hadoop - help needed
Dibyendu Karmakar 2013-01-23, 09:24
Hi,
I am doing some performance testing on Hadoop and have run into a situation where I need your help.

My Hadoop cluster: 6 DataNodes, 1 NameNode, 2 clients. Replication factor = 3.

The 2 clients write a file (put operation) whose size is 2 x block size. The capacity of DFS.DATA.DIR on every DataNode is identical and equal to the block size, which means each DataNode stores a single block.

Now, if the 2 clients read the file simultaneously (get operation):

Will the 2 clients read the 2 blocks from different DataNodes, or will they read from the same DataNodes?
Does the NameNode know which DataNode is busy and which one is idle?

What I am trying to find out is: is it possible to decrease the read time by increasing the replication factor?

I have attached an image to help explain my question. Kindly take a look.

Thank you. If possible, please give references.
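To make the layout above concrete, here is a minimal sketch of the arithmetic behind it. The function name and parameters are hypothetical, chosen only for illustration, and the sketch assumes block replicas are spread evenly across the DataNodes. It does not model how the NameNode actually places replicas or picks one for a read.

```python
def replicas_per_datanode(file_blocks, replication, datanodes):
    """Average number of block replicas each DataNode holds.

    Assumes replicas are spread evenly; this is an illustration of the
    arithmetic only, not of HDFS's real placement policy.
    """
    total_replicas = file_blocks * replication
    return total_replicas / datanodes


# The setup described above: a 2-block file, replication factor 3, 6 DataNodes.
per_node = replicas_per_datanode(file_blocks=2, replication=3, datanodes=6)
print(per_node)
```

With these numbers there are 6 block replicas in total, one per DataNode, so each of the 2 blocks is available on 3 different DataNodes when the clients read.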