Hadoop >> mail # user >> Understanding harpoon - help needed


Understanding harpoon - help needed
Hi,
I am doing some performance testing on Hadoop and ran into a situation I
need your help with.

My Hadoop cluster:
6 DataNodes, 1 NameNode, 2 clients.

Replication factor = 3

The 2 clients write a file (put operation) whose size is 2 x the block size.
The storage under dfs.data.dir on each DataNode is the same size and equals
the block size. That means each DataNode stores a single block.
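
As a quick sanity check on these numbers, here is a minimal sketch in Python. The names are hypothetical illustrations, not Hadoop APIs, and it assumes a single 2-block file ends up stored in the cluster.

```python
# Values taken from the cluster setup described in this message.
NUM_DATANODES = 6
REPLICATION_FACTOR = 3
BLOCKS_IN_FILE = 2        # file size is 2 x block size
BLOCKS_PER_DATANODE = 1   # assumption: each DataNode has room for exactly one block


def total_replicas(blocks, replication):
    """Number of block replicas the cluster must hold for one file."""
    return blocks * replication


def cluster_capacity(datanodes, blocks_per_node):
    """Number of block replicas the cluster has room for."""
    return datanodes * blocks_per_node


replicas = total_replicas(BLOCKS_IN_FILE, REPLICATION_FACTOR)
capacity = cluster_capacity(NUM_DATANODES, BLOCKS_PER_DATANODE)
print(replicas, capacity)
```

Under these assumptions the replica count matches the capacity exactly, so every DataNode holds one replica and each block can be served by 3 different DataNodes.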

Now, if the 2 clients read the file simultaneously (get operation),
will they read the 2 blocks from different DataNodes,
or will they read from the same DataNodes?

Does the NameNode know which DataNode is busy and which one is idle?

What I am trying to find out is this:
is it possible to decrease the read time by increasing the replication factor?

I have attached an image to help explain my question. Kindly take a
look. Thank you. If possible, please give references.
bharath vissapragada 2013-01-23, 09:44