Hadoop >> mail # user >> Understanding harpoon - help needed


Dibyendu Karmakar 2013-01-23, 09:24
Re: Understanding harpoon - help needed
Hi,

Link [1] partly answers your question. The Namenode chooses the "nearest"
Datanode that can serve the read request. So replication definitely helps, in
the sense that a replica might be placed on a node nearer to the client.
I'm not sure whether the Namenode checks if a Datanode is busy serving
other requests, so I'll leave that for others to answer.

[1] http://hadoop.apache.org/docs/r0.20.2/hdfs_design.html#Replica+Selection
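
As a rough illustration of what "nearest" can mean here, the sketch below orders replica locations by a tree-style network distance (hops from each node up to their closest common ancestor, added together). The rack paths, node names, and the `distance` and `order_replicas` helpers are hypothetical and only meant to show the idea; this is not the actual Namenode code, and it deliberately ignores how busy each Datanode is.

```python
def distance(a, b):
    """Sum of hops from each node up to their closest common ancestor.

    Nodes are written as topology paths such as "/rack1/dn1"
    (an assumed naming scheme for this sketch).
    """
    pa = a.strip("/").split("/")
    pb = b.strip("/").split("/")
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)


def order_replicas(client, replicas):
    """Return replica locations sorted from nearest to farthest from the client."""
    return sorted(replicas, key=lambda r: distance(client, r))


if __name__ == "__main__":
    # Hypothetical layout: three replicas of one block, client sits in rack1.
    replicas = ["/rack2/dn4", "/rack1/dn2", "/rack2/dn5"]
    for node in order_replicas("/rack1/client1", replicas):
        print(node, distance("/rack1/client1", node))
```

Under this model, a higher replication factor raises the chance that some replica is at a small distance from any given client, which is the sense in which replication can help reads.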

Thanks,
Bharath

On Wed, Jan 23, 2013 at 2:54 PM, Dibyendu Karmakar
<[EMAIL PROTECTED]> wrote:

> Hi,
> I am doing some performance testing on Hadoop and ran into a situation
> where I need your help.
>
> My Hadoop cluster:
> 6 Datanodes, 1 Namenode, 2 Clients.
>
> Replication factor = 3
>
> 2 clients write a file (put operation) whose size is 2 x block size.
> The capacity of DFS.DATA.DIR is the same on every Datanode and equals the
> block size. That means each Datanode stores a single block.
>
> Now, if the 2 clients read the file simultaneously (get operation),
> will they read the 2 blocks from different Datanodes?
> Or will they read from the same Datanodes?
>
> Does the Namenode know which Datanode is busy and which one is idle?
>
> What I am trying to find out is:
> is it possible to decrease the read time by increasing the replication factor?
>
> I have attached an image to help explain my question. Kindly take a
> look. Thank you. If possible, please give references.
>