MapReduce >> mail # user >> HDFS performance with an without replication


RE: HDFS performance with an without replication
Thanks, that makes sense.
john

-----Original Message-----
From: Harsh J [mailto:[EMAIL PROTECTED]]
Sent: Sunday, September 15, 2013 12:39 PM
To: <[EMAIL PROTECTED]>
Subject: Re: HDFS performance with an without replication

Write performance improves with fewer replicas (a result of the synchronous, sequenced write pipelines in HDFS). Reads would be the same, unless (in the worst case) you're unable to schedule a rack-local read because only one (busy) rack holds the data.
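
To illustrate why a sequenced pipeline slows writes as replicas are added, here is a rough back-of-the-envelope sketch. It is a toy model, not HDFS code and not a measurement: the class name, the per-hop timings, and the packet count are all hypothetical, and it assumes each packet passes through the replicas one after another, with the pipeline paced by its slowest hop.

```java
import java.util.Arrays;

/** Toy model of a sequenced write pipeline. Hypothetical; not part of Hadoop. */
class PipelineWriteModel {

    /**
     * Estimates the time to push `packets` packets through a chain of replicas.
     * hopMicros[i] is an assumed per-packet cost (microseconds) at the i-th replica.
     * The first packet pays for every hop; each later packet is paced by the slowest hop.
     */
    static long estimateMicros(long packets, long[] hopMicros) {
        if (packets < 1 || hopMicros.length == 0) {
            throw new IllegalArgumentException("need at least one packet and one replica");
        }
        long fill = Arrays.stream(hopMicros).sum();                 // pipeline fill time
        long slowest = Arrays.stream(hopMicros).max().getAsLong();  // steady-state pace
        return fill + (packets - 1) * slowest;
    }

    public static void main(String[] args) {
        long packets = 2048; // assumption: one 128 MB block sent as 64 KB packets

        // All per-hop timings below are made-up illustrative values.
        long oneReplica = estimateMicros(packets, new long[] {500});
        long threeEqual = estimateMicros(packets, new long[] {500, 500, 500});
        long threeOneSlow = estimateMicros(packets, new long[] {500, 500, 800});

        System.out.println("replication=1:                " + oneReplica + " us");
        System.out.println("replication=3, equal hops:    " + threeEqual + " us");
        System.out.println("replication=3, one slow node: " + threeOneSlow + " us");
    }
}
```

In this sketch, three equally fast replicas cost only the extra pipeline fill, but a single slow or busy replica sets the pace for the whole write. The model also ignores contention: three replicas mean three times the aggregate disk and network I/O on the cluster, which a real comparison would need to measure.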

On Sun, Sep 15, 2013 at 10:38 PM, John Lilley <[EMAIL PROTECTED]> wrote:
> In our YARN application, we are considering whether to store temporary
> data with replication=1 or replication=3 (or give the user an option).  
> Obviously there is a tradeoff between reliability and performance, but
> on smaller clusters I'd expect this to be less of an issue.
>
>
>
> What is the difference in write performance using replication=1 vs 3?  
> For reading I'd expect the performance to be roughly equivalent.
>
>
>
> john

--
Harsh J