RE: HDFS performance with and without replication
Thanks, that makes sense.

-----Original Message-----
From: Harsh J [mailto:[EMAIL PROTECTED]]
Sent: Sunday, September 15, 2013 12:39 PM
Subject: Re: HDFS performance with and without replication

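Write performance improves with fewer replicas, because HDFS write
pipelines are synchronous and sequenced: data is forwarded from one
DataNode to the next, and a write is acknowledged only after every
replica in the pipeline has received it. Read performance would be the
same, unless (in the worst case) you cannot schedule a rack-local read
because only one busy rack holds the data.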
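
Since the actual write penalty depends on the cluster, one way to answer
the question is to time the same upload at each replication factor. The
following is a minimal sketch, not a definitive benchmark. It assumes the
`hdfs` command is on the PATH and relies on the generic `-D` option to
override `dfs.replication` for a single `-put`. The function names
`build_put_command` and `time_put` are hypothetical helpers introduced
here for illustration.

```python
import subprocess
import time


def build_put_command(local_path, remote_path, replication):
    """Build an `hdfs dfs -put` command that overrides dfs.replication
    for this one upload only."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return [
        "hdfs", "dfs",
        # Generic options such as -D must come before the -put subcommand.
        "-D", f"dfs.replication={replication}",
        # -f overwrites the destination if it already exists.
        "-put", "-f",
        local_path, remote_path,
    ]


def time_put(local_path, remote_path, replication, runner=subprocess.run):
    """Return the wall-clock seconds taken by one upload.

    `runner` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without a cluster.
    """
    cmd = build_put_command(local_path, remote_path, replication)
    start = time.perf_counter()
    runner(cmd, check=True)  # raises CalledProcessError if the upload fails
    return time.perf_counter() - start
```

Calling `time_put(local_file, hdfs_path, 1)` and then
`time_put(local_file, hdfs_path, 3)` with a reasonably large local file
gives a rough comparison. Repeating each run several times helps, since
a single measurement is easily skewed by caching and other cluster load.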

On Sun, Sep 15, 2013 at 10:38 PM, John Lilley <[EMAIL PROTECTED]> wrote:
> In our YARN application, we are considering whether to store temporary
> data with replication=1 or replication=3 (or give the user an option).  
> Obviously there is a tradeoff between reliability and performance, but
> on smaller clusters I'd expect this to be less of an issue.
> What is the difference in write performance using replication=1 vs 3?  
> For reading I'd expect the performance to be roughly equivalent.
> john

Harsh J