MapReduce >> mail # user >> Replication


Re: Replication
Hi,

Yes. If you are purely a regular client (a non-DN box) writing to HDFS,
then the NameNode selects the target DNs at random, within the limits of
the cross-rack write policy if rack awareness applies to your environment.
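
To make the selection concrete, here is a rough Python sketch of the
choice described in this thread. It is a simplified model, not the
NameNode's actual code: the function name choose_targets, the topology
dict, and the node and rack names are all hypothetical. It assumes the
commonly documented default policy (first replica on the writer if it is
a DN, otherwise a random node; second replica on a different rack; third
on the same rack as the second) and ignores load, free space, and node
health.

```python
import random


def choose_targets(topology, writer=None, replication=3, rng=random):
    """Pick replica targets for one block (simplified, hypothetical model).

    topology maps a datanode name to its rack name.
    writer is the client host; it only matters if it is also a datanode.
    """
    if not topology:
        raise ValueError("topology must contain at least one datanode")
    nodes = sorted(topology)
    targets = []

    def pick(candidates):
        # Return one random node not already chosen, or None if none is left.
        unused = [n for n in candidates if n not in targets]
        return rng.choice(unused) if unused else None

    # Replica 1: the writer itself when it is a datanode, otherwise random.
    first = writer if writer in topology else pick(nodes)
    targets.append(first)

    # Replica 2: prefer a node on a different rack than replica 1.
    if len(targets) < replication:
        off_rack = [n for n in nodes if topology[n] != topology[first]]
        second = pick(off_rack) or pick(nodes)
        if second is not None:
            targets.append(second)

    # Replica 3: prefer another node on the same rack as replica 2.
    if len(targets) == 2 and replication > 2:
        same_rack = [n for n in nodes if topology[n] == topology[targets[1]]]
        third = pick(same_rack) or pick(nodes)
        if third is not None:
            targets.append(third)

    # Any further replicas: random nodes that are still unused.
    while len(targets) < replication:
        extra = pick(nodes)
        if extra is None:
            break
        targets.append(extra)

    return targets[:replication]


if __name__ == "__main__":
    cluster = {"dn1": "rackA", "dn2": "rackA", "dn3": "rackB", "dn4": "rackB"}
    # Job launched from a datanode: the first target is that node.
    print(choose_targets(cluster, writer="dn1"))
    # Job launched from a non-DN client: the first target is random.
    print(choose_targets(cluster, writer="edge-node"))
```

In this model, a Pig job launched from dn1 always gets dn1 as its first
target, while a job launched from a machine outside the cluster gets a
random first target; the remaining replicas follow the rack rule in both
cases.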

On Wed, Oct 31, 2012 at 6:43 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> Thanks. And if the client is not a datanode, then I am guessing the namenode
> decides the nodes in the replication pipeline?
>
>
> On Tue, Oct 30, 2012 at 5:36 PM, ranjith raghunath
> <[EMAIL PROTECTED]> wrote:
>>
>> If your client node is a datanode within your cluster, then the first copy
>> does get written to that datanode.
>>
>> Experts please feel free to correct me here.
>>
>> On Oct 30, 2012 7:11 PM, "Mohit Anchlia" <[EMAIL PROTECTED]> wrote:
>>>
>>> With respect to replication: if I run a Pig job from one of the nodes within
>>> the Hadoop cluster, do I always end up writing 1 replica copy to that
>>> client node and the remaining 2 replica copies to other nodes?
>>>
>
>

--
Harsh J