For 3 replicas, the placement order is: the 1st replica goes on the writer's local node, the 2nd on a node in a different (remote) rack from the 1st, and the 3rd on a different node in the same rack as the 2nd.
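To make that order concrete, here is a rough Python sketch of it. This is not Hadoop's actual code; the function name, the node names and the rack paths are all made up for illustration, and it ignores the fallback cases (full disk, no remote rack available).

```python
import random

def place_replicas(writer, topology, rng=random):
    """Pick three nodes in the order described above.

    topology maps node name -> rack path (hypothetical names).
    """
    first = writer  # 1st replica: the writer's local node
    # 2nd replica: any node on a rack other than the first replica's rack
    remote = [n for n, r in topology.items() if r != topology[first]]
    second = rng.choice(remote)
    # 3rd replica: a different node on the same rack as the 2nd replica
    same_rack = [n for n, r in topology.items()
                 if r == topology[second] and n != second]
    third = rng.choice(same_rack)
    return [first, second, third]

# Hypothetical two-rack cluster, two nodes per rack
topology = {"n1": "/rack1", "n2": "/rack1", "n3": "/rack2", "n4": "/rack2"}
replicas = place_replicas("n1", topology)
print(replicas)
```

With a writer on n1, the 2nd and 3rd replicas both land on /rack2, so a cluster with writers on only one rack will put two thirds of its block copies on the other rack.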
There are some special cases, such as the disk being full on the 1st node or no node being available in the remote rack for the 2nd replica, and Hadoop already handles them well. I agree with Harsh: you should first check whether tasks are evenly distributed across the two racks.
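One simple way to do that check is to count task attempts per rack. The sketch below assumes you have already collected the host of each task attempt by hand (for example from the JobTracker web UI) and that you know which rack each host is mapped to; the function name and the sample hosts are hypothetical.

```python
from collections import Counter

def tasks_per_rack(task_hosts, host_to_rack):
    """Count task attempts per rack; unmapped hosts count as '/default-rack'."""
    return Counter(host_to_rack.get(h, "/default-rack") for h in task_hosts)

# Hypothetical data: hosts that ran task attempts, and the host -> rack mapping
host_to_rack = {"n1": "/rack1", "n2": "/rack1", "n3": "/rack2", "n4": "/rack2"}
task_hosts = ["n1", "n2", "n1", "n3"]
counts = tasks_per_rack(task_hosts, host_to_rack)
print(dict(counts))
```

If one rack runs most of the tasks, most writes start there, and the other rack ends up holding more replicas.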
----- Original Message -----
From: "Michel Segel" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
Sent: Thursday, August 22, 2013 6:57:15 PM
Subject: Re: rack awarness unexpected behaviour
Rack awareness is an artificial concept.
Meaning you can define where a node is regardless of its real position in the rack.
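As an illustration of that point, a topology script is just a lookup you write yourself. Below is a minimal sketch of one in Python. The address prefixes and rack names are hypothetical, and I am assuming the script is the one configured through topology.script.file.name on a 0.20.x cluster, which is called with one or more addresses as arguments and is expected to print one rack per argument.

```python
#!/usr/bin/env python
import sys

# Hypothetical address-prefix to rack table; replace with your own layout.
RACKS = {
    "10.0.1.": "/rack1",
    "10.0.2.": "/rack2",
}
DEFAULT_RACK = "/default-rack"

def resolve(hosts):
    """Return one rack path per host or IP, in the same order."""
    result = []
    for host in hosts:
        rack = DEFAULT_RACK
        for prefix, name in RACKS.items():
            if host.startswith(prefix):
                rack = name
                break
        result.append(rack)
    return result

if __name__ == "__main__":
    # Addresses arrive as command-line arguments; print one rack per line.
    print("\n".join(resolve(sys.argv[1:])))
```

Nothing here looks at the physical layout, so a node reports whatever rack the table says it is in.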
Going from memory, and it's probably been changed in later versions of the code...
Isn't the replication... Copy on node 1, copy on same rack, third copy on different rack?
Or has this been improved upon?
On Aug 22, 2013, at 5:14 AM, Harsh J <[EMAIL PROTECTED]> wrote:
> I'm not aware of a bug in 0.20.2 that would not honor the Rack
> Awareness, but have you done the two checks below as well?
> 1. Ensuring JT has the same rack awareness scripts and configuration
> so it can use it for scheduling, and,
> 2. Checking if the map and reduce tasks are being evenly spread across
> both racks.
> On Thu, Aug 22, 2013 at 2:50 PM, Marc Sturlese <[EMAIL PROTECTED]> wrote:
>> I'm on cdh3u4 (0.20.2), gonna try to read a bit on this bug
> Harsh J