And that's the rub.
Rack awareness is an artificial construct.
If you want to fix it and match the real world, you need to balance the racks physically.
Otherwise you need to rewrite load balancing to take into consideration the number and power of the nodes in each rack.
The short answer: it's easier to fudge the values in the script.
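
To make "fudge the values" concrete, here is a rough sketch of a topology script, assuming the 2-server/4-server layout from the quoted example below. The hostnames (node1..node6), the rack names, and the OVERRIDES table are all made up for illustration. Hadoop calls whatever script is configured under net.topology.script.file.name (topology.script.file.name on 1.x) with one or more hosts as arguments and reads back one rack path per host.

```python
#!/usr/bin/env python3
"""Hypothetical Hadoop topology script: prints one rack path per argument."""
import sys

# Physical layout from the example in this thread:
# 2 servers in one rack, 4 in the other. Hostnames are invented.
PHYSICAL_RACK = {
    "node1": "/rack1",
    "node2": "/rack1",
    "node3": "/rack2",
    "node4": "/rack2",
    "node5": "/rack2",
    "node6": "/rack2",
}

# The "fudge": report node3 as if it lived in /rack1,
# so Hadoop sees two logical racks of 3 nodes each.
OVERRIDES = {"node3": "/rack1"}

# Fallback for hosts the script does not know about.
DEFAULT_RACK = "/default-rack"


def resolve(host):
    """Return the rack path Hadoop should see for one host name or IP."""
    if host in OVERRIDES:
        return OVERRIDES[host]
    return PHYSICAL_RACK.get(host, DEFAULT_RACK)


def main(argv):
    # Hadoop may pass several hosts per call; answer in the same order.
    for host in argv:
        print(resolve(host))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Note the trade-off: with this mapping a block could land on node3, node4 and node5, which Hadoop counts as two racks but which all sit in the same physical rack.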
> On Oct 3, 2013, at 8:58 AM, Marc Sturlese <[EMAIL PROTECTED]> wrote:
> Doing that will balance the block writing, but I think here you lose the
> concept of physical rack awareness.
> Let's say you have 2 physical racks, one with 2 servers and one with 4. If
> you artificially tell Hadoop that one rack has 3 servers and the other 3, you
> are losing the concept of rack awareness. You're not guaranteeing that each
> physical rack contains at least one replica of each block.
> So if you have 2 racks with different numbers of servers, it's not possible
> to do proper rack awareness without filling the disks of the rack with fewer
> servers first. Am I right or am I missing something?