Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hadoop >> mail # user >> rack awarness unexpected behaviour


Copy link to this message
-
Re: rack awarness unexpected behaviour
Do the jobs run on the whole cluster or a single rack?
If you write from a single rack, you will get something similar to what you
described, because the default policy is to put one block locally and 2
blocks on the same remote rack. It does check that there is enough place
available, but does not try to balance.
On Thu, Aug 22, 2013 at 9:41 AM, Marc Sturlese <[EMAIL PROTECTED]>wrote:

> Hey there,
> I've set up rack awareness on my hadoop cluster with replication 3. I have
> 2
> racks and each contains 50% of the nodes.
> I can see that the blocks are spread on the 2 racks, the problem is that
> all
> nodes from a rack are storing 2 replicas and the nodes of the other rack
> just one. If I launch the hadoop balancer script, it will properly spread
> the replicas across the 2 racks, leaving all nodes with exactly the same
> available disk space but, after jobs are running for hours, the data will
> be
> unbalanced again (rack1 having all nodes with less empty disk space than
> all
> nodes from rack2)
>
> Any clue whats going on?
> Thanks in advance
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/rack-awarness-unexpected-behaviour-tp4086029.html
> Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
>
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB