HDFS >> mail # user >> Altering logical rack configuration


Jonathan Disher 2011-02-05, 01:00
Re: Altering logical rack configuration

On Feb 4, 2011, at 5:00 PM, Jonathan Disher wrote:

> In my existing 48-node cluster, the engineer who originally designed it (no longer here) did not specify logical racks in the HDFS configuration, instead leaving everything in "default-rack".  Now I have 4 physical racks of machines, and I am becoming concerned about failure and near/far replication issues.
>
> Anyone have any ideas what will happen if I tell hadoop about the physical rack layout (i.e. nuke default-rack, and create rack110, rack111, rack112, rack113, etc)?
What will happen is that your next fsck run will be full of replication policy violation errors/warnings (I forget which). These are non-fatal and trivial to fix if you have enough space:

a) use setrep to increase the replication factor of all files
b) let the namenode re-replicate all blocks
c) use setrep again to decrease the replication factor of all files
d) let the namenode remove an extra replica:  it will almost always choose a replica that violates policy
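The setrep dance above can be sketched as two commands; this is a hypothetical example assuming a default replication factor of 3 and that you want to sweep the whole filesystem (adjust the path and factors for your cluster):

```shell
# a) + b): bump every file to 4 replicas; -w blocks until the namenode
#          has actually placed the extra (now rack-aware) replicas
hadoop fs -setrep -R -w 4 /

# c) + d): drop back to 3; the namenode deletes one replica per block,
#          almost always picking the one that violates rack placement policy
hadoop fs -setrep -R 3 /
```

Note the extra replica temporarily costs ~33% more disk, which is why this only works "if you have enough space."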

Is it worth fixing? Yes, definitely, for the reasons you suspect:  if you lose a rack, you might lose data.
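For reference, "telling hadoop about the physical rack layout" is done with a topology script: point `topology.script.file.name` (in core-site.xml; `net.topology.script.file.name` on newer releases) at an executable that maps each datanode address to a rack path. A minimal sketch, assuming a hypothetical addressing plan where the third IP octet encodes the rack; the mapping below is illustrative, not from the original thread:

```shell
#!/bin/sh
# Hypothetical topology script (e.g. /etc/hadoop/topology.sh).
# The namenode invokes it with one or more datanode IPs/hostnames as
# arguments and expects one rack path per line on stdout.
rack_for() {
  case "$1" in
    10.0.110.*) echo "/rack110" ;;
    10.0.111.*) echo "/rack111" ;;
    10.0.112.*) echo "/rack112" ;;
    10.0.113.*) echo "/rack113" ;;
    *)          echo "/default-rack" ;;  # unknown nodes fall back to default
  esac
}

for node in "$@"; do
  rack_for "$node"
done
```

Restart (or refresh) the namenode after deploying it so the new rack assignments take effect before you start the setrep cycle.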