NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
HDFS >> mail # user >> RE: Load balancing HDFS


RE: Load balancing HDFS
The default block placement policy checks the remaining space as follows.

If the remaining space on a node is greater than blksize*MIN_BLKS_FOR_WRITE (default 5), then the node is treated as a good target.
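
The check described above can be sketched roughly as follows. This is only an illustration in Python, not Hadoop's actual code: the function name `has_enough_space` and the 64 MB block size are assumptions made for the example, and the real placement policy may apply further checks beyond remaining space.

```python
MIN_BLKS_FOR_WRITE = 5  # default value mentioned above


def has_enough_space(remaining_bytes, block_size, min_blocks=MIN_BLKS_FOR_WRITE):
    """Return True if the node has more free space than min_blocks full blocks."""
    return remaining_bytes > block_size * min_blocks


BLOCK_SIZE = 64 * 1024 * 1024  # assumed block size (64 MB), for illustration only

# A node with 1 GB free clears the 5 * 64 MB = 320 MB bar.
print(has_enough_space(1024 ** 3, BLOCK_SIZE))
# A node with only 200 MB free does not.
print(has_enough_space(200 * 1024 ** 2, BLOCK_SIZE))
```

Note that this is an absolute threshold rather than a percentage, so a 3TB node and a 6TB node both pass it as long as a few blocks' worth of space is left.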

I think one option may be to run the balancer in between, after some jobs have completed, so that it moves blocks based on DN utilization. I am not sure this will work with your requirements.
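
To make the balancer idea a bit more concrete, here is a rough sketch of how utilization-based balancing classifies nodes. It assumes a node counts as balanced when its utilization (used/capacity, in percent) is within a threshold of the cluster-wide utilization. The function name, the node figures, and the 10 percent threshold are illustrative assumptions, not values from this thread.

```python
def classify_nodes(nodes, threshold_pct=10.0):
    """Classify each node against the cluster-wide utilization.

    nodes: dict mapping node name -> (used_tb, capacity_tb)
    Returns a dict mapping node name -> status string.
    """
    total_used = sum(used for used, _ in nodes.values())
    total_cap = sum(cap for _, cap in nodes.values())
    cluster_util = 100.0 * total_used / total_cap

    result = {}
    for name, (used, cap) in nodes.items():
        util = 100.0 * used / cap
        if util > cluster_util + threshold_pct:
            result[name] = "over-utilized"    # candidate source of block moves
        elif util < cluster_util - threshold_pct:
            result[name] = "under-utilized"   # candidate destination
        else:
            result[name] = "balanced"
    return result


# Hypothetical figures: one old 6TB node half full, one new 3TB node nearly empty.
nodes = {"old-6tb": (3.0, 6.0), "new-3tb": (0.3, 3.0)}
print(classify_nodes(nodes))
```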

Regards,

Uma

________________________________
From: Lior Schachter [[EMAIL PROTECTED]]
Sent: Wednesday, November 30, 2011 5:55 PM
To: [EMAIL PROTECTED]
Subject: Load balancing HDFS

Hi all,
We currently have a 10-node cluster with 6TB per machine.
We are buying a few more nodes and are considering having only 3TB per machine.

By default HDFS assigns blocks according to used capacity, percentage-wise.
This means that the old nodes will contain more data.
We would prefer the nodes (6TB, 3TB) to be balanced by actual used space, so that M/R jobs work better.
We don't expect to exceed the 3TB limit (we will buy more machines instead).
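
The difference between the two balancing goals can be shown with a little arithmetic. The sketch below is illustrative only: the number of new nodes (4) and the total data volume (20 TB) are made-up assumptions, since the message does not give them.

```python
def per_node_usage(old_nodes, new_nodes, old_cap_tb, new_cap_tb, total_data_tb):
    """Compare data per node under percentage balancing vs. equal-bytes balancing."""
    total_cap = old_nodes * old_cap_tb + new_nodes * new_cap_tb
    util = total_data_tb / total_cap  # same fraction used on every node

    by_percentage = {
        "old": util * old_cap_tb,  # TB held by each 6TB node
        "new": util * new_cap_tb,  # TB held by each 3TB node
    }
    # Equal bytes: every node holds the same amount regardless of its capacity.
    by_bytes = total_data_tb / (old_nodes + new_nodes)
    return by_percentage, by_bytes


# 10 old nodes at 6TB (from the message); 4 new nodes and 20 TB of data are assumed.
by_pct, by_bytes = per_node_usage(10, 4, 6.0, 3.0, 20.0)
print("percentage-balanced:", by_pct)
print("equal bytes per node:", by_bytes)
```

Under percentage balancing each old node ends up with twice the data of a new node, which is the imbalance described above.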

Thanks,
Lior
