Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Plain View
HBase >> mail # user >> Fixing badly distributed table manually.


Copy link to this message
-
Fixing badly distributed table manually.
Hello,

A couple of questions regarding balancing of a table's data in HBase.

a) What is the easiest way to get an overview of how a table is distributed
across regions of a cluster? I guess I could search .META. but I haven't
figured out how to use filters from shell.
b) What constitutes a "badly distributed" table and how can I re-balance
manually?
c) Is b) needed at all? I know that HBase does its balancing automatically
behind the scenes.

As for a) I tried running this script:

https://github.com/Mendeley/hbase-scripts/blob/master/list_regions.rb

like so:

hbase org.jruby.Main ./list_regions.rb <_my_table>

but I get

ArgumentError: wrong number of arguments (1 for 2)
  (root) at ./list_regions.rb:60

If someone more proficient notices an obvious fix, I'd be glad to hear
about it.

Why do I ask? I have the impression that one of the tables on our HBase
cluster is not well distributed. When running a Map Reduce job on this
table, the load average on a single node is very high, whereas all other
nodes are almost idling. It is the only table where this behavior is
observed. Other Map Reduce jobs result in slightly elevated load averages
on several machines.

Thank you,

/David
+
Pablo Musa 2012-09-04, 15:20
+
Ted Yu 2012-09-04, 15:12
+
Guillaume Gardey 2012-09-04, 15:34
+
David Koch 2012-09-04, 21:42
+
Vincent Barat 2012-09-05, 06:37
+
Vincent Barat 2012-09-05, 13:19
+
David Koch 2012-09-05, 08:40
+
Ivan Balashov 2012-12-24, 16:27
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB