Ivan Balashov 2012-12-24, 16:27
Re: Fixing badly distributed table manually.
On Mon, Dec 24, 2012 at 8:27 AM, Ivan Balashov <[EMAIL PROTECTED]> wrote:

>
> Vincent Barat <vbarat@...> writes:
>
> >
> > Hi,
> >
> > HBase handles balancing regions between RSs correctly: your RSs
> > always manage the same number of regions (the balancer takes care
> > of it).
> >
> > Unfortunately, balancing all the regions of one particular table
> > across the RSs of your cluster is not always easy, because HBase (as
> > of 0.90.3) always creates the new regions on the same RS when it
> > splits a region. This means that if you start with a single-region
> > table and then insert lots of data into it, new regions will always
> > be created on the same RS (if the insert is an M/R job, you
> > saturate this RS). Eventually the balancer will decide to move some
> > of these regions to other RSs, which limits the issue, but this is
> > not controllable.
> >
> > Here at Capptain, we solved this problem by developing a special
> > Python script, based on the HBase shell, that fully balances the
> > regions of all tables across all RSs. It ensures that the regions
> > of each table are uniformly deployed on all RSs of the cluster,
> > with a minimum of region transitions.
> >
>

Could you describe, at a high level, the logic of what you did?
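
The script itself is not shown in this thread, so the following is only a
hypothetical sketch of what a per-table balancing pass with few region
transitions could look like. It is not Vincent's actual logic. The function
names (plan_table_moves, plan_all_moves, to_shell_commands) and the input
format are assumptions made for illustration. The last helper assumes the
HBase shell's `move` command, which takes an encoded region name and a
server name.

```python
from collections import defaultdict


def plan_table_moves(regions_by_server, servers):
    """Plan moves for ONE table (hypothetical helper).

    regions_by_server: dict mapping server -> list of region names it hosts
    servers: list of all live region servers
    Returns a list of (region, source, destination) tuples.
    """
    current = {s: list(regions_by_server.get(s, [])) for s in servers}
    total = sum(len(regions) for regions in current.values())
    base, extra = divmod(total, len(servers))

    # Servers that already hold the most regions get the "+1" slots,
    # so fewer regions have to leave them.
    ranked = sorted(servers, key=lambda s: len(current[s]), reverse=True)
    target = {s: base + (1 if i < extra else 0) for i, s in enumerate(ranked)}

    # Collect the regions that must leave an overloaded server.
    surplus = []
    for s in servers:
        over = len(current[s]) - target[s]
        for region in current[s][:max(over, 0)]:
            surplus.append((region, s))

    # Hand surplus regions to the servers that are below their target.
    moves = []
    for s in servers:
        missing = target[s] - len(current[s])
        for _ in range(max(missing, 0)):
            region, source = surplus.pop()
            moves.append((region, source, s))
    return moves


def plan_all_moves(assignments, servers):
    """assignments: iterable of (table, region, server) tuples."""
    per_table = defaultdict(lambda: defaultdict(list))
    for table, region, server in assignments:
        per_table[table][server].append(region)
    moves = []
    for table in sorted(per_table):
        moves.extend(plan_table_moves(per_table[table], servers))
    return moves


def to_shell_commands(moves):
    """Format moves as HBase shell `move` commands (assumed syntax)."""
    return ["move '%s', '%s'" % (region, dst) for region, _src, dst in moves]
```

Because each table is balanced on its own, the per-server counts for a
table differ by at most one region. Collecting the current
(table, region, server) assignments and feeding the generated commands to
the shell is left out here, since the thread does not describe how the
original script does it.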

> > It is fast, and even though it can trigger a lot of region
> > transitions, there is very little impact at runtime and it can be
> > run safely.
> >
> > If you are interested, just let me know, I can share it.
> >
> > Regards,
> >
>
> Vincent,
>
> I would very much like to see and possibly use the script that you
> mentioned. We've just run into the same issue (after the table
> was truncated, it was re-created with only 1 region, and
> after data loading and manual splits we ended up with all
> regions on the same RS).
>
> If you could share the script, it would be really appreciated,
> and I believe not only by me.
>
> Thanks,
> Ivan
>