Re: Fixing badly distributed table manually.
Hi Vincent,

I don't know Python, but I am interested in learning about your solution. It
would be great if you could also share the logic for balancing the cluster.

Thanks,
Anil Gupta

On Mon, Dec 24, 2012 at 9:53 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:

> On Mon, Dec 24, 2012 at 8:27 AM, Ivan Balashov <[EMAIL PROTECTED]>
> wrote:
>
> >
> > Vincent Barat <vbarat@...> writes:
> >
> > >
> > > Hi,
> > >
> > > > Balancing regions between RSs is handled correctly by HBase: your
> > > > RSs always manage the same number of regions (the balancer takes
> > > > care of it).
> > > >
> > > > Unfortunately, balancing all the regions of one particular table
> > > > across the RSs of your cluster is not always easy, since HBase (as
> > > > of 0.90.3), when it splits a region, always creates the new one
> > > > on the same RS. This means that if you start with a single-region
> > > > table and then insert lots of data into it, new regions
> > > > will always be created on the same RS (if your insert is an M/R job,
> > > > you saturate this RS). Eventually, the balancer will at some point
> > > > decide to move some of these regions to other RSs, limiting the
> > > > issue, but this is not controllable.
> > >
> > > > Here at Capptain, we solved this problem by developing a special
> > > > Python script, based on the HBase shell, that entirely
> > > > balances all the regions of all tables across all RSs. It ensures that
> > > > the regions of each table are uniformly deployed on all RSs of the
> > > > cluster, with a minimum of region transitions.
> > >
> >
>
> Could you describe, at a high level, the logic of what you did?
>
> > > It is fast, and even though it can trigger a lot of region transitions,
> > > there is very little impact at runtime and it can be run safely.
> > >
> > > If you are interested, just let me know, I can share it.
> > >
> > > Regards,
> > >
> >
> > Vincent,
> >
> > I would very much like to see and possibly use the script that you
> > mentioned. We've just run into the same issue (after the table
> > was truncated, it was re-created with only 1 region, and
> > after data loading and manual splits we ended up with all
> > regions on the same RS).
> >
> > If you could share the script, it would be really appreciated,
> > and I believe not only by me.
> >
> > Thanks,
> > Ivan
> >
> >
> >
> >
> >
> >
> >
>

--
Thanks & Regards,
Anil Gupta
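
Since the logic was asked for twice in this thread, here is a minimal sketch
of one way a per-table balancing plan could be computed. It is not Vincent's
script, which is not shown here. The function names, the input layout (a dict
of region -> server for one table) and the greedy quota approach are all
assumptions made for illustration. The second helper only renders text for
the HBase shell "move" command, which expects an encoded region name and a
server name in "host,port,startcode" form; obtaining those values from a live
cluster is left out.

```python
def plan_table_moves(assignment, servers):
    """Plan region moves that spread ONE table evenly over `servers`.

    assignment: dict mapping region id -> server currently hosting it
                (hypothetical layout; every hosting server must be in `servers`).
    Returns a list of (region, source_server, destination_server) tuples.
    """
    by_server = {s: [] for s in servers}
    for region, server in sorted(assignment.items()):
        by_server[server].append(region)

    # Each server should end up with `base` regions, and `extra` servers get one more.
    base, extra = divmod(len(assignment), len(servers))

    # Hand the larger quotas to the servers that already hold the most regions,
    # so that as few regions as possible have to move.
    ranked = sorted(servers, key=lambda s: -len(by_server[s]))
    quota = {s: base + (1 if i < extra else 0) for i, s in enumerate(ranked)}

    # Collect regions from servers that are over their quota.
    surplus = []
    for s in servers:
        while len(by_server[s]) > quota[s]:
            surplus.append((by_server[s].pop(), s))

    # Hand the collected regions to servers that are under their quota.
    moves = []
    for s in servers:
        while len(by_server[s]) < quota[s]:
            region, src = surplus.pop()
            by_server[s].append(region)
            moves.append((region, src, s))
    return moves


def render_shell_moves(moves):
    """Render planned moves as HBase shell `move` command lines (text only)."""
    return ["move '%s', '%s'" % (region, dst) for region, _src, dst in moves]


if __name__ == "__main__":
    # Hypothetical table whose four regions all live on the same RS.
    table = {"r1": "rs1", "r2": "rs1", "r3": "rs1", "r4": "rs1"}
    for line in render_shell_moves(plan_table_moves(table, ["rs1", "rs2"])):
        print(line)
```

Running the plan once per table keeps each table spread out, but this sketch
does not try to keep the total region count per RS even across tables, and
the regular balancer may later undo some of the moves unless it is switched
off first.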