Re: How to merge regions in HBase?
Shouldn't it be possible for him to have empty regions if he has a TTL on his data?

--
Bryan Beaudreault
On Wednesday, July 18, 2012 at 9:58 AM, Kevin O'Dell wrote:

> Also, depending on the version of HBase you are running, you may have
> to bring down the whole cluster to merge, not just the table:
>
> https://issues.apache.org/jira/browse/HBASE-1621
>
> On Tue, Jul 17, 2012 at 7:26 PM, Amandeep Khurana <[EMAIL PROTECTED]> wrote:
>
> > You shouldn't have empty regions. Using a timestamp as the key prefix
> > will give you regions that are always half filled, except the last one,
> > to which you are writing the current time range. The moment that one
> > fills up, it splits and you'll again be writing to the last region. How
> > did you end up with empty regions? Did you pre-split?
> >
> > On Jul 17, 2012, at 7:15 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
> >
> > > Find a different row key?
> > >
> > > The problem with merging regions is that once you merge the regions, any
> > > net new regions will still have the same problem. So you'll have to merge
> > > again, and again and again.
> > > You're always filling to the left of the last key.
> > >
> > > In order to merge, you have to take the table offline. At least that's
> > > my understanding. So it's not a good thing.
> > >
> > >
> > > On Jul 17, 2012, at 11:08 AM, Ionut Ignatescu wrote:
> > >
> > > > My use case: I have several tables with keys starting with a timestamp.
> > > > Also, these tables have data retention set to 30 days.
> > > > Table size is around 1 TB (3 TB replicated) and data is inserted regularly
> > > > (~200 MB every 5 minutes).
> > > > File size is set to 1 GB. I have had these tables in use for almost half a
> > > > year, and now one table has around 6k partitions, 40% of which are empty.
> > > > The problem: the number of regions per region server is now pretty high.
> > > > Questions:
> > > > Which approach is better?
> > > > - to merge adjacent empty partitions into a bigger one?
> > > > - to merge empty partitions into non-empty partitions?
> > > > Also, I'm wondering why region merging is not part of major compactions
> > > > and why it's necessary to stop the entire fleet to solve this problem.
> > > >
> > > >
> > > >
> > > > Regards,
> > > >
> > > > Ionut I.
>
>
>
> --
> Kevin O'Dell
> Customer Operations Engineer, Cloudera
>
>
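
Editorial sketch (not part of the original thread): the behaviour discussed above, where timestamp-prefixed keys always write to the last region and a TTL later empties the older ones, can be illustrated with a toy model. This is not HBase code. The function name `simulate`, the unit-sized rows, and the midpoint split rule are all assumptions made for illustration; real HBase splits by store file size and picks its own split point.

```python
def simulate(total_writes, region_max, ttl):
    """Toy model of monotonically increasing keys with a TTL.

    Keys 0..total_writes-1 are written in order, one row each.
    Returns (total_regions, empty_regions).
    All names and the split rule are hypothetical, not HBase internals.
    """
    # Each region is [start_key, end_key); the last region is open-ended.
    regions = [[0, None]]
    for key in range(total_writes):
        start = regions[-1][0]
        size = key + 1 - start  # rows in the last region after this write
        if size >= region_max:
            # Split at the midpoint: the left half is closed and never
            # written again, the right half becomes the new last region.
            mid = start + size // 2
            regions[-1][1] = mid
            regions.append([mid, None])

    # Rows with key < cutoff have expired. A closed region whose whole
    # key range lies below the cutoff holds no data but still exists.
    cutoff = total_writes - ttl
    empty = sum(1 for _, end in regions if end is not None and end <= cutoff)
    return len(regions), empty


if __name__ == "__main__":
    total, empty = simulate(total_writes=100, region_max=10, ttl=30)
    print(f"{empty} of {total} regions are empty")
```

In this model every closed region stays half full until its rows expire, after which it remains as an empty region. That is consistent with the point that a TTL can leave empty regions behind without any pre-splitting, and with the observation that merging only helps until new regions age out in the same way.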