HBase, mail # dev - Multiple RS for serving one region


Re: Multiple RS for serving one region
Ted Yu 2013-01-22, 18:31
The feature depends on hdfs support.
Once we have that, we can implement this feature in HBase.

Cheers

On Tue, Jan 22, 2013 at 8:49 AM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:

> This sounds hugely useful to me and is one of those "why doesn't HBase have
> that" things that bugged me.
>
> Is there an issue to watch?
>
>
> http://search-hadoop.com/?q=region+failover+secondary&fc_project=HBase&fc_type=issue
> doesn't find any.
>
> Thanks,
> Otis
> --
> HBASE Performance Monitoring - http://sematext.com/spm/index.html
>
>
>
> On Mon, Jan 21, 2013 at 7:55 PM, Jonathan Hsieh <[EMAIL PROTECTED]> wrote:
>
> > The main motivation is to maintain good performance on RS failovers.
> > This is also tied with hdfs and its block placement policy.  Let me
> > explain as I understand it.   If we control the hdfs block placement
> > strategy we can write all blocks for a hfile (or for all hfiles
> > related to a region) to the same set of data nodes.  If the RS fails,
> > they favor failover to a node that has a local copy of all the blocks.
> >
> > Today, when you write an hfile to hdfs, for each block the first
> > replica goes to the local data node but the others get dispersed
> > around the cluster randomly at a per block granularity. The problem
> > here is that if the rs fails, the new rs that gets the responsibility
> > for the region has to read files that are spread all over the cluster
> > and with roughly 1/nth of the data local.  This means that the
> > recovered region is slower until a compaction localizes the data again.
> >
> > They've gone in and modified hdfs and their hbase to take advantage of
> > this idea.  I believe the randomization policy is enforced per region
> > -- if an rs serves 25 regions, all the files within each region are
> > sent to the same set of secondary/tertiary nodes, but each region
> > sends to a different set of secondary/tertiary nodes.
> >
> > Jon.
> >
> >
> > On Mon, Jan 21, 2013 at 3:48 PM, Devaraj Das <[EMAIL PROTECTED]> wrote:
> > > In 0.89-fb branch I stumbled upon stuff that indicated that there is a
> > > concept of secondary and tertiary regionserver. Could someone with
> > > more insights please shed some light on this?
> > > Might be useful to do the analysis on whether it makes sense for trunk.
> > > Thanks
> > > Devaraj
> >
> >
> >
> > --
> > // Jonathan Hsieh (shay)
> > // Software Engineer, Cloudera
> > // [EMAIL PROTECTED]
> >
>
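
The locality argument in Jonathan's message can be illustrated with a small simulation. This is only a sketch under stated assumptions: it does not use any HDFS or HBase API, the function names (place_blocks, best_locality_after_failure) are hypothetical, and it assumes a replication factor of 3 with the first replica always on the primary node. It compares per-block random placement of the second and third replicas against a per-region policy that picks one secondary/tertiary pair for all of a region's blocks.

```python
import random


def place_blocks(num_nodes, num_blocks, primary, per_region, rng):
    """Return one replica set (3 node ids) per block of a single region.

    Hypothetical model, not the HDFS API: the first replica is always on
    the primary node; the other two go either to a fixed pair chosen once
    per region (per_region=True) or to a fresh random pair per block.
    """
    others = [n for n in range(num_nodes) if n != primary]
    fixed = rng.sample(others, 2) if per_region else None
    placement = []
    for _ in range(num_blocks):
        remote = fixed if per_region else rng.sample(others, 2)
        placement.append({primary, *remote})
    return placement


def best_locality_after_failure(placement, num_nodes, failed):
    """Fraction of blocks that are local on the best surviving node."""
    best = 0
    for node in range(num_nodes):
        if node == failed:
            continue
        local = sum(1 for replicas in placement if node in replicas)
        best = max(best, local)
    return best / len(placement)


if __name__ == "__main__":
    rng = random.Random(0)
    for per_region in (False, True):
        placement = place_blocks(20, 200, primary=0,
                                 per_region=per_region, rng=rng)
        frac = best_locality_after_failure(placement, 20, failed=0)
        label = "per-region" if per_region else "per-block random"
        print(f"{label}: best surviving node holds {frac:.0%} of blocks")
```

In this model the per-region policy always leaves a surviving node that holds every block of the region, while under per-block random placement even the best surviving node holds only a small fraction of them, which is the slowdown described above until a compaction rewrites the files locally.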