HBase, mail # dev - Multiple RS for serving one region


Devaraj Das 2013-01-21, 23:48
Jonathan Hsieh 2013-01-22, 00:55
Re: Multiple RS for serving one region
Otis Gospodnetic 2013-01-22, 16:49
This sounds hugely useful to me and is one of those "why doesn't HBase have
that" things that bugged me.

Is there an issue to watch?

http://search-hadoop.com/?q=region+failover+secondary&fc_project=HBase&fc_type=issue
doesn't find any.

Thanks,
Otis
--
HBASE Performance Monitoring - http://sematext.com/spm/index.html

On Mon, Jan 21, 2013 at 7:55 PM, Jonathan Hsieh <[EMAIL PROTECTED]> wrote:

> The main motivation is to maintain good performance on RS failovers.
> This is also tied with hdfs and its block placement policy.  Let me
> explain as I understand it.   If we control the hdfs block placement
> strategy we can write all blocks for an hfile (or for all hfiles
> related to a region) to the same set of data nodes.  If the RS fails,
> they favor failover to a node that has a local copy of all the blocks.
>
> Today, when you write an hfile to hdfs, for each block the first
> replica goes to the local data node but the others get dispersed
> around the cluster randomly at a per block granularity. The problem
> here is that if the rs fails, the new rs that gets the responsibility
> for the region has to read files that are spread all over the cluster,
> with only roughly 1/nth of the data local.  This means that the
> recovered region is slower until a compaction localizes the data again.
>
> They've gone in and modified hdfs and their hbase to take advantage of
> this idea.  I believe the randomization policy is enforced per region
> -- if an rs serves 25 regions, all the files within each region are
> sent to the same set of secondary/tertiary nodes, but each region
> sends to a different set of secondary/tertiary nodes.
>
> Jon.
>
>
> On Mon, Jan 21, 2013 at 3:48 PM, Devaraj Das <[EMAIL PROTECTED]> wrote:
> > In 0.89-fb branch I stumbled upon stuff that indicated that there is a
> > concept of secondary and tertiary regionserver. Could someone with
> > more insights please shed some light on this?
> > Might be useful to do the analysis on whether it makes sense for trunk.
> > Thanks
> > Devaraj
>
>
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // [EMAIL PROTECTED]
>
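
The locality argument in Jon's message can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not HDFS or HBase code: the function names (place_random, place_favored, local_fraction), the 20-node cluster, and the replication factor of 3 are all hypothetical choices made for illustration. It compares how much of a region's data is local to the new RS after a failover under per-block random placement versus a per-region fixed set of secondary/tertiary nodes.

```python
import random


def place_random(num_blocks, local_node, nodes, replication, rng):
    """Per-block placement: first replica on the local node, the
    remaining replicas on randomly chosen distinct other nodes."""
    others = [n for n in nodes if n != local_node]
    return [[local_node] + rng.sample(others, replication - 1)
            for _ in range(num_blocks)]


def place_favored(num_blocks, local_node, favored):
    """Per-region placement: every block of the region goes to the
    local node plus the same fixed secondary/tertiary nodes."""
    return [[local_node] + list(favored) for _ in range(num_blocks)]


def local_fraction(placement, node):
    """Fraction of blocks that have a replica on the given node."""
    return sum(node in replicas for replicas in placement) / len(placement)


if __name__ == "__main__":
    rng = random.Random(0)       # seeded so repeated runs match
    nodes = list(range(20))      # hypothetical 20-node cluster
    failed_rs, new_rs = 0, 1     # the region moves from node 0 to node 1

    random_layout = place_random(1000, failed_rs, nodes, 3, rng)
    favored_layout = place_favored(1000, failed_rs, favored=[1, 2])

    print("random placement, local fraction on new RS:",
          local_fraction(random_layout, new_rs))
    print("favored placement, local fraction on new RS:",
          local_fraction(favored_layout, new_rs))
```

With the fixed per-region set, failing over to one of the favored nodes finds a replica of every block locally. With per-block random placement, any single node holds only a small share of the region's blocks, which matches the "roughly 1/nth" estimate in the message.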
Ted Yu 2013-01-22, 18:31
Devaraj Das 2013-01-22, 18:38
Devaraj Das 2013-01-22, 21:58
Liu, Raymond 2013-01-24, 01:11