HBase >> mail # dev >> Re: [Shadow Regions / Read Replicas ] Block Affinity


Re: [Shadow Regions / Read Replicas ] Block Affinity
On Tue, Dec 3, 2013 at 3:46 PM, Nick Dimiduk <[EMAIL PROTECTED]> wrote:

> On Tue, Dec 3, 2013 at 11:37 AM, Enis Söztutar <[EMAIL PROTECTED]> wrote:
>
> > I think we do not want to differentiate between RS's by splitting them
> > between primaries and shadows. This will complicate provisioning,
> > administration, monitoring and load balancing a lot, and will not achieve
> > very cheap secondary region promotions (because you have to move the
> > region still as you described).
> >
>
> The idea of having "primary hosts" and "replica hosts" was brought up in
> initial design discussions over here. I am particularly against this
> approach because of the additional complexity. I need to update myself on
> Enis's doc (I'm a week+ behind), but my opinion is that we treat a
> non-primary region (be it a "read replica" or a "shadow region") as a
> first-class, independent entity. These entities can be assigned to any
> host in the cluster, each with its own individual state machine
> instance.
>
> Of course, the balancer would need to be aware of the relationship between
> the primary and its non-primaries in order to maintain the balancing policy
> requirements. However, I see no reason for there to be specialization at
> the host level, and I agree with Enis's arguments against it.
>
> -n
>

I think there was a misunderstanding here -- I made a distinction between
the "normal" primary regions, eventually-consistent-read-replica/secondary
regions, and shadow memstore regions (for fast consistent read recovery).
All region servers would be able to host normal primary regions,
read-replica regions and shadow memstore regions.

There would be different potential sweet spots if read-replica regions and
shadow memstore regions were co-located at region recovery time, with
trade-offs among fast consistent recovery, the ability to have more recent
values, locality optimizations, and load balancing optimizations.
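
To make the distinction concrete, here is a minimal sketch in Python. It is
illustrative only and is not HBase code: the names RegionRole, RegionInstance
and colocated_regions, and the hosts rs1..rs3, are all hypothetical. It models
the three region kinds as roles that any region server may host, and reports
which hosts hold more than one instance of the same logical region. Whether
such co-location is desirable is the trade-off discussed above, so the helper
only reports it and does not enforce a policy.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class RegionRole(Enum):
    """Hypothetical labels for the three region kinds named in this thread."""
    PRIMARY = "primary"
    READ_REPLICA = "read_replica"        # eventually consistent reads
    SHADOW_MEMSTORE = "shadow_memstore"  # fast consistent read recovery


@dataclass(frozen=True)
class RegionInstance:
    region_name: str  # logical region this instance belongs to
    role: RegionRole


def colocated_regions(
    assignment: Dict[str, List[RegionInstance]]
) -> Dict[str, List[str]]:
    """Map each host to the logical regions it holds more than once."""
    result: Dict[str, List[str]] = {}
    for host, instances in assignment.items():
        counts = Counter(inst.region_name for inst in instances)
        shared = sorted(name for name, n in counts.items() if n > 1)
        if shared:
            result[host] = shared
    return result


# No host-level specialization: every server may hold any mix of roles.
assignment = {
    "rs1": [RegionInstance("r1", RegionRole.PRIMARY),
            RegionInstance("r2", RegionRole.READ_REPLICA)],
    "rs2": [RegionInstance("r1", RegionRole.SHADOW_MEMSTORE),
            RegionInstance("r2", RegionRole.PRIMARY)],
    "rs3": [RegionInstance("r1", RegionRole.READ_REPLICA),
            RegionInstance("r1", RegionRole.SHADOW_MEMSTORE)],
}

print(colocated_regions(assignment))
```

A balancer that knows the relationship between a primary and its
non-primaries could use a report like this as one input when weighing
recovery speed against locality and load.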

Jon.

--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [EMAIL PROTECTED]