Re: consistency, availability and partition pattern of HBase
Correct. You are limited to the throughput of a single region server while
interacting with a particular region. This throughput limitation is
typically handled by designing your keys such that your data is distributed
well across the cluster.
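
As a rough illustration of the key-design point above, here is a minimal sketch of one common approach, usually called key salting: a short hash-derived prefix is added to each row key so that keys arriving in sequence do not all land in the same region. The bucket count, the `salted_key` helper and the `event-<timestamp>` key format are hypothetical choices for this example and are not part of any HBase API.

```python
import hashlib

# Hypothetical bucket count; in practice it would be chosen to match
# the number of regions the table is pre-split into.
NUM_BUCKETS = 16


def salted_key(row_key: str, num_buckets: int = NUM_BUCKETS) -> bytes:
    """Return row_key prefixed with a stable, hash-derived bucket id.

    The same input always maps to the same bucket, so a point lookup
    can recompute the prefix; neighbouring inputs usually map to
    different buckets, which spreads sequential writes out.
    """
    raw = row_key.encode("utf-8")
    digest = hashlib.sha256(raw).digest()
    bucket = int.from_bytes(digest[:4], "big") % num_buckets
    return f"{bucket:02d}-".encode("ascii") + raw


# Sequential, timestamp-like keys that would otherwise hit one region.
for ts in range(1344500000, 1344500004):
    print(salted_key(f"event-{ts}"))
```

The price of this design is that a scan over the original key order has to be issued once per bucket and merged on the client side.
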
Having multiple region servers serve a single region gets you into the land
of maintaining consistency across copies, which is challenging. It might be
doable but that's not the design choice Bigtable (and hence HBase) made
initially.
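
To make the single-owner design discussed in this thread concrete, the following is a toy model only, not HBase code. `ToyCluster` and all of its methods are invented for illustration, and a plain dictionary stands in for the HDFS-replicated storage. It assumes each region is a sorted key range hosted by at most one server, and that reads and writes fail while a region is unhosted. A real cluster does considerably more during recovery.

```python
import bisect


class ToyCluster:
    """Toy model: every region (a key range) has at most one hosting server."""

    def __init__(self, split_keys, servers):
        # Region i covers [starts[i], starts[i + 1]); the first starts at "".
        self.starts = [""] + sorted(split_keys)
        self.servers = list(servers)
        # region index -> hosting server name, or None while unhosted
        self.owner = {
            i: self.servers[i % len(self.servers)]
            for i in range(len(self.starts))
        }
        # Stand-in for durable storage: it survives the loss of a server.
        self.data = {}

    def region_for(self, key):
        """Index of the region whose key range contains `key`."""
        return bisect.bisect_right(self.starts, key) - 1

    def _require_host(self, key):
        server = self.owner[self.region_for(key)]
        if server is None:
            raise RuntimeError("region is unhosted; retry after reassignment")
        return server

    def put(self, key, value):
        self._require_host(key)  # all writes go through the single owner
        self.data[key] = value

    def get(self, key):
        self._require_host(key)  # all reads go through the same owner
        return self.data.get(key)

    def kill(self, server):
        """Simulate a server dying: its regions become unhosted."""
        self.servers.remove(server)
        for region, host in self.owner.items():
            if host == server:
                self.owner[region] = None

    def reassign(self):
        """Simulate the master moving unhosted regions to live servers."""
        for region, host in self.owner.items():
            if host is None:
                self.owner[region] = self.servers[region % len(self.servers)]


cluster = ToyCluster(split_keys=["g", "n"], servers=["rs1", "rs2"])
cluster.put("apple", 1)
cluster.kill("rs1")
try:
    cluster.get("apple")
except RuntimeError as err:
    print("unavailable:", err)
cluster.reassign()
print("after reassignment:", cluster.get("apple"))
```

Because each key has exactly one owner at a time, there are no copies to reconcile, which is where the consistency comes from. The same property is why the key range is unreachable between `kill` and `reassign`.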

On Thu, Aug 9, 2012 at 11:04 AM, Lin Ma <[EMAIL PROTECTED]> wrote:

> Thanks
>
> "only a single RegionServer ever hosts a region at once" -- I know HDFS
> keeps multiple copies of the same file. Do region servers work in an
> active-passive way, i.e. even if there are multiple copies, only one region
> server can serve them? If so, will it become a bottleneck when the traffic
> to that region is too high?
>
> regards,
> Lin
>
> On Thu, Aug 9, 2012 at 11:09 AM, Bryan Beaudreault <[EMAIL PROTECTED]> wrote:
>
> > Actual data backing HBase is replicated, but that is handled by HDFS. Yes,
> > if you lose an HDFS datanode, clients (in this case the client is HBase)
> > move to the next node in the pipeline.
> >
> > However, only a single RegionServer ever hosts a region at once.  If the
> > RegionServer dies, there is a period where the master must notice the
> > regions are unhosted and move them to other regionservers.  During that
> > period, data is neither accessible nor modifiable.
> >
> > On Wed, Aug 8, 2012 at 10:32 PM, Lin Ma <[EMAIL PROTECTED]> wrote:
> >
> > > Thank you Lars.
> > >
> > > Is the same data duplicated across region servers? If so, when the
> > > primary server for a region dies, the client just needs to read from the
> > > secondary server for the same region. Why is there a period when the
> > > data is unavailable?
> > >
> > > BTW: please feel free to correct any misunderstanding I have about HBase.
> > >
> > > regards,
> > > Lin
> > >
> > > On Thu, Aug 9, 2012 at 9:31 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> > >
> > > > After a write completes, the next read (regardless of the location it is
> > > > issued from) will see the latest value.
> > > > This is because at any given time exactly one RegionServer is responsible
> > > > for a specific key (through assignment of key ranges to regions and
> > > > regions to RegionServers).
> > > >
> > > > As Mohit said, the trade-off is that data is unavailable if a RegionServer
> > > > dies until another RegionServer picks up the regions (and by extension the
> > > > key range).
> > > >
> > > > -- Lars
> > > >
> > > >
> > > > ----- Original Message -----
> > > > From: Lin Ma <[EMAIL PROTECTED]>
> > > > To: [EMAIL PROTECTED]
> > > > Cc:
> > > > Sent: Wednesday, August 8, 2012 8:47 AM
> > > > Subject: Re: consistency, availability and partition pattern of HBase
> > > >
> > > > And consistency is not sacrificed? I.e., will updates from all distributed
> > > > clients be applied sequentially and in real time? Once an update is done by
> > > > one client, can all other clients see the result immediately?
> > > >
> > > > regards,
> > > > Lin
> > > >
> > > > On Wed, Aug 8, 2012 at 11:17 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > I think availability is sacrificed in the sense that if a region server
> > > > > fails, clients will find its data inaccessible until the region comes up
> > > > > on some other server. This is not to be confused with data loss.
> > > > >
> > > > > Sent from my iPad
> > > > >
> > > > > On Aug 7, 2012, at 11:56 PM, Lin Ma <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > Thank you Wei!
> > > > > >
> > > > > > Two more comments,
> > > > > >
> > > > > > 1. What do you think about Hadoop's CAP characteristics?
> > > > > > 2. Regarding your comments, if HBase implements "per key sequential
> > > > > > consistency", what consistency characteristics are missing? Cross-key
> > > > > > update sequences? Could you show me an example of what you think are