HBase >> mail # user >> consistency, availability and partition pattern of HBase


Re: consistency, availability and partition pattern of HBase
Please read the papers. You'll understand the architecture better that way.

On Aug 9, 2012, at 1:48 PM, Lin Ma <[EMAIL PROTECTED]> wrote:

Thank you Amandeep,

So, logically, I can understand it this way: multiple region servers do exist
for the same region, but they work in active-passive mode, so that only one
server is active at any one time? Correct?

regards,
Lin

On Thu, Aug 9, 2012 at 2:04 PM, Amandeep Khurana <[EMAIL PROTECTED]> wrote:

> Correct. You are limited to the throughput of a single region server while
> interacting with a particular region. This throughput limitation is
> typically handled by designing your keys such that your data is distributed
> well across the cluster.
> Having multiple region servers serve a single region gets you into the land
> of maintaining consistency across copies, which is challenging. It might be
> doable but that's not the design choice Bigtable (and hence HBase) made
> initially.
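One common way to design keys so that writes spread across regions is to prefix each row key with a stable, hash-derived "salt" bucket. The following is only a minimal sketch of that idea in plain Python. It does not use the HBase client API, and the names `NUM_BUCKETS` and `salted_key` as well as the bucket count of 8 are assumptions made for illustration.

```python
import hashlib

# Assumption: 8 salt buckets, chosen arbitrarily for this illustration.
NUM_BUCKETS = 8


def salted_key(row_key: str, num_buckets: int = NUM_BUCKETS) -> str:
    """Prefix a row key with a bucket derived from a stable hash of the key.

    Sequential keys (timestamps, counters) then land in different key
    ranges instead of all hitting the region that holds the newest keys.
    """
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    bucket = digest[0] % num_buckets
    return f"{bucket:02d}-{row_key}"


# Count how 1000 sequential keys spread over the salt buckets.
keys = [f"event-{i:06d}" for i in range(1000)]
counts = {}
for key in keys:
    prefix = salted_key(key).split("-", 1)[0]
    counts[prefix] = counts.get(prefix, 0) + 1
```

The trade-off of this design is that a range scan over the original key order has to query every bucket and merge the results.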
>
> On Thu, Aug 9, 2012 at 11:04 AM, Lin Ma <[EMAIL PROTECTED]> wrote:
>
> > Thanks
> >
> > "only a single RegionServer ever hosts a region at once" -- I know HDFS
> > has multiple copies of the same file. Does the region server work in an
> > active-passive way, i.e. even if there are multiple copies, only one region
> > server can serve them? If so, will it be a bottleneck, supposing the
> > traffic to that region is too high?
> >
> > regards,
> > Lin
> >
> > On Thu, Aug 9, 2012 at 11:09 AM, Bryan Beaudreault <[EMAIL PROTECTED]> wrote:
> >
> > > Actual data backing hbase is replicated, but that is handled by HDFS. Yes,
> > > if you lose an hdfs datanode, clients (in this case the client is hbase)
> > > move to the next node in the pipeline.
> > >
> > > However, only a single RegionServer ever hosts a region at once. If the
> > > RegionServer dies, there is a period where the master must notice the
> > > regions are unhosted and move them to other regionservers. During that
> > > period, data in those regions can be neither read nor modified.
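From the caller's side, that window shows up as requests that fail until the region has been reassigned, so callers usually retry with a backoff. The sketch below simulates this in plain Python under stated assumptions: `RegionUnavailable`, `with_retries` and `flaky_get` are hypothetical names, not part of any HBase API, and the "region" is a stub that starts answering on the third call.

```python
import time


class RegionUnavailable(Exception):
    """Hypothetical error seen while a region has no hosting server."""


def with_retries(operation, attempts=5, base_delay=0.01, sleep=time.sleep):
    """Call operation(), retrying with exponential backoff on RegionUnavailable."""
    for attempt in range(attempts):
        try:
            return operation()
        except RegionUnavailable:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            sleep(base_delay * (2 ** attempt))


# Simulated region: unavailable for the first two calls, then served again.
calls = {"n": 0}


def flaky_get():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RegionUnavailable("region not yet reassigned")
    return "value"


# Pass a no-op sleep so the simulation does not actually wait.
result = with_retries(flaky_get, sleep=lambda _: None)
```

The retry only hides the gap from the caller; the data is still unavailable until another server takes over the region.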
> > >
> > > On Wed, Aug 8, 2012 at 10:32 PM, Lin Ma <[EMAIL PROTECTED]> wrote:
> > >
> > > > Thank you Lars.
> > > >
> > > > Is the same data stored as duplicate copies across region servers? If
> > > > so, when the primary server for a region dies, the client just needs to
> > > > read from the secondary server for the same region. Why is there a
> > > > period when the data is unavailable?
> > > >
> > > > BTW: please feel free to correct anything I have wrong about HBase.
> > > >
> > > > regards,
> > > > Lin
> > > >
> > > > On Thu, Aug 9, 2012 at 9:31 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > After a write completes the next read (regardless of the location it
> > > > > is issued from) will see the latest value.
> > > > > This is because at any given time exactly one RegionServer is
> > > > > responsible for a specific key (through assignment of key ranges to
> > > > > regions and regions to RegionServers).
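The two-step assignment (key ranges to regions, regions to RegionServers) can be sketched as a sorted lookup. This is an illustration in plain Python, not HBase code: the start keys, the server names `rs1` to `rs3`, and the function `server_for` are all made up for the example.

```python
import bisect

# Hypothetical layout: sorted region start keys. Region i covers the
# half-open range [start_keys[i], start_keys[i + 1]); the last region
# is unbounded above.
region_start_keys = ["", "g", "n", "t"]

# Hypothetical assignment of each region (named by its start key) to
# exactly one server. One server may host several regions.
region_to_server = {"": "rs1", "g": "rs2", "n": "rs1", "t": "rs3"}


def server_for(row_key: str) -> str:
    """Return the single server responsible for row_key."""
    # Find the last region whose start key is <= row_key.
    idx = bisect.bisect_right(region_start_keys, row_key) - 1
    return region_to_server[region_start_keys[idx]]
```

Because the ranges do not overlap and each region maps to one server, every key resolves to exactly one server, so all reads and writes for that key go through the same place.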
> > > > >
> > > > >
> > > > > As Mohit said, the trade off is that data is unavailable if a
> > > > > RegionServer dies until another RegionServer picks up the regions (and
> > > > > by extension the key range).
> > > > >
> > > > > -- Lars
> > > > >
> > > > >
> > > > > ----- Original Message -----
> > > > > From: Lin Ma <[EMAIL PROTECTED]>
> > > > > To: [EMAIL PROTECTED]
> > > > > Cc:
> > > > > Sent: Wednesday, August 8, 2012 8:47 AM
> > > > > Subject: Re: consistency, availability and partition pattern of HBase
> > > > >
> > > > > > And consistency is not sacrificed? I.e. will all distributed clients'
> > > > > > updates result in sequential / real-time updates? Once an update is
> > > > > > done by one client, can all other clients see the result immediately?
> > > > >
> > > > > regards,
> > > > > Lin
> > > > >
> > > > > > On Wed, Aug 8, 2012 at 11:17 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > I think availability is sacrificed in the sense that if region