HBase >> mail # user >> how client location a region/tablet?


RE: how client location a region/tablet?
I too thought there were multiple META regions but just one ROOT. Maybe I am mixing up Bigtable and HBase.

Thanks,
Abhishek
-----Original Message-----
From: Lin Ma [mailto:[EMAIL PROTECTED]]
Sent: Thursday, August 23, 2012 9:41 AM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: how client location a region/tablet?

Thanks, Harsh!

- "HBase currently keeps a single META region (Doesn't split it). " -- does it mean there is only one row in the ROOT table, which points to the only META region?
- In Bigtable, it seems they have multiple META regions (tablets). Is that an advantage over HBase? :-)

regards,
Lin
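
A rough illustration of the lookup discussed in this thread, not HBase's actual client code. It assumes, as the catalog section of the HBase book describes, that catalog rows are keyed by table name and region start key and are kept sorted, so a client can find the region for a row with a "closest key at or before" search instead of iterating rows one by one. The class name, server names, and key ranges below are all made up for the example.

```java
import java.util.Map;
import java.util.TreeMap;

public class CatalogLookupSketch {

    /** One catalog entry: a key range and the server hosting it (hypothetical values). */
    static final class Region {
        final String startKey; // inclusive; "" means "from the start of the table"
        final String endKey;   // exclusive; "" means "to the end of the table"
        final String server;

        Region(String startKey, String endKey, String server) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.server = server;
        }

        boolean contains(String row) {
            return row.compareTo(startKey) >= 0
                    && (endKey.isEmpty() || row.compareTo(endKey) < 0);
        }
    }

    // table name -> (region start key -> region); stands in for the sorted catalog rows
    private final Map<String, TreeMap<String, Region>> meta = new TreeMap<>();

    void addRegion(String table, Region region) {
        meta.computeIfAbsent(table, t -> new TreeMap<>()).put(region.startKey, region);
    }

    /** Returns the server hosting the given row, or null if no region covers it. */
    String locate(String table, String row) {
        TreeMap<String, Region> regions = meta.get(table);
        if (regions == null) {
            return null;
        }
        // Greatest start key <= row: a single sorted lookup, no row-by-row scan.
        Map.Entry<String, Region> entry = regions.floorEntry(row);
        if (entry == null || !entry.getValue().contains(row)) {
            return null;
        }
        return entry.getValue().server;
    }

    /** Builds a three-region layout for a table named ABC. */
    static CatalogLookupSketch example() {
        CatalogLookupSketch catalog = new CatalogLookupSketch();
        catalog.addRegion("ABC", new Region("", "100", "rs1.example.com"));
        catalog.addRegion("ABC", new Region("100", "200", "rs2.example.com"));
        catalog.addRegion("ABC", new Region("200", "", "rs3.example.com"));
        return catalog;
    }

    public static void main(String[] args) {
        // Row-key 123 of table ABC falls in the region starting at "100".
        System.out.println(example().locate("ABC", "123"));
    }
}
```

A real client would also cache the result of locate() so that later requests for the same region skip the catalog entirely, which is the "client caches locations" point made further down the thread.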
On Thu, Aug 23, 2012 at 11:48 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> HBase currently keeps a single META region (Doesn't split it). ROOT
> holds META region location, and META has a few rows in it, a few of
> them for each table. See also the class MetaScanner.
>
> On Thu, Aug 23, 2012 at 9:00 PM, Lin Ma <[EMAIL PROTECTED]> wrote:
> > Doug,
> >
> > Some more thoughts after reading the data structure for HRegionInfo =>
> > http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html
> > The start key and end key look informative, and we could leverage them:
> >
> > - I am not sure if we could leverage this information (stored as
> > part of the value in table ROOT) to find which META region may contain
> > region server information for row-key 123 of data table ABC;
> > - But I think unfortunately the information is stored in the value of
> > table ROOT, rather than in the key field of table ROOT, so we have to
> > iterate over each row in the ROOT table one by one to figure out which
> > META region server to access.
> >
> > Not sure if I got the point. Please feel free to correct me.
> >
> > regards,
> > Lin
> >
> > On Thu, Aug 23, 2012 at 11:15 PM, Lin Ma <[EMAIL PROTECTED]> wrote:
> >
> >> Doug, very informative document. Thanks a lot!
> >>
> >> I read through it and have some thoughts:
> >>
> >> - Suppose that at the beginning the client-side cache of region
> >> information is empty, and the client wants to GET row-key 123 from
> >> table ABC;
> >> - The client will read from the ROOT table first. But unfortunately,
> >> the ROOT table only contains region information for the META table
> >> (please correct me if I am wrong), not region information for the
> >> real data table (e.g. table ABC);
> >> - Does the client have to call each META region server one by one
> >> in order to find which META region contains information about the
> >> region that owns row-key 123 of data table ABC?
> >>
> >> BTW: I think that if there were a way to expose, from the .META.
> >> region key, which range of table/region each META region contains,
> >> it would save the time spent iterating over the META region servers
> >> one by one. Please feel free to correct me if I am wrong.
> >>
> >> regards,
> >> Lin
> >>
> >>
> >> On Thu, Aug 23, 2012 at 8:21 PM, Doug Meil <[EMAIL PROTECTED]> wrote:
> >>
> >>>
> >>> For further information about the catalog tables and
> >>> region-regionserver assignment, see this...
> >>>
> >>> http://hbase.apache.org/book.html#arch.catalog
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On 8/19/12 7:36 AM, "Lin Ma" <[EMAIL PROTECTED]> wrote:
> >>>
> >>> >Thank you Stack, especially for the smart 6 round trip guess for
> >>> >the puzzle. :-)
> >>> >
> >>> >1. "Yeah, we client cache's locations, not the data." -- does it
> >>> >mean that each client will cache all location information of an
> >>> >HBase cluster, i.e. which physical server owns which region?
> >>> >Supposing each region has 128M bytes, for a big cluster (P-bytes
> >>> >level), total data size / 128M is not a trivial number; not sure
> >>> >if this adds any overhead for the client?
> >>> >2. A bit confused by what you mean by "not the data". The location
> >>> >information the client caches should be the data in table METADATA,
> >>> >which is the region / physical server mapping data. Why you say not data (do