RE: how client location a region/tablet?
I too thought there were multiple META regions but just one ROOT. Maybe I am mixing up Bigtable and HBase.

Thanks,
Abhishek
-----Original Message-----
From: Lin Ma [mailto:[EMAIL PROTECTED]]
Sent: Thursday, August 23, 2012 9:41 AM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: how client location a region/tablet?

Thanks, Harsh!

- "HBase currently keeps a single META region (Doesn't split it). " -- does it mean there is only one row in the ROOT table, which points to the only META region?
- In Bigtable, it seems they have multiple META regions (tablets); is that an advantage over HBase? :-)

regards,
Lin
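
For reference, the lookup path discussed in this thread can be sketched roughly as below. This is only an illustrative model, not HBase client code: the ROOT and META dictionaries/lists, the server names, and locate_region() are all made up for the example. It assumes catalog rows are sorted by (table, start key), so the region covering a row can be found with one "closest row at or before" search instead of iterating over every row.

```python
import bisect

# Hypothetical in-memory stand-ins for the catalog tables (assumption:
# names and addresses here are invented for illustration only).
ROOT = {".META.": "rs-meta:60020"}  # one row: where the single META region lives

# META rows sorted by (table, start_key); the value is the hosting server.
META = [
    (("ABC", ""), "rs1:60020"),
    (("ABC", "100"), "rs2:60020"),
    (("ABC", "200"), "rs3:60020"),
    (("XYZ", ""), "rs1:60020"),
]
META_KEYS = [key for key, _ in META]


def locate_region(table, row_key):
    """Return (meta_server, region_start_key, region_server) for a row."""
    # Step 1: ROOT tells the client where the META region is served.
    meta_server = ROOT[".META."]
    # Step 2: find the last META row whose key is <= (table, row_key).
    i = bisect.bisect_right(META_KEYS, (table, row_key)) - 1
    if i < 0 or META_KEYS[i][0] != table:
        raise KeyError("no region found for table %r" % table)
    (_, start_key), region_server = META[i]
    # Step 3 (not shown): the client would now talk to region_server
    # directly and cache this location for later requests.
    return meta_server, start_key, region_server


print(locate_region("ABC", "123"))
```

With the sample rows above, row-key 123 of table ABC resolves to the region starting at "100", without scanning the other rows.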
On Thu, Aug 23, 2012 at 11:48 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> HBase currently keeps a single META region (Doesn't split it). ROOT
> holds META region location, and META has a few rows in it, a few of
> them for each table. See also the class MetaScanner.
>
> On Thu, Aug 23, 2012 at 9:00 PM, Lin Ma <[EMAIL PROTECTED]> wrote:
> > Dong,
> >
> > Some more thoughts, after reading the data structure for HRegionInfo =>
> > http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html,
> > the start key and end key look informative, and we could leverage them,
> >
> > - I am not sure if we could leverage this information (stored as
> > part of the value in table ROOT) to find which META region may contain
> > region server information for row-key 123 of data table ABC;
> > - But I think unfortunately the information is stored in the value of
> > table ROOT, rather than in the key field of table ROOT, so we have to
> > iterate over each row in the ROOT table one by one to figure out which
> > META region server to access.
> >
> > Not sure if I got the point. Please feel free to correct me.
> >
> > regards,
> > Lin
> >
> > On Thu, Aug 23, 2012 at 11:15 PM, Lin Ma <[EMAIL PROTECTED]> wrote:
> >
> >> Doug, very informative document. Thanks a lot!
> >>
> >> I read through it and have some thoughts,
> >>
> >> - Suppose that at the beginning the client-side cache for region
> >> information is empty, and the client wants to GET row-key 123 from
> >> table ABC;
> >> - The client will read from the ROOT table first. But unfortunately, the
> >> ROOT table only contains region information for the META table (please
> >> correct me if I am wrong), not region information for the real data
> >> table (e.g. table ABC);
> >> - Does the client have to call each META region server one by one,
> >> in order to find which META region contains information for the region
> >> owner of row-key 123 of data table ABC?
> >>
> >> BTW: I think if there were a way to expose which range of
> >> table/region each META region covers in the .META. region key, it
> >> would save the time of iterating over META region servers one by one.
> >> Please feel free to correct me if I am wrong.
> >>
> >> regards,
> >> Lin
> >>
> >>
> >> On Thu, Aug 23, 2012 at 8:21 PM, Doug Meil <[EMAIL PROTECTED]> wrote:
> >>
> >>>
> >>> For further information about the catalog tables and
> >>> region-regionserver assignment, see this...
> >>>
> >>> http://hbase.apache.org/book.html#arch.catalog
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On 8/19/12 7:36 AM, "Lin Ma" <[EMAIL PROTECTED]> wrote:
> >>>
> >>> >Thank you Stack, especially for the smart 6 round trip guess for
> >>> >the puzzle. :-)
> >>> >
> >>> >1. "Yeah, we client cache's locations, not the data." -- does it
> >>> >mean that each client will cache all location information of an HBase
> >>> >cluster, i.e. which physical server owns which region? Supposing each
> >>> >region has 128M bytes, for a big cluster (P-bytes level), total data
> >>> >size / 128M is not a trivial number; is there any overhead to the client?
> >>> >2. A bit confused by what you mean by "not the data". The location
> >>> >information cached by the client should be the data in table METADATA,
> >>> >which is the region / physical server mapping data. Why you say not data (do