HBase, mail # user - Get the list of store/store files for a region via HBase API


RE: Get the list of store/store files for a region via HBase API
Espinoza,Carlos 2012-05-16, 20:07
The way we've done this in one of our tools is to get the list of
regions from .META. and filter them by the tables we want. We then
build the path all the way down to the column family:

colfamilyPath = /table/region/family/

Then we call fs.listStatus(colfamilyPath) to get the individual list
of files. We repeat this for every region we're interested in.
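
To make the traversal concrete, here is a minimal sketch of the
per-family listing step. It is not the code from the tool linked
below. It assumes a local directory laid out as /table/region/family/
and uses java.nio.file as a stand-in so it runs without a cluster; on
HDFS the same step would go through the Hadoop FileSystem API instead.
The class name StoreFileLister and the method listStoreFiles are
hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class StoreFileLister {

    /**
     * Returns the sorted names of the regular files directly under
     * root/table/region/family. Hypothetical helper; local-filesystem
     * stand-in for an HDFS directory listing.
     */
    public static List<String> listStoreFiles(Path root, String table,
                                              String region, String family)
            throws IOException {
        Path familyDir = root.resolve(table).resolve(region).resolve(family);
        List<String> names = new ArrayList<>();
        if (!Files.isDirectory(familyDir)) {
            // The region or family directory does not exist (anymore).
            return names;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(familyDir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    names.add(entry.getFileName().toString());
                }
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway tree that mimics /table/region/family/.
        Path root = Files.createTempDirectory("hbase-root");
        Path family = Files.createDirectories(
                root.resolve("t1").resolve("r1").resolve("cf"));
        Files.createFile(family.resolve("storefile-b"));
        Files.createFile(family.resolve("storefile-a"));

        // Prints [storefile-a, storefile-b]
        System.out.println(listStoreFiles(root, "t1", "r1", "cf"));
    }
}
```

Calling the helper once per (region, family) pair gives the full
table -> region -> store file picture.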

https://github.com/oclc/HBase-Backup

Feel free to look at
getHBaseRegions() in
src/main/java/org/oclc/firefly/hadoop/backup/Backup.java

and

getListOfRegionFiles() in
src/main/java/org/oclc/firefly/hadoop/backup/BackupUtils.java

-----Original Message-----
From: Chen Song [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, May 16, 2012 12:22 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: Get the list of store/store files for a region via HBase
API

In the HBase API, there are classes defined at each level of the
structure, for example HRegionInterface, HRegionInfo (HRegion), Store,
and StoreFile. I am not sure why there doesn't seem to be a clear way
to traverse this hierarchy.

I can think of some downsides of using HDFS to get such information.

1. The user has to rebuild and maintain a similar hierarchy of classes
on their end from the HDFS layout. For a simple use case such as
getting counts, that is probably OK, but for use cases that need more
complicated meta information it could force the user to replicate much
of the work that has already been done in the HBase API.
2. If a region is removed from the meta store, there is a window
between the time it is removed from the meta store and the time it is
physically removed from HDFS. So the HDFS listing is not reliable.
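
One possible way to narrow the window described in point 2 is to
compare the region directories found on disk with the region names
currently known from the meta store, and only trust the overlap. The
sketch below illustrates that idea with plain sets. It is an
assumption, not an HBase API: the class RegionCrossCheck and both
methods are hypothetical, and the check does not remove the race, since
a region can still disappear between the two lookups.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class RegionCrossCheck {

    /** Region directories on disk that the meta store still knows about. */
    public static Set<String> liveRegions(Set<String> metaRegions,
                                          List<String> dirsOnDisk) {
        Set<String> live = new TreeSet<>(dirsOnDisk);
        live.retainAll(metaRegions);
        return live;
    }

    /** Region directories on disk that are no longer in the meta store. */
    public static Set<String> staleDirs(Set<String> metaRegions,
                                        List<String> dirsOnDisk) {
        Set<String> stale = new TreeSet<>(dirsOnDisk);
        stale.removeAll(metaRegions);
        return stale;
    }

    public static void main(String[] args) {
        // Made-up region names, for illustration only.
        Set<String> meta = new HashSet<>(Arrays.asList("r1", "r2"));
        List<String> onDisk = Arrays.asList("r1", "r2", "r3");

        System.out.println(liveRegions(meta, onDisk)); // prints [r1, r2]
        System.out.println(staleDirs(meta, onDisk));   // prints [r3]
    }
}
```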

Chen

On Tue, May 15, 2012 at 10:06 PM, Doug Meil
<[EMAIL PROTECTED]> wrote:

>
> You're just doing a directory listing in HDFS to get this information.
> That's a pretty lightweight operation (i.e., as opposed to
> transferring the contents of all the StoreFiles, etc.).
>
> If you want to go all the way down to the StoreFile, it's the only way
> I'm aware of at this time.
>
>
>
>
> On 5/15/12 5:37 PM, "Chen Song" <[EMAIL PROTECTED]> wrote:
>
> >Thanks Doug, that should work as the hierarchy is explicitly
> >reflected on HDFS. But is this the preferred way to do such
> >table/region/storefile traversal?
> >
> >I would like to avoid hitting HDFS directly if possible, as these
> >pieces of meta information live within the HBase world.
> >
> >Thanks
> >Chen
> >
> >On Tue, May 15, 2012 at 5:23 PM, Doug Meil
> ><[EMAIL PROTECTED]> wrote:
> >
> >>
> >> You can get the Table->Region->StoreFile information via HDFS.
> >> That is described here in the RefGuide:
> >>
> >> http://hbase.apache.org/book.html#trouble.namenode
> >>
> >>
> >>
> >>
> >>
> >>
> >> On 5/15/12 5:09 PM, "Chen Song" <[EMAIL PROTECTED]> wrote:
> >>
> >> >I am new to HBase and started working on a project which needs
> >> >meta information on HBase regions for a table. The version of
> >> >HBase I am using is 0.90.4.
> >> >
> >> >The use case is very simple.
> >> >
> >> >First, I want to get all regions for a table, which I can achieve
> >> >using the API call below.
> >> >
> >> >    HTable table = new HTable(conf, tableName);
> >> >
> >> >    Map<HRegionInfo, HServerAddress> regions = table.getRegionsInfo();
> >> >
> >> >
> >> >Second, I want to get the list of stores (and then store files)
> >> >for each region. This is where I am stuck, as I could not find a
> >> >way to do it by searching in the API. It seems that
> >> >HRegionInterface started supporting an API call to retrieve the
> >> >list of store files in 0.95-SNAPSHOT, but I don't want to upgrade
> >> >my HBase version. Below is how to get the corresponding
> >> >HRegionInterface.
> >> >
> >> >    HConnection connection = table.getConnection();
> >> >
> >> >    HRegionInterface regionInterface =
> >> >        connection.getHRegionConnection(regionAddress);
> >> >
> >> >The series of objects I would like to get in sequence is like:
> >>HRegionInfo
that
Chen Song
Mobile: 518-445-5096