HBase, mail # dev - HRegion for HRegionInfo?


Re: HRegion for HRegionInfo?
Jean-Marc Spaggiari 2013-03-29, 12:11
Hi Sean, thanks for the suggestion. I will look into that approach too.

Hi Enis,

I agree that they are very different. So far I'm using
getClusterStatus, getServers and getLoad, but I have to load ALL the
regions for ALL the servers even if I just want one table. Also, to
"rebuild" the region order, I need to call that on all the servers
first, which might take a while for very big clusters with very big
tables. For me (8 RS, 60 regions) it's efficient, but I have no idea
how long it will take to call HServerLoad load = status.getLoad(server)
1000 times. I will measure how long one call takes to see if it's
efficient enough.
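
To make the filtering step concrete, here is a minimal sketch in plain
Java. It does not use the HBase API: the nested map is a hypothetical
stand-in for what getLoad returns per server (region name to store file
size in MB), and the class and method names are made up for
illustration. It assumes region names start with the table name
followed by a comma, and it sorts by plain string order, which only
approximates the real META ordering.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class TableStoreFileSizes {

    // Assumption: a region name looks like "table,startKey,regionId.encoded.",
    // so the table name is everything before the first comma.
    static String tableOf(String regionName) {
        int comma = regionName.indexOf(',');
        return comma < 0 ? regionName : regionName.substring(0, comma);
    }

    // loadByServer is a hypothetical stand-in for the per-server load:
    // server name -> (region name -> store file size in MB).
    // Every server is visited, even those hosting no region of the table.
    static TreeMap<String, Integer> sizesForTable(
            Map<String, Map<String, Integer>> loadByServer, String table) {
        TreeMap<String, Integer> sizes = new TreeMap<>();
        for (Map<String, Integer> regions : loadByServer.values()) {
            for (Map.Entry<String, Integer> region : regions.entrySet()) {
                if (tableOf(region.getKey()).equals(table)) {
                    sizes.put(region.getKey(), region.getValue());
                }
            }
        }
        return sizes; // sorted by region name (plain string order)
    }

    public static void main(String[] args) {
        Map<String, Integer> rs1 = new HashMap<>();
        rs1.put("t1,m,2.", 30);
        rs1.put("t2,,1.", 5);
        Map<String, Integer> rs2 = new HashMap<>();
        rs2.put("t1,,1.", 10);

        Map<String, Map<String, Integer>> loadByServer = new HashMap<>();
        loadByServer.put("rs1", rs1);
        loadByServer.put("rs2", rs2);

        System.out.println(sizesForTable(loadByServer, "t1"));
    }
}
```

The cost is proportional to the total number of regions in the cluster,
not to the number of regions of the table, which is the concern above.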

The initial idea was to scan META, where I can find the region
names for a specific table in the right order, and from that build the
HRegion objects. That way, if on a 1000-node cluster the table is
on just 10 of them, I don't have to wait for the 1000 calls to finish.
But I'm not able to get the RegionLoad and the HRegion objects from
that.
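
The narrowing step of that idea can be sketched in plain Java as well.
Again this is not the HBase API: the sorted map is a hypothetical
stand-in for META rows (region name to hosting server), and the names
are invented for illustration. It walks the rows that start with the
table prefix and collects the distinct servers, so only those servers
would need to be asked for their load. Whether that actually saves time
depends on how the load is fetched, which the sketch does not address,
and it does not give access to RegionLoad or HRegion either.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

class MetaScanSketch {

    // metaRows is a hypothetical stand-in for a META scan:
    // region name -> name of the server hosting that region, sorted by name.
    static Set<String> serversHosting(NavigableMap<String, String> metaRows,
                                      String table) {
        String prefix = table + ",";
        Set<String> servers = new LinkedHashSet<>();
        // Start at the first row at or after the prefix and stop at the
        // first row that belongs to another table.
        for (Map.Entry<String, String> row : metaRows.tailMap(prefix, true).entrySet()) {
            if (!row.getKey().startsWith(prefix)) {
                break;
            }
            servers.add(row.getValue());
        }
        return servers; // distinct servers, in order of first appearance
    }

    public static void main(String[] args) {
        NavigableMap<String, String> meta = new TreeMap<>();
        meta.put("t0,,1.", "rs4");
        meta.put("t1,,1.", "rs1");
        meta.put("t1,m,2.", "rs2");
        meta.put("t1,t,3.", "rs1");
        meta.put("t2,,1.", "rs3");

        System.out.println(serversHosting(meta, "t1"));
    }
}
```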

So I will continue with getClusterStatus until I find a better solution.

JM

2013/3/29 Enis Söztutar <[EMAIL PROTECTED]>:
> HRegionInfo and HRegion are very different beasts. HRegion is the main
> datastructure for region internals. You won't have access to it from the
> client side. HRegionInfo is just a metadata holder.
>
> Enis
>
>
> On Fri, Mar 29, 2013 at 1:00 AM, Sean Zhong <[EMAIL PROTECTED]> wrote:
>
>> Is recursive iteration over the HDFS table folder an option? The performance
>> should be good!
>>
>>
>>
>>
>> On Sun, Mar 24, 2013 at 1:24 AM, Jean-Marc Spaggiari <
>> [EMAIL PROTECTED]> wrote:
>>
>> > Hi Ted,
>> >
>> > There is no JIRA opened for that since I was not sure if it was
>> > something required/useful/missing/etc.
>> >
>> > But maybe I'm going the wrong way. The idea is, for a given table, I
>> > want to have the size of all the store files, per region.
>> >
>> > So far I'm using getClusterStatus so I can get all the regions for all
>> > the tables. And then retrieve all the store files size.
>> >
>> > But in an environment where there are hundreds of tables with thousands
>> > of columns, it might take a bit too long to get all the regions for a
>> > specific table.
>> >
>> > So the idea is to scan the META table to get all the regions for the
>> > table I'm looking for, and from there, being able to get the HRegion
>> > object for each of those regions...
>> >
>> > JM
>> >
>> > 2013/3/23 Ted Yu <[EMAIL PROTECTED]>:
>> > > HRegion is used on region server side.
>> > > Is this tracking on server side ?
>> > > If there is JIRA, giving us the JIRA number would help.
>> > >
>> > > Cheers
>> > >
>> > > On Sat, Mar 23, 2013 at 9:01 AM, Jean-Marc Spaggiari <
>> > > [EMAIL PROTECTED]> wrote:
>> > >
>> > >> Hi Ted,
>> > >>
>> > >> Yes, it's for 0.94 and newer.
>> > >>
>> > >> For a given table, and a given region, I want to get the storeFileSize.
>> > >>
>> > >> I want to be able to track the storeFileSize per region per table. And
>> > >> since I already have the HRegionInfo I'm wondering if there is a way
>> > >> to use this to get the HRegion to call getStorefileSizeMB.
>> > >>
>> > >> I already found a few ways to get getStorefileSizeMB for a region,
>> > >> but none is clean and easy using an HRegionInfo parameter.
>> > >>
>> > >> JM
>> > >>
>> > >> 2013/3/23 Ted Yu <[EMAIL PROTECTED]>:
>> > >> > Can you clarify your use case ?
>> > >> >
>> > >> > This is for 0.94 and newer releases, I assume.
>> > >> >
>> > >> > On Sat, Mar 23, 2013 at 8:37 AM, Jean-Marc Spaggiari <
>> > >> > [EMAIL PROTECTED]> wrote:
>> > >> >
>> > >> >> Hi,
>> > >> >>
>> > >> >> What's the best and cleanest way to get the HRegion object from the
>> > >> >> HRegionInfo one? Is there any utility class doing that?
>> > >> >>
>> > >> >> Thanks,
>> > >> >>
>> > >> >> JM
>> > >> >>
>> > >>
>> >
>>