HBase, mail # dev - HRegion for HRegionInfo?

Re: HRegion for HRegionInfo?
Ted 2013-03-29, 12:20
Calculating size of all the store files seems to be an intermediate step.

Are you going to perform some action based on the result ?

I am asking this question because access to HRegion on client side is not provided for the reason Enis cited.

If you have a case for server side improvement, I'd love to hear about it.
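
For reference, here is a minimal sketch of the per-table filtering step discussed in the quoted messages below. It is plain Java with no HBase dependency. The nested map is a hypothetical stand-in for the per-server region loads obtained through the getClusterStatus / getServers / getLoad calls mentioned in the thread. The class name, method names, server names, and sample region names are all made up for illustration. It assumes region names start with the table name followed by a comma, and that ASCII string order matches the byte order HBase uses for region names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class RegionSizeSketch {

    /**
     * Collects store file sizes (MB) for one table from per-server region loads.
     * Outer key: server name. Inner map: region name -> store file size in MB.
     * Both maps are hypothetical stand-ins for the cluster status data.
     */
    static Map<String, Integer> storeFileSizesForTable(
            Map<String, Map<String, Integer>> regionLoadsByServer, String table) {
        // Assumption: a region name begins with "<table>,". The trailing comma
        // keeps table "t1" from also matching regions of table "t10".
        String prefix = table + ",";
        // TreeMap sorts by region name, which restores start-key order
        // for ASCII names regardless of which server hosts each region.
        TreeMap<String, Integer> sizes = new TreeMap<>();
        for (Map<String, Integer> regions : regionLoadsByServer.values()) {
            for (Map.Entry<String, Integer> e : regions.entrySet()) {
                if (e.getKey().startsWith(prefix)) {
                    sizes.put(e.getKey(), e.getValue());
                }
            }
        }
        return sizes;
    }

    /** Sums the per-region sizes returned by storeFileSizesForTable. */
    static int totalMB(Map<String, Integer> sizes) {
        int total = 0;
        for (int mb : sizes.values()) {
            total += mb;
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up sample data: two servers hosting regions of two tables.
        Map<String, Map<String, Integer>> byServer = new HashMap<>();
        Map<String, Integer> rs1 = new HashMap<>();
        rs1.put("t1,m,1364000000001.bbbb.", 20);
        rs1.put("t10,,1364000000002.cccc.", 7);
        Map<String, Integer> rs2 = new HashMap<>();
        rs2.put("t1,,1364000000000.aaaa.", 10);
        byServer.put("rs1", rs1);
        byServer.put("rs2", rs2);

        Map<String, Integer> sizes = storeFileSizesForTable(byServer, "t1");
        for (Map.Entry<String, Integer> e : sizes.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue() + " MB");
        }
        System.out.println("total: " + totalMB(sizes) + " MB");
    }
}
```

The sketch still walks every server's region list, which is exactly the cost raised in the thread. It only shows how the result can be narrowed to one table and put back in region order once the loads are in hand.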

On Mar 29, 2013, at 5:11 AM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:

> Hi Sean, thanks for the suggestion. I will take a look that way too.
> Hi Enis,
> I agree that they are very different. So far I'm using
> getClusterStatus, getServers and getLoad, but I have to load ALL the
> regions for ALL the servers even if I just want one table. Also,
> to "rebuild" the region order, I need to call that on all the
> servers first, which might take a while for very big clusters with
> very big tables. For me (8 RS, 60 regions) it's efficient, but I have
> no idea how long it will take to call HServerLoad
> load = status.getLoad(server) 1000 times. I will time a single call
> to see if it's efficient.
> The initial idea was to scan META, where I can find the region
> names for a specific table in the right order, and from that build the
> HRegion objects. That way, if on a 1000-node cluster the table is
> on just 10 of them, I don't have to wait for the 1000 calls to end.
> But I'm not able to get the RegionLoad and the HRegion objects from
> that.
> So I will continue with getClusterStatus until I find a better solution.
> JM
> 2013/3/29 Enis Söztutar <[EMAIL PROTECTED]>:
>> HRegionInfo and HRegion are very different beasts. HRegion is the main
>> datastructure for region internals. You won't have access to it from the
>> client side. HRegionInfo is just a metadata holder.
>> Enis
>> On Fri, Mar 29, 2013 at 1:00 AM, Sean Zhong <[EMAIL PROTECTED]> wrote:
>>> Is recursive iteration over the HDFS table folder an option? The
>>> performance should be good!
>>> On Sun, Mar 24, 2013 at 1:24 AM, Jean-Marc Spaggiari <
>>> [EMAIL PROTECTED]> wrote:
>>>> Hi Ted,
>>>> There is no JIRA opened for that since I was not sure if it was
>>>> something required/useful/missing/etc.
>>>> But maybe I'm going the wrong way. The idea is, for a given table, I
>>>> want to have the size of all the store files, per region.
>>>> So far I'm using getClusterStatus so I can get all the regions for all
>>>> the tables. And then retrieve all the store files size.
>>>> But in an environment where there are hundreds of tables with thousands
>>>> of columns, it might take a bit too long to get all the regions for a
>>>> specific table.
>>>> So the idea is to scan the META table to get all the regions for the
>>>> table I'm looking for, and from there be able to get the HRegion
>>>> object for each of those regions...
>>>> JM
>>>> 2013/3/23 Ted Yu <[EMAIL PROTECTED]>:
>>>>> HRegion is used on region server side.
>>>>> Is this tracking on server side ?
>>>>> If there is JIRA, giving us the JIRA number would help.
>>>>> Cheers
>>>>> On Sat, Mar 23, 2013 at 9:01 AM, Jean-Marc Spaggiari <
>>>>> [EMAIL PROTECTED]> wrote:
>>>>>> Hi Ted,
>>>>>> Yes, it's for 0.94 and newer.
>>>>>> For a given table, and a given region, I want to get the storeFileSize.
>>>>>> I want to be able to track the storeFileSize per region per table. And
>>>>>> since I already have the HRegionInfo I'm wondering if there is a way
>>>>>> to use this to get the HRegion to call getStorefileSizeMB.
>>>>>> I already found a few ways to get the getStorefileSizeMB for a region,
>>>>>> but none that is clean and easy using an HRegionInfo parameter.
>>>>>> JM
>>>>>> 2013/3/23 Ted Yu <[EMAIL PROTECTED]>:
>>>>>>> Can you clarify your use case ?
>>>>>>> This is for 0.94 and newer releases, I assume.
>>>>>>> On Sat, Mar 23, 2013 at 8:37 AM, Jean-Marc Spaggiari <