Re: Meaning of storefileIndexSize
Hi Stack,

On 18/05/10 16:51, Stack wrote:
>> after some tuning, like increasing the hfile block size to 128KB, I have
>> noticed that the storefileIndexSize is now half of what it was before
>> (~250). Is storefileIndexSize the size of the in-memory hfile block index?
>>
> Yes.
>
> So, yes, doubling the block size should halve the index size.
>
> How come your index is so big?  Do you have big keys?  Lots of data?
> Lots of storefiles?
>    
We have 90M rows; each row varies from a few hundred kilobytes
to 8MB.

At the same time, I also changed another parameter,
hbase.hregion.max.filesize. It was set to 1GB (from a previous test), and
I switched it back to the default value (256MB).
So, in the previous tests, there were only a few region files (around
250), but a very large index file size (>500).

In my last test (hregion.max.filesize=256MB, block size=128KB), the number
of region files increased (I now have more than 1000 region files), but
the index file size is now less than 200.

Do you think hregion.max.filesize could have an impact on the index file
size?

> Looking in HRegionServer I see that it's calculated so:
>
>   storefileIndexSizeMB = (int)(store.getStorefilesIndexSize()/1024/1024);
>    
So, storefileIndexSize indicates the number of MB of heap used by the
index. And, in our case, 500 was excessive given that our
region server is limited to 1GB of heap.
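
To make the arithmetic concrete, here is a rough sketch of why doubling the
block size should halve the index. It assumes one index entry per block and
that every block is exactly the configured size. The class name, the helper
names, the 1TB of data and the 64 bytes per entry are all made-up
illustration values, not HBase APIs or measurements from our cluster. Only
the conversion in toMB mirrors the HRegionServer line quoted above.

```java
public class IndexSizeEstimate {

    // Hypothetical model: one index entry per block, each entry costing
    // bytesPerEntry (roughly the key size plus some fixed overhead).
    static long estimateIndexBytes(long dataBytes, long blockBytes, long bytesPerEntry) {
        long blocks = (dataBytes + blockBytes - 1) / blockBytes; // ceiling division
        return blocks * bytesPerEntry;
    }

    // Same truncating bytes-to-MB conversion as the quoted HRegionServer line.
    static int toMB(long bytes) {
        return (int) (bytes / 1024 / 1024);
    }

    public static void main(String[] args) {
        long dataBytes = 1L << 40;  // assumed: 1TB of store file data
        long bytesPerEntry = 64;    // assumed: average cost of one index entry

        long at64k = estimateIndexBytes(dataBytes, 64 * 1024, bytesPerEntry);
        long at128k = estimateIndexBytes(dataBytes, 128 * 1024, bytesPerEntry);

        System.out.println("64KB blocks:  " + toMB(at64k) + " MB of index");
        System.out.println("128KB blocks: " + toMB(at128k) + " MB of index");
    }
}
```

Under this model the index size depends only on the total amount of data,
the block size and the key size, which is why I am unsure whether the
region file size should matter at all.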

Thanks.
--
Renaud Delbru