Re: Questions about HBase
Why are there so many misses for the index blocks? What block cache
memory size do you use?
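
For reference on the second question: the block cache is sized as a fraction
of the region server heap via hfile.block.cache.size. A minimal sketch of
reading that setting, assuming the 0.94-era configuration API (the 0.25
default used here is an assumption, not confirmed by this thread):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class BlockCacheSizing {
        public static void main(String[] args) {
            Configuration conf = HBaseConfiguration.create();
            // Fraction of the region server heap reserved for the block
            // cache; 0.25f is assumed as the 0.94-era default.
            float fraction = conf.getFloat("hfile.block.cache.size", 0.25f);
            long heapBytes = Runtime.getRuntime().maxMemory();
            System.out.println("Block cache budget: ~"
                + (long) (heapBytes * fraction) / (1024 * 1024) + " MB");
        }
    }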

On Wed, Jun 5, 2013 at 12:37 PM, ramkrishna vasudevan <
[EMAIL PROTECTED]> wrote:

> I get your point, Pankaj.
> Going through the code to confirm it:
>     // Data index. We also read statistics about the block index written
> after
>     // the root level.
>     dataBlockIndexReader.readMultiLevelIndexRoot(
>         blockIter.nextBlockWithBlockType(BlockType.ROOT_INDEX),
>         trailer.getDataIndexCount());
>
>     // Meta index.
>     metaBlockIndexReader.readRootIndex(
>         blockIter.nextBlockWithBlockType(BlockType.ROOT_INDEX),
>         trailer.getMetaIndexCount());
>
> We read the root level of the multilevel index and the actual root index.
> So as and when we need new index blocks we will be hitting the disk,
> and your observation is correct.  Sorry if I confused you on this.
> The new version of HFile was mainly to address the concern in the previous
> version, where the entire index was in memory.  Version V2 addressed
> that concern by keeping only the root level (something like metadata of the
> indices) in memory; from there you can load further index blocks on demand.
> But there is a chance that if your region size is small you may have only
> one level, and then the entire index may be in memory.
>
> Regards
> Ram
>
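
A rough illustration of the lookup Ram describes, not the actual HBase code:
the root level stays on heap from file-open onward, and any deeper index
block is fetched through the block cache, hitting disk on a miss. All type
and method names here are hypothetical:

    import java.util.Map;
    import java.util.NavigableMap;
    import java.util.TreeMap;

    // Hypothetical two-level index: the root maps each leaf's start key to
    // that leaf block; each leaf maps data-block start keys to file offsets.
    public class MultiLevelIndexSketch {
        // Root level: loaded once when the HFile is opened, kept in memory.
        private final NavigableMap<String, NavigableMap<String, Long>> root =
            new TreeMap<String, NavigableMap<String, Long>>();

        // Stands in for reading a non-root index block: block cache first,
        // then disk (the "index block miss" seen in the logs below).
        private NavigableMap<String, Long> fetchLeafBlock(
                NavigableMap<String, Long> leaf) {
            return leaf;
        }

        public Long locateDataBlock(String key) {
            Map.Entry<String, NavigableMap<String, Long>> r =
                root.floorEntry(key);
            if (r == null) {
                return null; // key sorts before the first block
            }
            NavigableMap<String, Long> leaf = fetchLeafBlock(r.getValue());
            Map.Entry<String, Long> e = leaf.floorEntry(key);
            return e == null ? null : e.getValue();
        }
    }

With a single-level index the root entries point straight at data blocks and
the fetch step disappears, which matches Ram's note that a small region may
keep the entire index in memory.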
>
> On Wed, Jun 5, 2013 at 11:56 AM, Pankaj Gupta <[EMAIL PROTECTED]>
> wrote:
>
> > Sorry, forgot to mention that I added the log statements to the method
> > readBlock in HFileReaderV2.java. I'm on hbase 0.94.2.
> >
> >
> > On Tue, Jun 4, 2013 at 11:16 PM, Pankaj Gupta <[EMAIL PROTECTED]>
> > wrote:
> >
> > > Some context on how I observed bloom filters being loaded constantly. I
> > > added the following logging statements to HFileReaderV2.java:
> > > }
> > >         if (!useLock) {
> > >           // check cache again with lock
> > >           useLock = true;
> > >           continue;
> > >         }
> > >
> > >         // Load block from filesystem.
> > >         long startTimeNs = System.nanoTime();
> > >         HFileBlock hfileBlock = fsBlockReader.readBlockData(dataBlockOffset,
> > >             onDiskBlockSize, -1, pread);
> > >         hfileBlock = dataBlockEncoder.diskToCacheFormat(hfileBlock,
> > >             isCompaction);
> > >         validateBlockType(hfileBlock, expectedBlockType);
> > >         passSchemaMetricsTo(hfileBlock);
> > >         BlockCategory blockCategory = hfileBlock.getBlockType().getCategory();
> > >
> > > // My logging statements ---->
> > >         if(blockCategory == BlockCategory.INDEX) {
> > >           LOG.info("index block miss, reading from disk " + cacheKey);
> > >         } else if (blockCategory == BlockCategory.BLOOM) {
> > >           LOG.info("bloom block miss, reading from disk " + cacheKey);
> > >         } else {
> > >           LOG.info("block miss other than index or bloom, reading from
> > > disk " + cacheKey);
> > >         }
> > > //-------------->
> > >         final long delta = System.nanoTime() - startTimeNs;
> > >         HFile.offerReadLatency(delta, pread);
> > >         getSchemaMetrics().updateOnCacheMiss(blockCategory, isCompaction,
> > >             delta);
> > >
> > >         // Cache the block if necessary
> > >         if (cacheBlock && cacheConf.shouldCacheBlockOnRead(
> > >             hfileBlock.getBlockType().getCategory())) {
> > >           cacheConf.getBlockCache().cacheBlock(cacheKey, hfileBlock,
> > >               cacheConf.isInMemory());
> > >         }
> > >
> > >         if (hfileBlock.getBlockType() == BlockType.DATA) {
> > >           HFile.dataBlockReadCnt.incrementAndGet();
> > >         }
> > >
> > > With these in place I saw the following statements in the log:
> > > 2013-06-05 01:04:55,281 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: index block miss, reading from disk 11958ab7a4a1492e853743b02e1bd7b1_30361506
> > > 2013-06-05 01:05:00,579 INFO org.apache.hadoop.hbase.io.hfile.HFileReaderV2: index block miss,
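
As an aside, instead of patching readBlock one can read the cumulative
counters the block cache already keeps. A small sketch, assuming the 0.94
CacheConfig/BlockCache APIs behave as described here:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.io.hfile.BlockCache;
    import org.apache.hadoop.hbase.io.hfile.CacheConfig;

    public class CacheStatsProbe {
        public static void main(String[] args) {
            Configuration conf = HBaseConfiguration.create();
            // CacheConfig hands back the process-wide block cache instance.
            BlockCache cache = new CacheConfig(conf).getBlockCache();
            // Cumulative hit/miss counters since the cache was created.
            System.out.println("hits=" + cache.getStats().getHitCount()
                + " misses=" + cache.getStats().getMissCount());
        }
    }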