HBase >> mail # user >> Questions about HBase


Thread:
Pankaj Gupta 2013-06-05, 02:15
Ted Yu 2013-06-05, 03:44
ramkrishna vasudevan 2013-06-05, 03:14
Ted Yu 2013-06-05, 04:29
Anoop John 2013-06-05, 05:26
Pankaj Gupta 2013-06-05, 05:09
Asaf Mesika 2013-06-05, 05:27
ramkrishna vasudevan 2013-06-05, 05:43
Asaf Mesika 2013-06-05, 05:52
ramkrishna vasudevan 2013-06-05, 06:05
Pankaj Gupta 2013-06-05, 06:16
Pankaj Gupta 2013-06-05, 06:26
ramkrishna vasudevan 2013-06-05, 07:07
Anoop John 2013-06-05, 08:24
Pankaj Gupta 2013-06-06, 02:52
ramkrishna vasudevan 2013-06-06, 03:15
Re: Questions about HBase
> I feel that warming up the block and
> index cache could be a useful feature for many workflows. Would it be a
> good idea to have a JIRA for that?

This would go against the concept of the multi-level index block structure
in HFile V2, where we don't want the whole index data to be loaded at once
initially and kept in memory, mainly when the HFile is very big. So the
whole point is that it is up to the usage. You can experiment with that and
see the perf difference. As long as it can be made an optional warm-up, it
can help some use cases.
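The trade-off described above can be illustrated with a toy model (hypothetical names, not HBase code): only the root level of a two-level index stays resident, and leaf index blocks are "read from disk" the first time a lookup needs them, so a cold reader takes misses spread over time. The search inside a block is a linear scan, mirroring what blockSeek in HFileReaderV2 does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy sketch of a multi-level index: root level in memory, leaf index
// blocks loaded lazily on first use (a warm-up would preload them all).
class LazyTwoLevelIndex {
    static final int LEAF_SIZE = 4;                        // keys per leaf block
    final TreeMap<String, Integer> root = new TreeMap<>(); // first key of leaf -> leaf id
    final List<List<String>> disk = new ArrayList<>();     // leaf blocks "on disk"
    final Map<Integer, List<String>> cache = new HashMap<>();
    int leafLoads = 0;                                     // simulated disk reads

    LazyTwoLevelIndex(List<String> sortedKeys) {
        for (int i = 0; i < sortedKeys.size(); i += LEAF_SIZE) {
            List<String> leaf = new ArrayList<>(
                sortedKeys.subList(i, Math.min(i + LEAF_SIZE, sortedKeys.size())));
            root.put(leaf.get(0), disk.size());
            disk.add(leaf);
        }
    }

    // True if key exists; loads at most one leaf block, lazily.
    boolean contains(String key) {
        Map.Entry<String, Integer> e = root.floorEntry(key);
        if (e == null) return false;
        List<String> leaf = cache.get(e.getValue());
        if (leaf == null) {          // leaf not warmed up yet: "hit the disk"
            leafLoads++;
            leaf = disk.get(e.getValue());
            cache.put(e.getValue(), leaf);
        }
        // Linear scan inside the block, like blockSeek in HFileReaderV2.
        for (String k : leaf) if (k.equals(key)) return true;
        return false;
    }
}
```

With eight keys and a leaf size of four, the first lookup in each leaf costs one simulated disk read and later lookups in the same leaf cost none, which is the "misses spread over a period of time" pattern seen in the logs.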

One more confirmation: these index block misses were there only at the
initial time, and later there is no such indication from the logs?

Good discussion ....

-Anoop-

On Thu, Jun 6, 2013 at 8:22 AM, Pankaj Gupta <[EMAIL PROTECTED]> wrote:

> I'm not sure what caused so many index block misses. At the time I ran the
> experiment, I had over 12 GB of RAM assigned to the block cache. My
> understanding is that since I had restarted HBase before running this
> experiment, it was basically loading index blocks as and when needed, and
> thus the index misses were spread over a period of time. I monitored the
> region server while running this debugging session and didn't see a single
> block eviction, so it couldn't be that the index blocks were being kicked
> out by something else.
>
> I've got some really good information in this thread and I thank you all.
> The blockSeek function in HFileReaderV2 clearly confirms the linear nature
> of scan for finding a key in a block. I feel that warming up the block and
> index cache could be a useful feature for many workflows. Would it be a
> good idea to have a JIRA for that?
>
> Thanks,
> Pankaj
>
>
> On Wed, Jun 5, 2013 at 1:24 AM, Anoop John <[EMAIL PROTECTED]> wrote:
>
> > Why are there so many misses for the index blocks? What block cache
> > memory do you use?
> >
> > On Wed, Jun 5, 2013 at 12:37 PM, ramkrishna vasudevan <
> > [EMAIL PROTECTED]> wrote:
> >
> > > I get your point Pankaj.
> > > Going through the code to confirm it:
> > >     // Data index. We also read statistics about the block index
> > >     // written after the root level.
> > >     dataBlockIndexReader.readMultiLevelIndexRoot(
> > >         blockIter.nextBlockWithBlockType(BlockType.ROOT_INDEX),
> > >         trailer.getDataIndexCount());
> > >
> > >     // Meta index.
> > >     metaBlockIndexReader.readRootIndex(
> > >         blockIter.nextBlockWithBlockType(BlockType.ROOT_INDEX),
> > >         trailer.getMetaIndexCount());
> > >
> > > We read the root level of the multi-level index and the actual root
> > > index. So as and when we need new index blocks we will be hitting the
> > > disk, and your observation is correct.  Sorry if I had confused you on
> > > this.
> > > The new version of HFile was mainly to address the concern in the
> > > previous version, where the entire index was kept in memory.  Version
> > > V2 addressed that concern by keeping only the root level (something
> > > like metadata of the indices) in memory, and from there you can get to
> > > the new index blocks.
> > > But there is a chance that if your region size is small you may have
> > > only one level, and then the entire thing may be in memory.
> > >
> > > Regards
> > > Ram
> > >
> > >
> > > On Wed, Jun 5, 2013 at 11:56 AM, Pankaj Gupta <[EMAIL PROTECTED]>
> > > wrote:
> > >
> > > > Sorry, forgot to mention that I added the log statements to the
> > > > method readBlock in HFileReaderV2.java. I'm on hbase 0.94.2.
> > > >
> > > >
> > > > On Tue, Jun 4, 2013 at 11:16 PM, Pankaj Gupta <[EMAIL PROTECTED]>
> > > > wrote:
> > > >
> > > > > Some context on how I observed bloom filters being loaded
> > > > > constantly. I added the following logging statements to
> > > > > HFileReaderV2.java:
> > > > > }
> > > > >         if (!useLock) {
> > > > >           // check cache again with lock
> > > > >           useLock = true;
> > > > >           continue;
> > > > >         }
> > > > >
> > > > >         // Load block from filesystem.
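The snippet quoted above follows a check-then-lock caching pattern: probe the cache without the lock first, and on a miss re-check under the lock before loading, so concurrent readers trigger only one load per block. A minimal standalone sketch of that pattern (hypothetical names, not the actual HBase code):

```java
import java.util.concurrent.ConcurrentHashMap;

// Toy sketch of the readBlock pattern: cache probe without the lock,
// then a second probe under the lock before loading from the
// "filesystem", so each block is loaded at most once.
class CachingBlockReader {
    final ConcurrentHashMap<Long, byte[]> cache = new ConcurrentHashMap<>();
    int fsReads = 0;                      // simulated filesystem reads

    byte[] readBlock(long offset) {
        byte[] block = cache.get(offset); // first check: no lock held
        if (block != null) return block;
        synchronized (this) {             // check cache again with lock
            block = cache.get(offset);
            if (block != null) return block;
            fsReads++;                    // load block from "filesystem"
            block = new byte[]{(byte) offset};
            cache.put(offset, block);
            return block;
        }
    }
}
```

The point of the unlocked first probe is that the common case (cache hit) pays no lock contention; only misses serialize on the lock, and the second check under the lock prevents duplicate loads when two threads miss at the same time.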