RE: Converting byte[] to ByteBuffer
In my experience, CPU usage on HBase is very high for highly concurrent applications.  You can expect the CMS GC to chew up 2-3 cores at sufficient throughput and the remaining cores to be spent in CSLM/MemStore, KeyValue comparators, queues, etc.
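
To make that concrete, below is a rough sketch of the kind of hot path the MemStore exercises: every put and get walks a ConcurrentSkipListMap ordered by a byte[] comparator, so comparator calls and skip-list traversal account for much of the non-GC CPU. The class is a simplified stand-in made up for illustration, not HBase's actual MemStore or KeyValue code.

    import java.util.concurrent.ConcurrentSkipListMap;

    // Simplified stand-in for HBase's MemStore: a concurrent skip list
    // ordered by a byte[] comparator. Real HBase stores KeyValues under a
    // KVComparator; this sketch uses raw row keys to show where the
    // comparator cost comes from.
    public class MemStoreSketch {
        private final ConcurrentSkipListMap<byte[], byte[]> kvs =
            new ConcurrentSkipListMap<>(MemStoreSketch::compareBytes);

        // Lexicographic unsigned byte comparison. This runs on every
        // skip-list insert and seek, which is why comparators show up so
        // prominently in CPU profiles of a busy region server.
        static int compareBytes(byte[] a, byte[] b) {
            int len = Math.min(a.length, b.length);
            for (int i = 0; i < len; i++) {
                int d = (a[i] & 0xff) - (b[i] & 0xff);
                if (d != 0) return d;
            }
            return a.length - b.length;
        }

        void put(byte[] row, byte[] value) { kvs.put(row, value); }
        byte[] get(byte[] row) { return kvs.get(row); }
    }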

> -----Original Message-----
> From: Jason Rutherglen [mailto:[EMAIL PROTECTED]]
> Sent: Sunday, July 10, 2011 3:05 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Converting byte[] to ByteBuffer
>
> Ted,
>
> Interesting.  I think we need to take a deeper look at why essentially
> turning off the caching of uncompressed blocks doesn't [seem to] matter.
> My guess is that decompressing on the fly is cheaper than hogging memory
> away from the system IO cache for the JVM heap.
>
> I.e., CPU is cheaper than disk IO.
>
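
(To illustrate the trade-off Jason describes, here is a minimal, hypothetical sketch of a block cache that keeps only compressed bytes and inflates on every read; the class name and the GZIP choice are assumptions, not HBase's actual block cache API. The win is a small, stable heap; the cost is one decompression pass per access.)

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.zip.GZIPInputStream;

    // Illustrative only: cache compressed bytes, decompress per read.
    // Trades CPU (inflate on access) for a much smaller GC-visible
    // footprint than caching uncompressed blocks would have.
    class CompressedBlockCache {
        private final Map<Long, byte[]> blocks = new ConcurrentHashMap<>();

        void put(long blockId, byte[] gzippedBlock) {
            blocks.put(blockId, gzippedBlock);
        }

        // Decompress on the fly: cheaper than a disk seek, and the
        // uncompressed copy is short-lived garbage rather than a
        // long-lived heap resident.
        byte[] read(long blockId) throws IOException {
            byte[] gz = blocks.get(blockId);
            if (gz == null) return null;  // miss: caller goes to disk
            try (GZIPInputStream in =
                     new GZIPInputStream(new ByteArrayInputStream(gz))) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            }
        }
    }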
> Further (I asked this previously), where is the general CPU usage in HBase?
> Binary search on keys for seeking, skip list reads and writes, and [maybe]
> MapReduce jobs?  The rest should more or less be in the noise (or is
> general Java overhead).
>
> I'd be curious to know the avg CPU consumption of an active HBase system.
>
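
(On the binary-search part of that question: seeking amounts to a search over the sorted first keys of each block. A hedged sketch of that index lookup follows, illustrative rather than HBase's actual block index code; Arrays.compareUnsigned needs Java 9+.)

    import java.util.Arrays;

    // Illustrative block-index seek: binary-search the sorted first key
    // of each block for the last block whose first key is <= the target.
    // This comparator work is part of the per-read CPU cost above.
    class BlockIndexSketch {
        private final byte[][] firstKeys;  // sorted first key per block

        BlockIndexSketch(byte[][] sortedFirstKeys) {
            this.firstKeys = sortedFirstKeys;
        }

        // Returns the block index that may contain the target row,
        // or -1 if the target sorts before every block.
        int seekBlock(byte[] target) {
            int lo = 0, hi = firstKeys.length - 1, found = -1;
            while (lo <= hi) {
                int mid = (lo + hi) >>> 1;
                if (Arrays.compareUnsigned(firstKeys[mid], target) <= 0) {
                    found = mid;
                    lo = mid + 1;
                } else {
                    hi = mid - 1;
                }
            }
            return found;
        }
    }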
> On Sat, Jul 9, 2011 at 11:14 PM, Ted Dunning <[EMAIL PROTECTED]> wrote:
> > No.  The JNI is below the HDFS compatible API.  Thus the changed code
> > is in the hadoop.jar and associated jars and .so's that MapR supplies.
> >
> > The JNI still runs in the HBase memory image, though, so it can make
> > data available faster.
> >
> > The cache involved includes the cache of disk blocks (not HBase
> > memcache blocks) in the JNI and in the filer sub-system.
> >
> > The detailed reasons why more caching in the file system and less in
> > HBase makes the overall system faster are not completely worked out,
> > but the general outlines are pretty clear.  There are likely several
> > factors at work in any case, including less GC cost due to a smaller
> > memory footprint, caching compressed blocks instead of Java
> > structures, and simplification due to a clean memory hand-off with an
> > associated strong demarcation of where different memory allocators
> > have jurisdiction.
> >
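
(Ted's "clean memory hand-off" is also where this thread's subject comes in: handing blocks around as ByteBuffers rather than byte[] marks the boundary explicitly, and a direct buffer moves the bytes off the Java heap altogether. A sketch using only standard java.nio calls; the class itself is hypothetical.)

    import java.nio.ByteBuffer;

    // Two ways to hand a block across a cache boundary, illustrating the
    // byte[] -> ByteBuffer conversion in the subject line.
    class BlockHandoff {
        // Zero-copy wrap: the buffer shares the byte[] backing array, so
        // the block stays on the Java heap and the GC still traces it.
        static ByteBuffer wrapOnHeap(byte[] block) {
            return ByteBuffer.wrap(block).asReadOnlyBuffer();
        }

        // Copy into a direct buffer: the bytes now live outside the Java
        // heap -- a hand-off to memory the JVM's GC no longer has
        // jurisdiction over, at the price of one copy.
        static ByteBuffer copyOffHeap(byte[] block) {
            ByteBuffer direct = ByteBuffer.allocateDirect(block.length);
            direct.put(block);
            direct.flip();
            return direct;
        }
    }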
> > On Sat, Jul 9, 2011 at 3:48 PM, Jason Rutherglen
> > <[EMAIL PROTECTED]> wrote:
> >
> >> I'm a little confused; I was told none of the HBase code changed with
> >> MapR.  If the HBase (not the OS) block cache has a JNI implementation,
> >> then that part of the HBase code changed.
> >> On Jul 9, 2011 11:19 AM, "Ted Dunning" <[EMAIL PROTECTED]> wrote:
> >> > MapR does help with the GC because it *does* have a JNI interface
> >> > into an external block cache.
> >> >
> >> > Typical configurations with MapR trim HBase down to the minimal
> >> > viable size and increase the file system cache correspondingly.
> >> >
> >> > On Fri, Jul 8, 2011 at 7:52 PM, Jason Rutherglen
> >> > <[EMAIL PROTECTED]> wrote:
> >> >
> >> >> MapR doesn't help with the GC issues. If MapR had a JNI interface
> >> >> into an external block cache then that'd be a different story. :)
> >> >> And I'm sure it's quite doable.
> >> >>
> >>
> >