RE: Converting byte[] to ByteBuffer
In my experience, CPU usage in HBase is very high for highly concurrent applications.  You can expect the CMS garbage collector to chew up 2-3 cores at sufficient throughput, with the remaining cores spent in the CSLM (ConcurrentSkipListMap) backing the MemStore, KeyValue comparators, queues, etc.
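
For a concrete feel for where those write-path cycles go, here is a minimal, self-contained sketch (my own illustration, not HBase code; the byte[] comparator stands in for HBase's KeyValue comparator) of concurrent writers inserting into a ConcurrentSkipListMap the way the MemStore does:

    import java.nio.charset.StandardCharsets;
    import java.util.concurrent.ConcurrentSkipListMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class CslmStress {
        // Unsigned lexicographic byte[] order, the ordering HBase uses
        // for row keys.
        static int compare(byte[] a, byte[] b) {
            int n = Math.min(a.length, b.length);
            for (int i = 0; i < n; i++) {
                int d = (a[i] & 0xff) - (b[i] & 0xff);
                if (d != 0) return d;
            }
            return a.length - b.length;
        }

        public static void main(String[] args) throws InterruptedException {
            ConcurrentSkipListMap<byte[], byte[]> memstore =
                    new ConcurrentSkipListMap<>(CslmStress::compare);
            ExecutorService pool = Executors.newFixedThreadPool(8);
            byte[] value = new byte[64];
            for (int t = 0; t < 8; t++) {
                final int writer = t;
                pool.execute(() -> {
                    for (int i = 0; i < 1_000_000; i++) {
                        byte[] key = String.format("%02d-%08d", writer, i)
                                .getBytes(StandardCharsets.UTF_8);
                        // Each put walks the skip list, invoking the
                        // comparator O(log n) times; under concurrency
                        // this is a large share of write-path CPU.
                        memstore.put(key, value);
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.MINUTES);
            System.out.println("entries: " + memstore.size());
        }
    }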

> -----Original Message-----
> From: Jason Rutherglen [mailto:[EMAIL PROTECTED]]
> Sent: Sunday, July 10, 2011 3:05 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Converting byte[] to ByteBuffer
>
> Ted,
>
> Interesting.  I think we need to take a deeper look at why essentially turning
> off the caching of uncompressed blocks doesn't [seem to] matter.  My guess
> is that it's cheaper to decompress blocks on the fly than to starve the system
> IO cache by holding uncompressed blocks on the JVM heap.
>
> I.e., CPU is cheaper than disk IO.
>
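> A toy sketch of the trade-off (hypothetical block size, with GZIP standing
> in for whatever codec the table actually uses): decompressing a cached
> compressed block on every read, instead of pinning the uncompressed bytes
> on the heap:
>
>     import java.io.ByteArrayInputStream;
>     import java.io.ByteArrayOutputStream;
>     import java.util.zip.GZIPInputStream;
>     import java.util.zip.GZIPOutputStream;
>
>     public class DecompressOnTheFly {
>         public static void main(String[] args) throws Exception {
>             byte[] block = new byte[64 * 1024];  // a 64 KB "HFile block"
>             byte[] compressed = gzip(block);
>
>             // Option A: cache `compressed` -- small heap, burn CPU on
>             // every read.  Option B: cache the uncompressed 64 KB --
>             // bigger heap, more GC, fewer OS page-cache pages left.
>             long t0 = System.nanoTime();
>             byte[] back = gunzip(compressed);
>             System.out.printf("decompressed %d bytes in %.2f ms%n",
>                     back.length, (System.nanoTime() - t0) / 1e6);
>         }
>
>         static byte[] gzip(byte[] data) throws Exception {
>             ByteArrayOutputStream bos = new ByteArrayOutputStream();
>             try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
>                 gz.write(data);
>             }
>             return bos.toByteArray();
>         }
>
>         static byte[] gunzip(byte[] data) throws Exception {
>             ByteArrayOutputStream bos = new ByteArrayOutputStream();
>             try (GZIPInputStream gz =
>                     new GZIPInputStream(new ByteArrayInputStream(data))) {
>                 byte[] buf = new byte[8192];
>                 for (int n; (n = gz.read(buf)) > 0; ) bos.write(buf, 0, n);
>             }
>             return bos.toByteArray();
>         }
>     }
>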
> Further (I asked this previously): where does the CPU time in HBase generally
> go?  Binary search on keys for seeking, skip list reads and writes, and [maybe]
> MapReduce jobs?  The rest should more or less be in the noise (or is general
> Java overhead).
>
> I'd be curious to know the avg CPU consumption of an active HBase system.
>
> On Sat, Jul 9, 2011 at 11:14 PM, Ted Dunning <[EMAIL PROTECTED]> wrote:
> > No.  The JNI is below the HDFS-compatible API.  Thus the changed code
> > is in the hadoop.jar and associated jars and .so's that MapR supplies.
> >
> > The JNI still runs in the HBase memory image, though, so it can make
> > data available faster.
> >
> > The cache involved includes the cache of disk blocks (not HBase
> > memcache blocks) in the JNI and in the filer sub-system.
> >
> > The detailed reasons why more caching in the file system and less in
> > HBase makes the overall system faster are not completely worked out,
> > but the general outlines are pretty clear.  There are likely several
> > factors at work in any case, including less GC cost due to a smaller
> > memory footprint, caching compressed blocks instead of Java
> > structures, and simplification due to a clean memory hand-off with
> > associated strong demarcation of where different memory allocators
> > have jurisdiction.
> >
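> > As a sketch of the "caching compressed blocks instead of Java structures"
> > point (hypothetical code, not MapR's or HBase's): an LRU that holds
> > compressed block bytes keeps the heap and GC pressure far smaller than
> > one holding decoded structures:
> >
> >     import java.util.LinkedHashMap;
> >     import java.util.Map;
> >
> >     // Hypothetical: an LRU of *compressed* block bytes keyed by block
> >     // id.  Compressed byte[] entries keep the heap small; decoding
> >     // happens only on a hit.  (Wrap with Collections.synchronizedMap
> >     // for concurrent use.)
> >     class CompressedBlockCache extends LinkedHashMap<Long, byte[]> {
> >         private final int maxBlocks;
> >
> >         CompressedBlockCache(int maxBlocks) {
> >             super(16, 0.75f, true);  // access-order => LRU eviction
> >             this.maxBlocks = maxBlocks;
> >         }
> >
> >         @Override
> >         protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
> >             return size() > maxBlocks;
> >         }
> >     }
> >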
> > On Sat, Jul 9, 2011 at 3:48 PM, Jason Rutherglen <[EMAIL PROTECTED]> wrote:
> >
> >> I'm a little confused; I was told none of the HBase code changed with
> >> MapR.  If the HBase (not the OS) block cache has a JNI implementation,
> >> then that part of the HBase code changed.
> >> On Jul 9, 2011 11:19 AM, "Ted Dunning" <[EMAIL PROTECTED]> wrote:
> >> > MapR does help with the GC because it *does* have a JNI interface
> >> > into an external block cache.
> >> >
> >> > Typical configurations with MapR trim HBase down to the minimal
> >> > viable size and increase the file system cache correspondingly.
> >> >
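> >> > Concretely (a sketch only; the right values depend entirely on the
> >> > workload), the HBase side of that trimming is the heap size in
> >> > hbase-env.sh plus the block-cache fraction in hbase-site.xml:
> >> >
> >> >     # hbase-env.sh: shrink the region server heap (value in MB)
> >> >     export HBASE_HEAPSIZE=4000
> >> >
> >> >     <!-- hbase-site.xml: give the on-heap block cache a small
> >> >          fraction of that heap -->
> >> >     <property>
> >> >       <name>hfile.block.cache.size</name>
> >> >       <value>0.1</value>
> >> >     </property>
> >> >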
> >> > On Fri, Jul 8, 2011 at 7:52 PM, Jason Rutherglen <[EMAIL PROTECTED]> wrote:
> >> >
> >> >> MapR doesn't help with the GC issues.  If MapR had a JNI interface
> >> >> into an external block cache, then that'd be a different story. :)
> >> >> And I'm sure it's quite doable.
> >> >>
> >>
> >