HBase, mail # dev - Converting byte[] to ByteBuffer


RE: Converting byte[] to ByteBuffer
Jonathan Gray 2011-07-10, 07:59
There are plenty of arguments in both directions for caching above the DB, in the DB, or under the DB/in the FS.  I have significant interest in supporting large heaps and reducing GC issues within the HBase RegionServer and I am already running with local fs reads.  I don't think a faster dfs makes HBase caching irrelevant or the conversation a non-starter.

To get back to the original question, I ended up trying this once.  I wrote a rough implementation of a slab allocator a few months ago to dive in and see what it would take.  The big challenge is KeyValue and its various comparators.  The ByteBuffer API can be maddening at times but it can be done.  I ended up somewhere slightly more generic, where KeyValue was taking a ByteBlock which contained ref counting and a reference to the allocator it came from, in addition to a ByteBuffer.
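A minimal sketch of the ByteBlock idea described above: a ByteBuffer paired with a reference count and a back-pointer to the allocator that owns it, so the slab slice can be returned when the last reader is done. The names (`ByteBlock`, `SlabAllocator`) and the interface shape are illustrative assumptions, not actual HBase APIs:

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical allocator interface; a real slab allocator would carve
// fixed-size slices out of large pre-allocated direct buffers.
interface SlabAllocator {
    ByteBlock allocate(int size);
    void free(ByteBlock block);
}

// A ByteBuffer plus ref counting and a reference to its allocator.
final class ByteBlock {
    private final ByteBuffer buf;
    private final SlabAllocator owner;
    private final AtomicInteger refCount = new AtomicInteger(1);

    ByteBlock(ByteBuffer buf, SlabAllocator owner) {
        this.buf = buf;
        this.owner = owner;
    }

    // Hand out a duplicate so each reader gets its own position/limit.
    ByteBuffer buffer() { return buf.duplicate(); }

    void retain() { refCount.incrementAndGet(); }

    void release() {
        if (refCount.decrementAndGet() == 0) {
            owner.free(this);   // last reference gone: return slice to the slab
        }
    }
}
```

The point of the back-pointer is that a KeyValue holding a ByteBlock never needs to know which slab it came from; release() routes the slice back to the right allocator.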

The easy way to rely on DirectByteBuffers and the like would be to make a copy on read into a normal byte[], and then there's no need to worry about ref counting or revamping KV.  Of course, that comes at the cost of short-term allocations.  In my experience, you can tune the GC around this and the cost really becomes CPU.
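The copy-on-read approach can be sketched as follows: the data lives off-heap in a direct ByteBuffer, and each read copies the requested range into a fresh, short-lived byte[], so callers never hold a reference into the slab and no ref counting is needed. The class name and fixed slab size here are assumptions for illustration only:

```java
import java.nio.ByteBuffer;

// Hypothetical off-heap store illustrating copy-on-read: reads never
// expose the underlying direct buffer, only heap copies of its contents.
final class CopyOnReadCache {
    private final ByteBuffer slab = ByteBuffer.allocateDirect(1024 * 1024);

    void put(int offset, byte[] data) {
        ByteBuffer dup = slab.duplicate();  // private position/limit
        dup.position(offset);
        dup.put(data);
    }

    byte[] get(int offset, int length) {
        byte[] copy = new byte[length];     // short-lived young-gen allocation
        ByteBuffer dup = slab.duplicate();
        dup.position(offset);
        dup.get(copy);
        return copy;                        // caller owns this array outright
    }
}
```

The short-lived copies are what the GC tuning mentioned above has to absorb: they die young, so the cost shifts from collection pauses to the CPU spent copying.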

I'm in the process of re-implementing some of this stuff on top of the HFile v2 that is coming soon.  Once that goes in, this gets much easier at the HFile and block cache level (a new wrapper around ByteBuffer called HFileBlock which can be used for ref counting and such, instead of introducing huge changes for caching stuff).

JG

 
> -----Original Message-----
> From: Ted Dunning [mailto:[EMAIL PROTECTED]]
> Sent: Saturday, July 09, 2011 11:14 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Converting byte[] to ByteBuffer
>
> No.  The JNI is below the HDFS compatible API.  Thus the changed code is in
> the hadoop.jar and associated jars and .so's that MapR supplies.
>
> The JNI still runs in the HBase memory image, though, so it can make data
> available faster.
>
> The cache involved includes the cache of disk blocks (not HBase memcache
> blocks) in the JNI and in the filer sub-system.
>
> The detailed reasons why more caching in the file system and less in HBase
> makes the overall system faster are not completely worked out, but the
> general outlines are pretty clear.  There are likely several factors at work in
> any case including less GC cost due to smaller memory footprint, caching
> compressed blocks instead of Java structures and simplification due to a
> clean memory hand-off with associated strong demarcation of where
> different memory allocators have jurisdiction.
>
> On Sat, Jul 9, 2011 at 3:48 PM, Jason Rutherglen
> <[EMAIL PROTECTED]
> > wrote:
>
> > I'm a little confused, I was told none of the HBase code changed with
> > MapR, if the HBase (not the OS) block cache has a JNI implementation
> > then that part of the HBase code changed.
> > On Jul 9, 2011 11:19 AM, "Ted Dunning" <[EMAIL PROTECTED]> wrote:
> > > MapR does help with the GC because it *does* have a JNI interface
> > > into an external block cache.
> > >
> > > Typical configurations with MapR trim HBase down to the minimal
> > > viable
> > size
> > > and increase the file system cache correspondingly.
> > >
> > > On Fri, Jul 8, 2011 at 7:52 PM, Jason Rutherglen <
> > [EMAIL PROTECTED]
> > >> wrote:
> > >
> > >> MapR doesn't help with the GC issues. If MapR had a JNI interface
> > >> into an external block cache then that'd be a different story. :)
> > >> And I'm sure it's quite doable.
> > >>
> >