HBase, mail # dev - prefix compression implementation

Re: prefix compression implementation
Matt Corgan 2011-09-17, 02:29
Ryan - thanks for the feedback.  The situation I'm thinking of, where it's
useful to parse a DirectBB without copying it to the heap, is when you are
serving small random values out of the block cache.  At HotPads, we'd like to
store hundreds of GB of real estate listing data in memory so it can be quickly
served up at random.  We want to access many small values that are already
in memory, so we are basically skipping step 1 of 3.  That being said, the
DirectBBs are not essential for us since we haven't run into GC problems; I
just figured it would be nice to support them since they seem to be important
to other people.
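
To make the "small random value out of a cached block" case concrete, here
is a minimal sketch using only java.nio.  The class name, method name, block
size, and offsets are made up for illustration; this is not HBase code.  It
copies just the requested bytes out of a direct buffer and leaves the rest of
the block off-heap:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DirectBlockRead {
    // Copies only `length` bytes starting at `offset` out of the block.
    // duplicate() shares the content but has its own position and limit,
    // so the cached buffer itself is never mutated by readers.
    public static byte[] readValue(ByteBuffer block, int offset, int length) {
        ByteBuffer view = block.duplicate();
        view.position(offset);
        view.limit(offset + length);
        byte[] out = new byte[length];
        view.get(out); // bulk copy of just the requested slice
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical 256KB cached block held off-heap.
        ByteBuffer block = ByteBuffer.allocateDirect(256 * 1024);
        byte[] value = "listing-42".getBytes(StandardCharsets.UTF_8);
        int offset = 100_000; // arbitrary position in the middle of the block
        for (int i = 0; i < value.length; i++) {
            block.put(offset + i, value[i]);
        }
        byte[] got = readValue(block, offset, value.length);
        System.out.println(new String(got, StandardCharsets.UTF_8));
    }
}
```

Only the small result array lands on the heap; whether that beats copying the
whole block depends on how many values are read per block.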

My motivation for doing this is to make HBase a viable candidate for a
large, auto-partitioned, sorted, *in-memory* database.  Not the usual
analytics use case, but I think HBase would be great for this.
On Fri, Sep 16, 2011 at 7:08 PM, Ryan Rawson <[EMAIL PROTECTED]> wrote:

> On Fri, Sep 16, 2011 at 6:47 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
> > I'm a little confused over the direction of the DBBs in general, hence the
> > lack of clarity in my code.
> >
> > I see value in doing fine-grained parsing of the DBB if you're going to have
> > a large block of data and only want to retrieve a small KV from the middle
> > of it.  With this trie design, you can navigate your way through the DBB
> > while copying hardly anything to the heap.  It would be a shame to blow away
> > your entire L1 cache by loading a whole 256KB block onto the heap if you only
> > want to read 200 bytes out of the middle... it can be done
> > ultra-efficiently.
> This paragraph is not factually correct.  The DirectByteBuffer vs main
> heap has nothing to do with the CPU cache.  Consider the following
> scenario:
> - read block from DFS
> - scan block in ram
> - prepare result set for client
> Pretty simple, we have a choice in step 1:
> - write to java heap
> - write to DirectByteBuffer off-heap controlled memory
> In either case, you are copying to memory, and therefore cycling through
> the CPU cache (of course).  The difference is whether the Java GC has
> to deal with the aftermath or not.
> So the question "DBB or not" is not one about CPU caches, but one
> about garbage collection.  Of course, nothing is free, and dealing
> with DBB requires extensive in-situ bounds checking (look at the
> source code for that class!), and also requires manual memory
> management on the part of the programmer.  So you are faced with an
> expensive API (getByte is not as good as an array get), and a lot more
> homework to do.  I have decided it's not worth it personally and
> am not chasing that line as a potential performance improvement, and I
> would encourage you not to either.
> Ultimately the DFS speed issues need to be solved by the DFS - HDFS
> needs more work, but alternatives are already there and are a lot
> faster.
> >
> > The problem is if you're going to iterate through an entire block made of
> > 5000 small KVs doing thousands of DBB.get(index) calls.  Those are like 10x
> > slower than byte[index] calls.  In that case, if it's a DBB, you want to
> > copy the full block on-heap and access it through the byte[] interface.  If
> > it's a HeapBB, then you already have access to the underlying byte[].
> Yes, this is the issue - you have to take an extra copy one way or
> another.  Doing effective prefix compression with DBB is not really
> feasible IMO, and that's another reason why I have given up on DBBs.
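
The two access patterns discussed above can be sketched roughly as follows.
This is an illustration with hypothetical class and method names, not HBase
code, and it makes no claim about the actual speed ratio; the "10x" figure is
the estimate quoted in the thread, not something this sketch measures.

```java
import java.nio.ByteBuffer;

public class BlockScan {
    // Per-byte access through the ByteBuffer API: one bounds-checked
    // get(index) call for every byte scanned.
    public static long sumViaGet(ByteBuffer block) {
        long sum = 0;
        for (int i = 0; i < block.limit(); i++) {
            sum += block.get(i) & 0xFF;
        }
        return sum;
    }

    // Scan through a byte[]: reuse the backing array when the buffer exposes
    // one (a writable heap buffer), otherwise pay for one bulk copy on-heap.
    public static long sumViaArray(ByteBuffer block) {
        byte[] bytes;
        int start;
        if (block.hasArray()) {
            bytes = block.array();
            start = block.arrayOffset();
        } else {
            bytes = new byte[block.limit()];
            ByteBuffer view = block.duplicate(); // leave caller's position alone
            view.position(0);
            view.get(bytes);                     // the extra copy
            start = 0;
        }
        long sum = 0;
        int end = start + block.limit();
        for (int i = start; i < end; i++) {
            sum += bytes[i] & 0xFF;
        }
        return sum;
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(4);
        direct.put(0, (byte) 1).put(1, (byte) 2).put(2, (byte) 3).put(3, (byte) 4);
        System.out.println(sumViaGet(direct) + " " + sumViaArray(direct));
    }
}
```

Both methods compute the same result; the difference is only where the bytes
are read from, which is the trade-off being argued about here.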
> >
> > So there's possibly value in implementing both methods.  The main problem I
> > see is a lack of interfaces in the current code base.  I'll throw one
> > suggestion out there as food for thought.  Create a new interface:
> >
> > interface HCell{
> >  byte[] getRow();
> >  byte[] getFamily();
> >  byte[] getQualifier();
> >  long getTimestamp();
> >  byte getType();
> >  byte[] getValue();
> >
> >  //plus an endless list of convenience methods:
> >  int getKeyLength();