HBase >> mail # dev >> prefix compression


RE: prefix compression
I'm here!

Still parsing through the stuff in this e-mail, but I agree that many approaches will require rejiggering KeyValue in a significant way.  I have made some attempts at this, but nothing is very far along.  It's definitely something that we are hoping to put some resources into over the summer.

JG

> -----Original Message-----
> From: Todd Lipcon [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, June 02, 2011 3:37 PM
> To: [EMAIL PROTECTED]
> Subject: Re: prefix compression
>
> Hey Matt,
>
> Interesting email, and also something I've been thinking about recently.
>
> Unfortunately, I think one of the big prerequisites before we can start
> thinking about actual compression algorithms is some refactoring around
> how KeyValue is used. Currently, KeyValue exposes byte arrays in a lot of
> places
> - these byte arrays are expected to have discrete sub-ranges that represent
> row key, column qualifier, etc. Various parts of the code directly grab these
> byte arrays and offsets and do stuff with them (eg comparison, serialization,
> etc).
>
> In order to split the different components up, or do true prefix compression,
> we'd need to be able to have KeyValue do some smarter things for
> comparison, and never call functions like "getBuffer(), getRowKeyOffset(),
> getRowKeyLength()" expecting that to show us a byte array range with a row
> key in it.
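The refactoring Todd describes can be illustrated with a toy sketch (interface and class names here are hypothetical, not HBase's actual API): callers program against a comparison method on the key instead of grabbing a raw buffer plus offsets, so a prefix-compressed key could implement the same interface without ever materializing a contiguous byte range.

```java
import java.util.Arrays;

// Hypothetical sketch: instead of callers doing
//   byte[] buf = kv.getBuffer(); Bytes.compareTo(buf, rowOff, rowLen, ...)
// the key exposes comparison itself, so an encoded representation can
// answer without exposing a flat byte-array range.
public class KeyAbstractionSketch {
    interface RowComparable {
        // Compare this key's row component to a plain row key.
        int compareRowTo(byte[] row);
    }

    // A flat implementation backed by one buffer with offsets,
    // roughly how KeyValue stores its components today.
    static class FlatKey implements RowComparable {
        private final byte[] buf;
        private final int rowOffset, rowLength;

        FlatKey(byte[] buf, int rowOffset, int rowLength) {
            this.buf = buf;
            this.rowOffset = rowOffset;
            this.rowLength = rowLength;
        }

        public int compareRowTo(byte[] row) {
            // Lexicographic compare of the row sub-range against `row`.
            return Arrays.compare(buf, rowOffset, rowOffset + rowLength,
                                  row, 0, row.length);
        }
    }

    public static void main(String[] args) {
        byte[] block = "xxrow1yy".getBytes();          // row bytes are "row1"
        FlatKey k = new FlatKey(block, 2, 4);
        System.out.println(k.compareRowTo("row1".getBytes()) == 0);  // true
        System.out.println(k.compareRowTo("row2".getBytes()) < 0);   // true
    }
}
```

A trie-backed or delta-encoded key class could then implement the same interface, and comparison-heavy code paths would never need to know the difference.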
>
> I think Jonathan Gray has some code hacked together for this, but last I
> heard it wasn't in shippable state... Jonathan, you out there?
>
> As for the particular compression algorithms, a nice first step would be a very
> simplistic one: the first row key of any HFile block would be given in full.
> Following that, each key uses a single-byte token that identifies the
> number of key components shared with the previous key (ie same row,
> same col, different ts). This isn't quite as good as true prefix compression,
> but for common cases where each cell has three versions, and each row has
> many columns, it's probably a huge savings.
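Todd's token-byte scheme can be sketched in a few lines of Java (the key layout and encoding below are illustrative only, not HFile's actual on-disk format): the first key is written in full, and each later key carries a token counting how many leading components it shares with its predecessor, followed by only the components that differ.

```java
import java.util.ArrayList;
import java.util.List;

public class PrefixTokenSketch {
    // A key as (row, column, timestamp); format is illustrative, not HFile's.
    record Key(String row, String col, long ts) {}

    // Encode: first key in full; each later entry starts with a token
    // (0 = nothing shared, 1 = same row, 2 = same row+col), then only the
    // unshared components. The timestamp is always written.
    static List<Object[]> encode(List<Key> keys) {
        List<Object[]> out = new ArrayList<>();
        Key prev = null;
        for (Key k : keys) {
            int shared = 0;
            if (prev != null && k.row().equals(prev.row())) {
                shared = k.col().equals(prev.col()) ? 2 : 1;
            }
            out.add(switch (shared) {
                case 2 -> new Object[]{2, k.ts()};
                case 1 -> new Object[]{1, k.col(), k.ts()};
                default -> new Object[]{0, k.row(), k.col(), k.ts()};
            });
            prev = k;
        }
        return out;
    }

    // Decode by carrying forward the components the token says are shared.
    static List<Key> decode(List<Object[]> enc) {
        List<Key> out = new ArrayList<>();
        String row = null, col = null;
        for (Object[] e : enc) {
            int shared = (Integer) e[0];
            if (shared == 0) { row = (String) e[1]; col = (String) e[2]; }
            else if (shared == 1) { col = (String) e[1]; }
            long ts = (Long) e[e.length - 1];
            out.add(new Key(row, col, ts));
        }
        return out;
    }

    public static void main(String[] args) {
        // Three versions per cell, several columns per row: the common case
        // Todd mentions, where most entries collapse to {2, ts}.
        List<Key> keys = List.of(
            new Key("r1", "c1", 3), new Key("r1", "c1", 2), new Key("r1", "c1", 1),
            new Key("r1", "c2", 3), new Key("r2", "c1", 3));
        System.out.println(decode(encode(keys)).equals(keys)); // true: round-trips
    }
}
```

In the three-versions-per-cell case, two out of every three entries shrink to a token plus a timestamp, which is where the bulk of the savings would come from.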
>
> -Todd
>
> On Thu, Jun 2, 2011 at 3:28 PM, Matt Corgan <[EMAIL PROTECTED]>
> wrote:
>
> > I've been reading up on prefix compression for a few weeks trying to
> > figure out the best approach.  Maybe some of you have noticed that 90%
> > of my emails to the user list have something to do with it...  Now
> > that people are working on HFile v2 and the topic of Lucene's FST has
> > come up, I thought it would be a good time to share some thoughts.  My
> > goal here is definitely not asking anyone to implement this, but just
> > to start a discussion that gets the proper jiras in place.  I have
> > spent a little time coding an initial rowKey trie builder that could
> > be the start of a pluggable implementation.
> >
> > This is a big topic to address in one email, but I don't think the
> > pieces make sense on their own.  Here's an outline with some thoughts:
> >
> > * refer to prefix compression as "compaction" to avoid interfering
> > with traditional "compression"
> >
> > * main goal is to reduce size of data in memory to increase effective
> > capacity of block index, block cache, and memstores
> >    * disk access takes thousands of times longer than memory access,
> > especially over hdfs/network
> >
> > * secondary goal is to speed up the processing of in memory data
> >    * memory access takes 200+ cpu cycles, so must use cache friendly
> > algorithms
> >    * structure trie node sizes around typical 64B cache line size
> >    * investigate java's memory prefetching capabilities, possibly
> > increasing to ~256B nodes
> >        * research paper about modern memory databases and prefetching:
> >          http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.68.4872&rep=rep1&type=pdf
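The point about sizing trie nodes around the cache line can be made concrete with a small budget calculation (all sizes below are assumptions for illustration, not a measured or proposed layout): a node that stores some prefix bytes inline plus fixed-width child references must fit header + prefix + fan-out into 64 bytes.

```java
// Sketch of budgeting a trie node to one 64-byte cache line.
// HEADER and CHILD_REF sizes are assumptions, not a real HBase layout.
public class TrieNodeBudget {
    static final int CACHE_LINE = 64;
    static final int HEADER = 4;      // e.g. node type + prefix length + child count
    static final int CHILD_REF = 4;   // e.g. a 4-byte offset into the block

    // Max fan-out for a node that also stores `prefixLen` key bytes inline.
    static int maxFanOut(int prefixLen) {
        return (CACHE_LINE - HEADER - prefixLen) / CHILD_REF;
    }

    public static void main(String[] args) {
        System.out.println(maxFanOut(8));   // 13 children fit with an 8-byte prefix
        System.out.println(maxFanOut(28));  // 8 children fit with a 28-byte prefix
    }
}
```

The same arithmetic scales to the ~256B nodes mentioned above if prefetching makes multi-line nodes cheap: quadrupling the budget roughly quadruples the inline prefix or the fan-out a node can hold.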