

Re: Documenting Guidance on compression and codecs
Yeah, let's get some info into the book. At the very least, it should list
the available compression options.

Enis
On Wed, Sep 18, 2013 at 5:19 PM, Nick Dimiduk <[EMAIL PROTECTED]> wrote:

> For completeness, here's an entry for LZ4:
>
> +--------------------+--------------+
> | compression:LZ4    |    391017061 |
> +--------------------+--------------+
>
>
>
> On Wed, Sep 11, 2013 at 12:10 PM, Nick Dimiduk <[EMAIL PROTECTED]> wrote:
>
> > Do we have a consolidated resource with information and recommendations
> > about use of the above? For instance, I ran a simple test using
> > PerformanceEvaluation, examining just the size of data on disk for 1G of
> > input data. The matrix below has some surprising results:
> >
> > +--------------------+--------------+
> > | MODIFIER           | SIZE (bytes) |
> > +--------------------+--------------+
> > | none               |   1108553612 |
> > +--------------------+--------------+
> > | compression:SNAPPY |    427335534 |
> > +--------------------+--------------+
> > | compression:LZO    |    270422088 |
> > +--------------------+--------------+
> > | compression:GZ     |    152899297 |
> > +--------------------+--------------+
> > | codec:PREFIX       |   1993910969 |
> > +--------------------+--------------+
> > | codec:DIFF         |   1960970083 |
> > +--------------------+--------------+
> > | codec:FAST_DIFF    |   1061374722 |
> > +--------------------+--------------+
> > | codec:PREFIX_TREE  |   1066586604 |
> > +--------------------+--------------+
> >
> > Where does a wayward soul look for guidance on which combination of the
> > above to choose for their application?
> >
> > Thanks,
> > Nick
> >
>
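
For easier comparison, here is a minimal sketch that expresses each on-disk size from the tables above as a fraction of the uncompressed ("none") baseline. The byte counts are copied from the thread. The names `SIZES` and `relative_sizes` are illustrative only and are not part of HBase or PerformanceEvaluation.

```python
# On-disk sizes in bytes, copied from the tables in this thread
# (PerformanceEvaluation run with 1G of input data).
SIZES = {
    "none": 1108553612,
    "compression:SNAPPY": 427335534,
    "compression:LZ4": 391017061,
    "compression:LZO": 270422088,
    "compression:GZ": 152899297,
    "codec:PREFIX": 1993910969,
    "codec:DIFF": 1960970083,
    "codec:FAST_DIFF": 1061374722,
    "codec:PREFIX_TREE": 1066586604,
}


def relative_sizes(sizes, baseline="none"):
    """Return each size divided by the baseline size (hypothetical helper)."""
    base = sizes[baseline]
    return {name: size / base for name, size in sizes.items()}


if __name__ == "__main__":
    # Print smallest first; values above 1.00 are larger than the baseline.
    ratios = relative_sizes(SIZES)
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
        print(f"{name:<20} {ratio:5.2f}x of baseline")
```

Sorted this way, the general-purpose compressors all land below the baseline, while codec:PREFIX and codec:DIFF land above it, which is the surprising part of the matrix.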