HBase >> mail # dev >> Documenting Guidance on compression and codecs


Documenting Guidance on compression and codecs
Do we have a consolidated resource with information and recommendations
about the use of compression and codecs? For instance, I ran a simple test
using PerformanceEvaluation, examining just the size of data on disk for
1 GB of input data. The matrix below has some surprising results:

+--------------------+--------------+
| MODIFIER           | SIZE (bytes) |
+--------------------+--------------+
| none               |   1108553612 |
+--------------------+--------------+
| compression:SNAPPY |    427335534 |
+--------------------+--------------+
| compression:LZO    |    270422088 |
+--------------------+--------------+
| compression:GZ     |    152899297 |
+--------------------+--------------+
| codec:PREFIX       |   1993910969 |
+--------------------+--------------+
| codec:DIFF         |   1960970083 |
+--------------------+--------------+
| codec:FAST_DIFF    |   1061374722 |
+--------------------+--------------+
| codec:PREFIX_TREE  |   1066586604 |
+--------------------+--------------+
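
To make the comparison easier to read, the sizes can be expressed relative to the unmodified baseline. This is a minimal sketch that uses only the numbers from the table above; the helper name `relative_sizes` is made up for illustration and is not part of PerformanceEvaluation or HBase.

```python
# Sizes in bytes, copied verbatim from the table above.
sizes = {
    "none": 1108553612,
    "compression:SNAPPY": 427335534,
    "compression:LZO": 270422088,
    "compression:GZ": 152899297,
    "codec:PREFIX": 1993910969,
    "codec:DIFF": 1960970083,
    "codec:FAST_DIFF": 1061374722,
    "codec:PREFIX_TREE": 1066586604,
}


def relative_sizes(sizes, baseline="none"):
    """Return each on-disk size as a fraction of the baseline size."""
    base = sizes[baseline]
    return {name: size / base for name, size in sizes.items()}


if __name__ == "__main__":
    # Print modifiers from smallest to largest footprint.
    ratios = relative_sizes(sizes)
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
        print(f"{name:<20} {ratio:5.2f}x of baseline")
```

Read this way, GZ lands at roughly 0.14x of the baseline, while PREFIX and DIFF both come out near 1.8x, which is the surprising part.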

Where does a wayward soul look for guidance on which combination of the
above to choose for their application?

Thanks,
Nick
Ted Yu 2013-09-11, 20:19
lars hofhansl 2013-09-11, 20:30
Stack 2013-09-11, 21:29
Elliott Clark 2013-09-11, 20:22
Vladimir Rodionov 2013-09-11, 20:33
Nick Dimiduk 2013-09-19, 00:19
lars hofhansl 2013-09-19, 03:34
Ted Yu 2013-09-24, 20:11
Enis Söztutar 2013-09-26, 03:09
Enis Söztutar 2013-09-19, 00:30