Avro, mail # user - Avro speed comparison with raw logs


Re: Avro speed comparison with raw logs
Tatu Saloranta 2011-03-31, 17:08
On Wed, Mar 30, 2011 at 6:51 PM, Scott Carey <[EMAIL PROTECTED]> wrote:
> gzip/deflate is approximately the same speed to decompress for all
> compression levels.
> However, for compression, it varies by a factor of 5 or so between the
> fastest (1) and slowest (9).
>
> This is a useful link for gzip performance characteristics:
> http://tukaani.org/lzma/benchmarks.html

Also, a new project that compares performance & efficiency
(time/space) of JVM-accessible compression codecs is at:

https://github.com/ning/jvm-compressor-benchmark

Although the default setup does not yet compare deflate levels, it
would be easy to modify to do so. It currently includes 2 deflate
codecs, bzip2, quicklz, lzf and snappy (via JNI).
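
For a quick look at how deflate levels differ before touching the
benchmark itself, a minimal sketch using only java.util.zip could look
like the following. The class name DeflateLevelCheck, the synthetic
log lines and the single-pass timing are assumptions made for
illustration and are not part of jvm-compressor-benchmark. A single
un-warmed pass gives only a rough indication, not a real measurement.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of any existing benchmark project.
public class DeflateLevelCheck {

    // Build repetitive, log-like input (made-up format) so deflate has something to compress.
    public static byte[] sampleData(int lines) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines; i++) {
            sb.append("2011-03-31 17:08:").append(i % 60)
              .append(" INFO request id=").append(i)
              .append(" status=200 path=/index.html\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Compress the whole input at the given deflate level (1 = fastest, 9 = best compression).
    public static byte[] compress(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompress into a buffer of the known original length.
    public static byte[] decompress(byte[] compressed, int originalLength) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] out = new byte[originalLength];
        int off = 0;
        try {
            while (off < out.length && !inflater.finished()) {
                int n = inflater.inflate(out, off, out.length - off);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unsupported input; stop instead of spinning
                }
                off += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt deflate stream", e);
        } finally {
            inflater.end();
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = sampleData(200_000);
        for (int level : new int[] {1, 6, 9}) {
            long t0 = System.nanoTime();
            byte[] packed = compress(data, level);
            long t1 = System.nanoTime();
            decompress(packed, data.length);
            long t2 = System.nanoTime();
            System.out.printf("level %d: %d -> %d bytes, compress %d ms, decompress %d ms%n",
                    level, data.length, packed.length,
                    (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
        }
    }
}
```

Running main a few times on real log data rather than the synthetic
lines should give a better sense of the trade-off between levels; for
trustworthy numbers a proper harness with warm-up is still needed.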

-+ Tatu +-

ps. It would be really nice to have benchmarks for "big data" use
cases for codecs -- jvm-serialization-benchmark for example just deals
with individual small messages. But there are multiple applicable data
formats, with very little good detailed comparative performance
benchmarking. :-/