Hadoop >> mail # user >> Re: Bzip2 vs Gzip


Yes, bzip2 is splittable.
As for the tradeoffs, I have not experimented much with codecs.
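
For a rough feel of the size/speed tradeoff, here is a minimal sketch that uses only the Python standard library (`gzip` and `bz2`), not Hadoop's codecs. The `compare_codecs` helper and the log-like `sample` data are made up for illustration, and the numbers depend heavily on the data and the machine, so treat the output as indicative only.

```python
import bz2
import gzip
import time


def compare_codecs(data: bytes) -> dict:
    """Compress `data` with gzip and bzip2; report size, ratio and time."""
    results = {}
    codecs = (
        ("gzip", gzip.compress, gzip.decompress),
        ("bzip2", bz2.compress, bz2.decompress),
    )
    for name, compress, decompress in codecs:
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        # Sanity check: the round trip must return the original bytes.
        assert decompress(packed) == data
        results[name] = {
            "bytes": len(packed),
            "ratio": len(packed) / len(data),
            "seconds": elapsed,
        }
    return results


# Hypothetical log-like sample; real logs will behave differently.
sample = b"".join(
    b"2013-09-18 02:07:%02d INFO request id=%d status=200\n" % (i % 60, i)
    for i in range(50000)
)

if __name__ == "__main__":
    for name, stats in compare_codecs(sample).items():
        print(f"{name}: {stats['bytes']} bytes, "
              f"ratio {stats['ratio']:.3f}, {stats['seconds']:.3f} s")
```

This only measures compression ratio and CPU time. It says nothing about splitting: a gzip file is always read by a single map, whatever its size.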

Thanks,
Rahul
On Wed, Sep 18, 2013 at 2:07 AM, Amit Sela <[EMAIL PROTECTED]> wrote:

> Hi all,
> I'm using Hadoop 1.0.4 and using gzip to keep the logs processed by Hadoop
> (logs are gzipped into block-size files).
> I read that bzip2 is splittable. Is that so in Hadoop 1.0.4? Does that mean
> that any input file bigger than the block size will be split between maps?
> What are the tradeoffs between the two?
>
> Thanks.
>