MapReduce >> mail # user >> Bzip2 vs Gzip

Bzip2 vs Gzip
Hi all,
I'm using Hadoop 1.0.4 and using gzip to store the logs that Hadoop processes
(the logs are gzipped into block-sized files).
I read that bzip2 is splittable. Is that the case in Hadoop 1.0.4? Does that mean
that any input file bigger than the block size will be split between maps?
What are the tradeoffs between the two?