MapReduce, mail # user - Bzip2 vs Gzip


Bzip2 vs Gzip
Amit Sela 2013-09-17, 20:37
Hi all,
I'm using Hadoop 1.0.4 and using gzip to store the logs that Hadoop processes
(the logs are gzipped into block-sized files).
I read that bzip2 is splittable. Is that true in Hadoop 1.0.4? Does that mean
that any input file bigger than the block size will be split between maps?
What are the tradeoffs between the two?

Thanks.