

Re: CompressionCodec in MapReduce
Append your custom codec's fully qualified class name to "io.compression.codecs", either
in mapred-site.xml or in the Configuration object passed to the Job constructor.
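For example, registering a custom codec alongside the built-in ones could look like the following mapred-site.xml fragment (the class name com.example.MyCodec is a hypothetical placeholder for your codec):

```xml
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,com.example.MyCodec</value>
</property>
```

The same value can be set programmatically with conf.set("io.compression.codecs", "...") on the Configuration before constructing the Job.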

The MapReduce framework will try to guess the compression algorithm from the
input file's suffix.

If the CompressionCodec.getDefaultExtension() of any codec registered in the
configuration matches the suffix, Hadoop will instantiate that codec and, if
instantiation succeeds, decompress the input for you automatically.

The default value for "io.compression.codecs" is
"org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec".
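The suffix-matching step can be sketched in plain Java (this mirrors what Hadoop's CompressionCodecFactory does internally; the map below hard-codes the default codecs' extensions rather than reading a real Configuration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CodecLookup {
    // Default extension -> codec class, as each codec's
    // getDefaultExtension() would report for the default codec list.
    static final Map<String, String> CODECS = new LinkedHashMap<>();
    static {
        CODECS.put(".deflate", "org.apache.hadoop.io.compress.DefaultCodec");
        CODECS.put(".gz", "org.apache.hadoop.io.compress.GzipCodec");
        CODECS.put(".bz2", "org.apache.hadoop.io.compress.BZip2Codec");
    }

    // Return the codec class whose default extension matches the
    // path's suffix, or null if no registered codec matches.
    static String codecFor(String path) {
        for (Map.Entry<String, String> e : CODECS.entrySet()) {
            if (path.endsWith(e.getKey())) {
                return e.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(codecFor("/data/input.gz"));  // GzipCodec matches
        System.out.println(codecFor("/data/input.txt")); // no match -> null
    }
}
```

A file with no matching suffix is simply read uncompressed, which is why naming your input files with your codec's default extension is what triggers transparent decompression.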

On Wed, Apr 11, 2012 at 3:55 PM, Grzegorz Gunia
<[EMAIL PROTECTED]> wrote:

> Hello,
> I am trying to apply a custom CompressionCodec to MapReduce
> jobs, but I haven't found a way to inject it during the reading of input
> data, or during the writing of the job results.
> Am I missing something, or is there no support for compressed files in the
> filesystem?
>
> I am well aware of how to set it up to work during the intermediate phases
> of the MapReduce operation, but I just can't find a way to apply it BEFORE
> the job takes place...
> Is there any other way except simply uncompressing the files I need prior
> to scheduling a job?
>
> Huge thanks for any help you can give me!
> --
> Greg
>