Re: v0.20.203: How to compress files in Reducer
Harsh J 2012-04-13, 02:31
If you're using the FileSystem APIs directly, instead of framework-provided
APIs such as MultipleOutputs, the job's output-compression settings do not
apply to your files. You need to wrap the stream in a codec yourself:
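// conf is the job Configuration, e.g. context.getConfiguration() in the reducer.
OutputStream os = fs.create(…); // create() opens for writing; open() is read-only
CompressionCodec codec = ReflectionUtils.newInstance(GzipCodec.class, conf);
// Or another codec. newInstance() sets the conf on the codec, which GzipCodec needs.
// See also the CompressionCodecFactory class for some helpers.
OutputStream cs = codec.createOutputStream(os);
// Now use cs as your output stream object for writes, and close it when done.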
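To see the wrapping pattern in isolation, here is a minimal sketch that runs without a cluster. It is not Hadoop code: java.util.zip.GZIPOutputStream stands in for the stream the codec would return, a ByteArrayOutputStream stands in for the HDFS file stream, and the class and method names (GzipWrapSketch, compress, decompress) are made up for illustration. The point it shows is that writes go to the wrapping stream and that the wrapping stream must be closed so the gzip trailer is written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration class; not part of Hadoop.
final class GzipWrapSketch {

    // Writes text through a gzip stream wrapped around a raw sink.
    static byte[] compress(String text) throws IOException {
        // Stand-in for the raw file stream you would get from the FileSystem.
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        // Stand-in for the codec's compressing stream wrapped around it.
        try (OutputStream cs = new GZIPOutputStream(raw)) {
            cs.write(text.getBytes(StandardCharsets.UTF_8));
        } // Closing cs flushes the compressor and writes the gzip trailer.
        return raw.toByteArray();
    }

    // Reads the compressed bytes back, to check the round trip.
    static String decompress(byte[] packed) throws IOException {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        String original = "key1\tvalue1\nkey2\tvalue2\n";
        byte[] packed = compress(original);
        System.out.println("round trip ok: " + decompress(packed).equals(original));
    }
}
```

In the reducer the same shape applies: write your records to cs rather than os, and close cs in cleanup so the file is not left truncated.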
On Fri, Apr 13, 2012 at 6:14 AM, Piyush Kansal <[EMAIL PROTECTED]> wrote:
> I am creating output files in the reducer using my own file name convention,
> so I am writing the data to the files using the FileSystem APIs. I now want to
> compress these files while writing, so as to write less data and also to
> save space on HDFS.
> I tried the following options, but none of them worked:
> - setting the "mapred.output.compress" to true
> - job.setOutputFormatClass( TextOutputFormat.class);
> TextOutputFormat.setCompressOutput(job, true);
> TextOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
> - I also tried looking into the existing FileSystem and FileUtil APIs, but
> none of them has an API to write a file in compressed format
> Can you please suggest how I can achieve the required goal?
> Piyush Kansal