Re: Snappy in Mapreduce
Harsh J 2012-02-01, 11:23
Also, if you want finalized outputs in LZO, set
"mapred.output.compression.codec" to that codec. You have it set to
Snappy presently.

On Wed, Feb 1, 2012 at 2:04 PM, Marek Miglinski <[EMAIL PROTECTED]> wrote:
> Hello guys,
>
> I have Cloudera's CDH3U2 package installed on a 3 node cluster and I've added to mapred-site:
> <property>
>   <name>mapred.compress.map.output</name>
>   <value>true</value>
> </property>
>
> <property>
>   <name>mapred.map.output.compression.codec</name>
>   <value>org.apache.hadoop.io.compress.SnappyCodec</value>
> </property>
>
> Also to my pig job properties:
> <property>
>   <name>io.compression.codec.lzo.class</name>
>   <value>com.hadoop.compression.lzo.LzoCodec</value>
> </property>
> <property>
>   <name>pig.tmpfilecompression</name>
>   <value>true</value>
> </property>
> <property>
>   <name>pig.tmpfilecompression.codec</name>
>   <value>lzo</value>
> </property>
> <property>
>   <name>mapred.output.compress</name>
>   <value>true</value>
> </property>
> <property>
>   <name>mapred.output.compression.codec</name>
>   <value>org.apache.hadoop.io.compress.SnappyCodec</value>
> </property>
> <property>
>   <name>mapred.output.compression.type</name>
>   <value>BLOCK</value>
> </property>
> <property>
>   <name>mapred.compress.map.output</name>
>   <value>true</value>
> </property>
> <property>
>   <name>mapred.map.output.compression.codec</name>
>   <value>org.apache.hadoop.io.compress.SnappyCodec</value>
> </property>
> <property>
>   <name>mapreduce.map.output.compress</name>
>   <value>true</value>
> </property>
> <property>
>   <name>mapreduce.map.output.compress.codec</name>
>   <value>org.apache.hadoop.io.compress.SnappyCodec</value>
> </property>
>
> So I want PIG to compress its data with LZO but mapreduce with Snappy, but as I see in the tasktracker details (Map Bytes Out) data is not compressed at all, which reduces performance a lot (IO is 100% most of the time)... What am I doing wrong and how do I fix it?
>
>
> Thanks,
> Marek M.

--
Harsh J
Customer Ops. Engineer
Cloudera | http://tiny.cloudera.com/about
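
For illustration, the change suggested in the reply might look like the fragment below in the Pig job properties. This is only a sketch, not a tested configuration. It assumes the hadoop-lzo library, which provides the com.hadoop.compression.lzo.LzoCodec class named in the original post, is installed on every node, and it leaves the map-output codec settings on Snappy as they were.

```xml
<!-- Sketch only: compress the finalized job output with LZO instead of Snappy. -->
<!-- Assumption: hadoop-lzo (jar and native libraries) is available on all nodes. -->
<property>
  <name>mapred.output.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapred.output.compression.codec</name>
  <!-- Class name taken from the original post's io.compression.codec.lzo.class setting -->
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
```

The map-side properties (mapred.compress.map.output and mapred.map.output.compression.codec) only affect intermediate data, so they can stay on SnappyCodec independently of this output setting.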