MapReduce >> mail # user >> reduce output compression of Terasort


Juwei Shi 2012-02-17, 06:37
Bejoy Ks 2012-02-17, 09:18
Juwei Shi 2012-02-17, 09:48
bejoy.hadoop@... 2012-02-17, 10:16
Re: reduce output compression of Terasort
As far as I know, TeraOutputFormat doesn't support compression.

On Fri, Feb 17, 2012 at 6:16 PM, <[EMAIL PROTECTED]> wrote:

> Juwei
> Are there any error messages in your TaskTracker logs related to
> compression, such as 'Codec not found'?
> Regards
> Bejoy K S
>
> From handheld, Please excuse typos.
> ------------------------------
> From: Juwei Shi <[EMAIL PROTECTED]>
> Date: Fri, 17 Feb 2012 17:48:08 +0800
> To: <[EMAIL PROTECTED]>
> ReplyTo: [EMAIL PROTECTED]
> Subject: Re: reduce output compression of Terasort
>
> We use LZO, so the value is
> mapred.output.compression.codec = com.hadoop.compression.lzo.LzoCodec
>
> There are no compressed files in HDFS.
>
>
>
> 2012/2/17 Bejoy Ks <[EMAIL PROTECTED]>
>
>> Hi Juwei
>>        What is the value of mapred.output.compression.codec? It would be
>> better to determine whether the output files are compressed by checking
>> their codec, not just the file sizes.
>>
>> Regards
>> Bejoy.K.S
>>
>>
>> On Fri, Feb 17, 2012 at 12:07 PM, Juwei Shi <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> I am benchmarking the cluster using the Terasort package of Hadoop
>>> 0.20.2. I enabled compression for both map output
>>> (mapred.compress.map.output) and reduce output (mapred.output.compress).
>>> I checked both parameters in job.xml, and both are true. I can see that
>>> compression of the map output works, but compression of the reduce
>>> output does not seem to work: the job output on HDFS is still 1TB.
>>>
>>> Thanks!
>>>
>>> - Juwei
>>>
>>
>>
>
>
> --
> - Juwei Shi (史巨伟)
>
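
The suggestion in the thread to check the codec of the output files, rather than judging by size, can be sketched roughly as below. Hadoop's CompressionCodecFactory picks a codec from the file name suffix, and this plain Python sketch only imitates that idea. The suffix table is an assumption based on the usual default extensions of these codecs, and the part file names in the usage note are hypothetical.

```python
import os

# Assumed default file suffixes for some common Hadoop codecs.
# Verify these against the codecs installed on your own cluster.
SUFFIX_TO_CODEC = {
    ".gz": "org.apache.hadoop.io.compress.GzipCodec",
    ".bz2": "org.apache.hadoop.io.compress.BZip2Codec",
    ".deflate": "org.apache.hadoop.io.compress.DefaultCodec",
    ".lzo_deflate": "com.hadoop.compression.lzo.LzoCodec",
    ".lzo": "com.hadoop.compression.lzo.LzopCodec",
}


def codec_for(filename):
    """Return the codec class name implied by the file suffix, or None."""
    _, ext = os.path.splitext(filename)
    return SUFFIX_TO_CODEC.get(ext)


def uncompressed_parts(filenames):
    """List output part files whose suffix maps to no known codec."""
    return [
        name for name in filenames
        if name.startswith("part-") and codec_for(name) is None
    ]
```

For a hypothetical listing such as `["_logs", "part-00000", "part-00001.lzo_deflate"]`, `uncompressed_parts` returns only `part-00000`. A reduce output written as plain `part-NNNNN` files with no codec suffix would be consistent with the output format ignoring mapred.output.compress.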
Juwei Shi 2012-02-17, 15:21