Re: reduce output compression of Terasort
As far as I know, TeraOutputFormat doesn't support compression.
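
One way to act on the suggestion quoted below (check the codec, not the file size) is to look at the names of the part files in the output directory. As far as I understand it, Hadoop's CompressionCodecFactory picks a codec from the file name suffix, and an output format that compresses appends the codec's default extension, so a plain "part-00000" points to uncompressed output. The sketch below is plain Java with no Hadoop on the classpath; the class OutputCodecGuess and the method codecFor are hypothetical names, and the suffix table holds the default extensions as I remember them, so please verify them against the codecs installed on your cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: guesses the codec of an output
// file from its name suffix, the same signal CompressionCodecFactory uses.
class OutputCodecGuess {
    // Assumed default extensions of common codecs (check your own setup).
    private static final Map<String, String> SUFFIX_TO_CODEC = new LinkedHashMap<>();
    static {
        SUFFIX_TO_CODEC.put(".lzo_deflate", "com.hadoop.compression.lzo.LzoCodec");
        SUFFIX_TO_CODEC.put(".lzo", "com.hadoop.compression.lzo.LzopCodec");
        SUFFIX_TO_CODEC.put(".gz", "org.apache.hadoop.io.compress.GzipCodec");
        SUFFIX_TO_CODEC.put(".bz2", "org.apache.hadoop.io.compress.BZip2Codec");
        SUFFIX_TO_CODEC.put(".deflate", "org.apache.hadoop.io.compress.DefaultCodec");
    }

    // Returns the codec class name implied by the file name, or null when
    // the name carries no known compression suffix.
    static String codecFor(String fileName) {
        for (Map.Entry<String, String> e : SUFFIX_TO_CODEC.entrySet()) {
            if (fileName.endsWith(e.getKey())) {
                return e.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Pass the part file names listed by "hadoop fs -ls <output dir>".
        for (String name : args) {
            String codec = codecFor(name);
            System.out.println(name + " -> " + (codec == null ? "no codec suffix" : codec));
        }
    }
}
```

If every part file comes back with no codec suffix, the output format most likely wrote the records without consulting mapred.output.compress at all.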

On Fri, Feb 17, 2012 at 6:16 PM, <[EMAIL PROTECTED]> wrote:

> Juwei
> Are there any error messages in your TaskTracker logs related to
> compression, such as 'Codec not found'?
> Regards
> Bejoy K S
>
> From handheld, Please excuse typos.
> ------------------------------
> *From: * Juwei Shi <[EMAIL PROTECTED]>
> *Date: *Fri, 17 Feb 2012 17:48:08 +0800
> *To: *<[EMAIL PROTECTED]>
> *ReplyTo: * [EMAIL PROTECTED]
> *Subject: *Re: reduce output compression of Terasort
>
> We use LZO, so the value is
> mapred.output.compression.codec = com.hadoop.compression.lzo.LzoCodec
>
> There are no compressed files in HDFS.
>
>
>
> 2012/2/17 Bejoy Ks <[EMAIL PROTECTED]>
>
>> Hi Juwei
>>        What is the value of mapred.output.compression.codec? It would be
>> better to determine whether the output files are compressed by checking
>> their codec rather than judging from file size alone.
>>
>> Regards
>> Bejoy.K.S
>>
>>
>> On Fri, Feb 17, 2012 at 12:07 PM, Juwei Shi <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> I am benchmarking the cluster using the Terasort package of Hadoop
>>> 0.20.2. I enabled compression for both map output
>>> (*mapred.compress.map.output*) and reduce output (*mapred.output.compress*).
>>> I checked the parameters in job.xml; both are true. I can see that
>>> compression of the map output works, but compression of the reduce
>>> output does not seem to work. The output of the job on HDFS is still 1TB.
>>>
>>> Thanks!
>>>
>>> - Juwei
>>>
>>
>>
>
>
> --
> - Juwei Shi (史巨伟)
>