MapReduce >> mail # user >> reduce output compression of Terasort


Juwei Shi 2012-02-17, 06:37
Bejoy Ks 2012-02-17, 09:18
Juwei Shi 2012-02-17, 09:48
Re: reduce output compression of Terasort
Juwei,
      Are there any error messages in your TaskTracker logs related to compression, such as 'Codec not found'?
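
One way to run that check is to scan a TaskTracker log file for lines that mention a codec. This is a minimal sketch, assuming the logs are plain text; the function name `find_codec_lines` and the search pattern are illustrative, and the exact wording of real log messages may differ between Hadoop versions.

```python
import re

# Illustrative pattern: the phrase quoted above, plus any mention of LZO.
# Matching lines are candidates to read, not confirmed errors.
CODEC_PATTERN = re.compile(r"codec not found|lzo", re.IGNORECASE)

def find_codec_lines(log_path):
    """Return (line_number, text) pairs for lines that mention a codec."""
    hits = []
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if CODEC_PATTERN.search(line):
                hits.append((number, line.rstrip("\n")))
    return hits
```

Calling `find_codec_lines` on each log file of a TaskTracker narrows the log down to the lines worth reading by hand.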

Regards
Bejoy K S

From handheld, Please excuse typos.

-----Original Message-----
From: Juwei Shi <[EMAIL PROTECTED]>
Date: Fri, 17 Feb 2012 17:48:08
To: <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Subject: Re: reduce output compression of Terasort

We use LZO, so the value is
mapred.output.compression.codec = com.hadoop.compression.lzo.LzoCodec

There are no compressed files in HDFS.
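
A quick way to confirm this from a directory listing is to look at the output file names, since output formats such as TextOutputFormat append the codec's default extension to compressed files. This is a minimal sketch, assuming `LzoCodec` writes `.lzo_deflate` and `LzopCodec` writes `.lzo` (as in the hadoop-lzo library); the helper name `codec_for` and the file names used with it are hypothetical.

```python
# Default file extensions of some common Hadoop codecs (assumed, see above).
CODEC_EXTENSIONS = {
    ".lzo_deflate": "com.hadoop.compression.lzo.LzoCodec",
    ".lzo": "com.hadoop.compression.lzo.LzopCodec",
    ".gz": "org.apache.hadoop.io.compress.GzipCodec",
    ".deflate": "org.apache.hadoop.io.compress.DefaultCodec",
    ".bz2": "org.apache.hadoop.io.compress.BZip2Codec",
}

def codec_for(file_name):
    """Return the codec class implied by the file extension, or None."""
    for extension, codec in CODEC_EXTENSIONS.items():
        if file_name.endswith(extension):
            return codec
    return None
```

A name such as `part-00000` with no extension maps to `None`, which suggests that the file was written without a codec.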

2012/2/17 Bejoy Ks <[EMAIL PROTECTED]>

> Hi Juwei
>        What is the value of mapred.output.compression.codec? It would be
> better to determine whether the output files are compressed by checking
> their codec, not just by looking at the file sizes.
>
> Regards
> Bejoy.K.S
>
>
> On Fri, Feb 17, 2012 at 12:07 PM, Juwei Shi <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> I am benchmarking the cluster using the Terasort package of Hadoop
>> 0.20.2. I enabled compression for both map output
>> (*mapred.compress.map.output*) and reduce output (*mapred.output.compress*).
>> I checked both parameters in job.xml, and both are true. I can see that
>> compression of the map output works, but compression of the reduce
>> output does not seem to: the job's output on HDFS is still 1 TB.
>>
>> Thanks!
>>
>> - Juwei
>>
>
>
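
The job.xml check described in the quoted message can also be scripted. This is a minimal sketch, assuming job.xml follows the usual Hadoop configuration layout of `<property>` elements with `<name>` and `<value>` children; the function name `compression_settings` is illustrative.

```python
import xml.etree.ElementTree as ET

# The three compression-related properties mentioned in this thread.
COMPRESSION_KEYS = (
    "mapred.compress.map.output",
    "mapred.output.compress",
    "mapred.output.compression.codec",
)

def compression_settings(job_xml_text):
    """Return the compression properties found in a job.xml document."""
    root = ET.fromstring(job_xml_text)
    settings = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name in COMPRESSION_KEYS:
            settings[name] = (prop.findtext("value") or "").strip()
    return settings
```

A key missing from the returned dictionary means the property was not set in that job.xml at all, which is different from it being set to false.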
--
- Juwei Shi (史巨伟)

Binglin Chang 2012-02-17, 12:05
Juwei Shi 2012-02-17, 15:21