Hive >> mail # user >> LZO output compression


Re: LZO output compression
Check this class, where these properties are defined:
http://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1/src/mapred/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.java

From: w00t w00t <[EMAIL PROTECTED]>
Reply-To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>, w00t w00t <[EMAIL PROTECTED]>
Date: Tuesday, August 13, 2013 2:39 AM
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>, w00t w00t <[EMAIL PROTECTED]>
Subject: Re: LZO output compression

Oh, I could get it working using these settings:

SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

But I have one question that maybe one of you can help explain:
As I am running Hadoop 1.1.*, why do I need the old Hadoop 0.20 property name?
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

I assumed the commands for newer Hadoop versions were:
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress=true;
SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec;
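
A possible explanation, offered with some caution: the branch-1.1 FileOutputFormat class linked above appears to read the mapred.output.compress and mapred.output.compression.codec keys, while the mapreduce.output.fileoutputformat.* names belong to later Hadoop lines (0.21 and 2.x). Below is a minimal sketch that sets both spellings so the same script could run against either naming scheme. It assumes that keys the running Hadoop version does not read are simply ignored.

```sql
-- Hive-side switch that turns on compression of query output
SET hive.exec.compress.output=true;

-- Hadoop 0.20 / 1.x property names
-- (the codec key is the one reported to work in this thread)
SET mapred.output.compress=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

-- Names used by later Hadoop lines; assumed to be harmless on 1.1.x
SET mapreduce.output.fileoutputformat.compress=true;
SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec;
```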
________________________________
From: w00t w00t <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: 11:26 Tuesday, 13 August 2013
Subject: LZO output compression

Hello,

I am running Hortonworks 1.2 using Hadoop 1.1.2.21 and Hive 0.10.0.21.

I set up LZO compression and can read LZO-compressed data without problems.

Next, I wanted to test output compression,
so I created the following small script:
--------------------------------------------------------------------------------------------------------------------------
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress=true;
SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec;

DROP TABLE IF EXISTS simple_lzo;

CREATE TABLE simple_lzo
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' AS
SELECT count(*)
FROM txt_table_lzo;
----------------------------------------------------------------------------------------------------------------------------
The output is compressed, but with the default codec ("deflate"), not with LZO.

Do you know what the problem could be and how I could debug it?
There are no error messages.
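
One way to narrow this down, sketched under assumptions: in the Hive CLI, SET with a key and no value prints the current setting, and dfs -ls lists the files a query wrote. The warehouse path below is hypothetical; DESCRIBE FORMATTED reports the table's real location.

```sql
-- Print the values Hive will actually hand to the job
SET hive.exec.compress.output;
SET mapred.output.compression.codec;

-- Find where the table's files live (see the Location field)
DESCRIBE FORMATTED simple_lzo;

-- Hypothetical path: substitute the Location reported above
dfs -ls /user/hive/warehouse/simple_lzo;
```

Files written by LzopCodec normally end in .lzo, whereas the default codec produces .deflate, so the file extension shows which codec was applied.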

Additionally, I also tried the commands for Hadoop 0.20:
mapred.output.compress=true;
mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzopCodec

That didn't work either.
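
A note on the two properties above, offered as a likely reading rather than a confirmed diagnosis: in Hadoop 0.20/1.x, mapred.map.output.compression.codec selects the codec for intermediate map output, not for the files the job finally writes. The sketch below contrasts the two groups of properties.

```sql
-- Intermediate (shuffle) data passed from map to reduce tasks
SET mapred.compress.map.output=true;
SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

-- Final job output files (what ends up in the table directory)
SET mapred.output.compress=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;
```

Only the second group should affect the codec of the files behind simple_lzo. In Hive, the hive.exec.compress.output switch is still expected to be true for output compression to apply at all.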

In Pig or Java MR, I have no problems generating LZO-compressed output.

Thanks
