Re: Avro with Snappy compression on Hive
I've never used Avro output with Hive, but as a guess, try this:

SET avro.output.codec=snappy;

The mapred.output.compression.codec and mapred.output.compression.type
options are probably redundant.
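
As an untested sketch, the whole sequence might then look like the following. It reuses the table names from the original message quoted below, and it assumes that Hive's Avro output format only applies a codec when hive.exec.compress.output is true, which is why that setting is kept:

```sql
-- Keep output compression switched on. Assumption: the Avro output
-- format only consults avro.output.codec when this is true.
SET hive.exec.compress.output=true;

-- Ask the Avro container writer for Snappy instead of the default deflate.
SET avro.output.codec=snappy;

-- Rewrite the table so that new files are produced with the settings above.
INSERT OVERWRITE TABLE p2c_comp_avro SELECT * FROM p2c;
```

If this has the intended effect, the header of the newly written file should name snappy rather than deflate as its avro.codec.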
On 25 April 2013 07:12, nir_zamir <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I have a Hive table created with the Avro Serde.
>
> When I add data to it using Snappy compression, it still appears to be
> compressed with deflate (the file starts with
> 'Obj...avro.codec.deflate.avro.Schema', whereas for raw data compressed with
> Snappy, the Snappy codec is specified at the beginning of the file).
>
> Anything I'm doing wrong?
>
> Here's what I do:
>
> CREATE TABLE p2c_comp_avro
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   STORED as INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   TBLPROPERTIES (
>     'avro.schema.url'='file:///home/cloudera/bigdata/path_to_conversions_raw.avsc');
>
> SET hive.exec.compress.output=true;
> SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
> SET mapred.output.compression.type=BLOCK;
>
> INSERT OVERWRITE TABLE p2c_comp_avro SELECT * FROM p2c;
>
>
> Thanks!
>
>
>
> --
> View this message in context:
> http://apache-avro.679487.n3.nabble.com/Avro-with-Snappy-compression-on-Hive-tp4027079.html
> Sent from the Avro - Users mailing list archive at Nabble.com.
>