Re: AvroStorage compression ratio
How do you generate your Avro files?
It worked OK for me with:

SET avro.mapred.deflate.level 5
inputData = LOAD 'input path' USING
STORE inputData INTO 'output path' USING

But I did these tests a long time ago with an old version.
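
One way to see which codec an output file actually carries is to read the avro.codec entry from the Avro container header. Below is a minimal sketch in Python, assuming a part file has been copied to local disk. The function names are made up for illustration and are not part of Pig or Avro. It relies only on the Avro object container layout: the magic bytes "Obj" followed by 1, then a metadata map with string keys and bytes values. A missing avro.codec entry means the null codec, i.e. no compression.

```python
import io
import sys

MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def read_long(stream: io.BufferedIOBase) -> int:
    """Decode one Avro long (variable-length, zigzag-encoded)."""
    shift = 0
    raw = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of Avro header")
        raw |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:  # high bit clear: last byte of this long
            break
        shift += 7
    return (raw >> 1) ^ -(raw & 1)  # undo zigzag encoding


def read_bytes(stream: io.BufferedIOBase) -> bytes:
    """Read a length-prefixed byte string (Avro 'bytes' and 'string')."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of Avro header")
    return data


def read_header_metadata(stream: io.BufferedIOBase) -> dict:
    """Return the file-level metadata map from an Avro container header."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count terminates the map
            break
        if count < 0:  # negative count: a byte-size long follows
            count = -count
            read_long(stream)  # block size in bytes, not needed here
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    return meta


def avro_codec(stream: io.BufferedIOBase) -> str:
    """Codec name from the header; 'null' when avro.codec is absent."""
    return read_header_metadata(stream).get("avro.codec", b"null").decode("utf-8")


if __name__ == "__main__":
    # Usage (hypothetical script name): python avro_codec.py <local part file> ...
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, avro_codec(f))
```

If this reports "null" for files that were supposed to be deflate- or snappy-compressed, the codec setting never reached the writer, which would be consistent with sizes that match the uncompressed text. It does not by itself show where the setting was lost.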


On Sun, Oct 21, 2012 at 9:22 AM, Thejas Nair <[EMAIL PROTECTED]> wrote:
> Based on the AvroStorage code and documentation, it looks like compression is
> enabled by default, with the codec set to "deflate". But the file size is almost
> the same as that of the uncompressed tab-separated text data.
> This is probably a bug in AvroStorage, but I wanted to check whether this is
> somehow expected before I open a JIRA to track it.
> Format                        Size
> Uncompressed txt              2.12 GB
> avro (default compression)    2.09 GB
> avro + snappy compression     2.09 GB
> lzo compressed txt            0.69 GB
> Thanks,
> Thejas