

Re: Avro + Snappy changing blocksize of snappy compression
On Wed, Apr 18, 2012 at 10:23 AM, snikhil0 <[EMAIL PROTECTED]> wrote:
> I am experimenting with Avro and snappy and want to plot the size of the
> compressed avro datafile as a function of varying compression block size. I
> am doing this by setting the configuration value for
> "io.compression.codec.snappy.buffersize". Unfortunately, this is not
> working: more precisely, for buffer sizes from 256 KB to 2 MB I get the same size
> output Avro (Snappy-compressed) data file. What am I missing? Has anyone had
> success with this?

Snappy uses blocks of 64 KB (like most LZ compressors), so there should
be little benefit from block sizes larger than that; blocks are
compressed independently of each other (and back references only reach
back 8 KB or so anyway). There are some compressors that can use larger
buffers, like bzip2 (I think), but those are more the exception than
the rule.

-+ Tatu +-