Avro >> mail # dev >> sync interval for AvroOutputFormat

sync interval for AvroOutputFormat
AvroOutputFormat supports setting the deflate level, but not the sync
interval. Was this a conscious decision (i.e., are there drawbacks to making
the sync interval larger)?

In some tests I ran, Avro data files were over 50% smaller when I raised the
sync interval to 2 MB (the default is 16000 bytes). I also saw a modest
speedup in building the files (I suspect my program was IO-bound).
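
For anyone who wants to see the effect without an Avro setup, here is a rough
sketch using only java.util.zip. It is not Avro code: it assumes that each
block is deflated independently and followed by a 16-byte sync marker, and it
ignores the other per-block bookkeeping. The class name, the synthetic data,
and the sizes are all made up for illustration, so the ratio it prints will
not match what real data gives.

```java
import java.util.Random;
import java.util.zip.Deflater;

class SyncIntervalSketch {
    // Synthetic payload (an assumption, not real Avro data): 12-byte values
    // drawn from a 500-entry vocabulary, so a block only compresses well
    // once it has already seen most of the values.
    static byte[] sampleData(int totalBytes) {
        Random rnd = new Random(42);
        int wordLen = 12;
        int vocabSize = 500;
        byte[][] vocab = new byte[vocabSize][wordLen];
        for (byte[] word : vocab) {
            rnd.nextBytes(word);
        }
        byte[] out = new byte[totalBytes];
        for (int pos = 0; pos < totalBytes; pos += wordLen) {
            byte[] word = vocab[rnd.nextInt(vocabSize)];
            System.arraycopy(word, 0, out, pos, Math.min(wordLen, totalBytes - pos));
        }
        return out;
    }

    // Deflate the data in independent blocks of blockSize bytes and return
    // the total compressed size, charging 16 bytes per block for a marker.
    static long compressedSize(byte[] data, int blockSize, int level) {
        final int syncMarkerBytes = 16;
        byte[] buf = new byte[64 * 1024];
        long total = 0;
        for (int off = 0; off < data.length; off += blockSize) {
            int len = Math.min(blockSize, data.length - off);
            // A fresh Deflater per block: no history carries over.
            Deflater deflater = new Deflater(level, true);
            deflater.setInput(data, off, len);
            deflater.finish();
            while (!deflater.finished()) {
                total += deflater.deflate(buf);
            }
            deflater.end();
            total += syncMarkerBytes;
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] data = sampleData(4 * 1024 * 1024);
        long small = compressedSize(data, 16000, 6);
        long large = compressedSize(data, 2 * 1024 * 1024, 6);
        System.out.println("16000-byte blocks: " + small + " bytes");
        System.out.println("2 MB blocks:       " + large + " bytes");
    }
}
```

With real files, the knob on the writer side is
DataFileWriter.setSyncInterval(int); the sketch above only mimics the
per-block restart that makes small intervals compress worse.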

Would folks support a patch that adds the sync interval as a static
configuration option on AvroOutputFormat?