Hi guys,

I've run into a situation where the ratio between "Map output bytes" and
"Map output materialized bytes" is very large, and a lot of data is spilled
to disk during the map phase. I'll try to optimize this, but
I'm wondering whether the spill files are compressed at all. I set
mapred.compress.map.output=true
and mapred.map.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec
and everything else seems to be working correctly. Does Hadoop compress
every intermediate spill, or only the final output written once the map task
has finished?
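
For reference, here is a minimal sketch of how I compute the ratio from the two counters. It assumes "Map output bytes" is the uncompressed size of the map output and "Map output materialized bytes" is what was actually written to disk. The class name, method name, and counter values are made up for illustration and are not taken from my job.

```java
public class SpillRatio {
    /**
     * Ratio of "Map output bytes" to "Map output materialized bytes".
     * A value well above 1 suggests the materialized output is much smaller
     * than the raw map output, which is what compression would produce.
     */
    public static double ratio(long mapOutputBytes, long materializedBytes) {
        if (materializedBytes <= 0) {
            throw new IllegalArgumentException("materializedBytes must be positive");
        }
        return (double) mapOutputBytes / materializedBytes;
    }

    public static void main(String[] args) {
        // Hypothetical counter values, copied by hand from a job's counter page.
        long mapOutputBytes = 8_000_000_000L;
        long materializedBytes = 1_000_000_000L;
        System.out.println("ratio = " + ratio(mapOutputBytes, materializedBytes));
    }
}
```

This only describes the final materialized output; it says nothing about whether the intermediate spill files were compressed, which is the part I'm unsure about.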

Thanks,
Sigurd