I set mapred.reduce.tasks manually so that the job runs in a single wave of
reducers (does that make sense, by the way?).
When I save the data, I often end up with a bunch of small files because we
use compression and Hive doesn't seem to merge small compressed files.
So my question is: can I somehow unset mapred.reduce.tasks and have Hive
use hive.exec.reducers.bytes.per.reducer instead, to reduce the number of
output files? It seems the former overrides the latter.
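
For concreteness, here is a rough sketch of the session settings I mean. The
numeric values are made up for illustration. I am assuming that setting
mapred.reduce.tasks back to -1, which the Hive configuration docs list as the
default, makes Hive estimate the reducer count from the input size again; I
have not confirmed that this works once the value has been overridden in our
setup.

```sql
-- Current setup: a fixed reducer count chosen by hand
-- (64 is an illustrative value, not our real one).
set mapred.reduce.tasks=64;

-- What I would like instead: let Hive derive the reducer count from the
-- input size. Assumption: -1 restores the automatic estimate.
set mapred.reduce.tasks=-1;

-- Illustrative target of roughly 1 GB of input per reducer.
set hive.exec.reducers.bytes.per.reducer=1000000000;
```

With fewer reducers, each one would write a larger output file, which is the
effect I am after.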