

Re: stop generating these "part-XXXX" empty files when using MultipleOutputs in mapreduce job
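Use LazyOutputFormat. It wraps the job's real output format and only creates
the output file when the first record is written to it, so tasks that write
everything through MultipleOutputs leave no empty "part-XXXX" files behind.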

Have a look at this:
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/lib/output/LazyOutputFormat.html
and
http://stackoverflow.com/questions/6137139/how-to-save-only-non-empty-reducers-output-in-hdfs
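
For a map-only job like yours, the setup could look roughly like the sketch
below. This is only an illustration, not tested against your job: the class
names (LazyOutputExample, RecordMapper), the named output "records", the
key/value types and the use of args[0]/args[1] as input/output paths are all
assumptions of mine. It needs the Hadoop client libraries on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Hypothetical driver class; all names here are placeholders.
public class LazyOutputExample {

  // Hypothetical mapper that writes only through MultipleOutputs.
  public static class RecordMapper
      extends Mapper<LongWritable, Text, NullWritable, Text> {

    private MultipleOutputs<NullWritable, Text> mos;

    @Override
    protected void setup(Context context) {
      mos = new MultipleOutputs<>(context);
    }

    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      // context.write() is never called, so the default output stays empty.
      mos.write("records", NullWritable.get(), line);
    }

    @Override
    protected void cleanup(Context context)
        throws IOException, InterruptedException {
      // Close the writers opened by MultipleOutputs.
      mos.close();
    }
  }

  public static void main(String[] args) throws Exception {
    // Job.getInstance is the Hadoop 2.x factory; on 1.x use new Job(conf, name).
    Job job = Job.getInstance(new Configuration(), "lazy-output-example");
    job.setJarByClass(LazyOutputExample.class);
    job.setMapperClass(RecordMapper.class);
    job.setNumReduceTasks(0); // map-only job

    job.setOutputKeyClass(NullWritable.class);
    job.setOutputValueClass(Text.class);

    // Use this instead of job.setOutputFormatClass(TextOutputFormat.class):
    // the part file is then created only when a record is actually written.
    LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);

    // Named output used by the mapper above (name must be alphanumeric).
    MultipleOutputs.addNamedOutput(
        job, "records", TextOutputFormat.class, NullWritable.class, Text.class);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

The only line that matters for your question is the
LazyOutputFormat.setOutputFormatClass(...) call; the rest is just there to
make the example complete.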

Niels Basjes
On Mon, Oct 28, 2013 at 8:11 PM, S. Zhou <[EMAIL PROTECTED]> wrote:

> I use MultipleOutputs, so the output data are no longer stored in the
> "part-XXXX" files. But those files are still generated (though empty). Is it
> possible to stop generating them when running an MR job? (BTW, my MR job is
> map-only.) Thanks
>
> Senqiang
>
>
--
Best regards / Met vriendelijke groeten,

Niels Basjes