MapReduce, mail # user - Re: how to overwrite output in HDFS?

Re: how to overwrite output in HDFS?
Christoph Schmitz 2012-04-03, 10:39
Hi Xin,

You can derive your own output format class from one of the Hadoop OutputFormats and override the "checkOutputSpecs" method, which normally checks that the output directory does not already exist, with an empty body:

public final class OverwritingTextOutputFormat<K, V> extends TextOutputFormat<K, V> {
    @Override
    public void checkOutputSpecs(JobContext job) throws IOException {
        // Intentionally empty: skips the check that the output directory must not exist yet.
    }
}

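As an alternative to disabling the check, the old output could be deleted right before the job is submitted. The following is only a minimal sketch under assumptions: the class name OverwriteDriver and the job name are made up for illustration, the relative paths "input" and "output" mirror the two folders from the question, the job relies on Hadoop's default identity mapper and reducer, and Job.getInstance assumes a Hadoop 2.x or later client library.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver class, not part of the original thread.
public final class OverwriteDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path input = new Path("input");    // assumed folder name from the question
        Path output = new Path("output");  // assumed folder name from the question

        // Remove the previous run's output so the default existence check passes.
        FileSystem fs = output.getFileSystem(conf);
        if (fs.exists(output)) {
            fs.delete(output, true); // true = delete recursively
        }

        Job job = Job.getInstance(conf, "overwrite-demo");
        job.setJarByClass(OverwriteDriver.class);
        FileInputFormat.addInputPath(job, input);
        FileOutputFormat.setOutputPath(job, output);

        // With the default TextInputFormat and identity mapper/reducer,
        // keys are byte offsets and values are the input lines.
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        // To use the output format class from this mail instead, drop the
        // delete above and call:
        // job.setOutputFormatClass(OverwritingTextOutputFormat.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Since fs.delete(output, true) removes the whole directory tree, the path should only ever point at a scratch output location.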

-----Original Message-----
From: Fang Xin [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, 3 April 2012 11:35
To: mapreduce-user
Subject: how to overwrite output in HDFS?

Hi all,

I'm writing my own map-reduce code using Eclipse with the Hadoop plug-in.
I've specified input and output directories in the project properties
(two folders, namely input and output).

My problem is that each time I make a modification and try to
run it again, I have to manually delete the previous output in HDFS;
otherwise the job fails with an error.
Can anyone kindly suggest how to simply overwrite the result?

Best regards,