Re: how to overwrite output in HDFS?
Hi Xin,

You can derive your own output format class from one of the Hadoop OutputFormats and override the "checkOutputSpecs" method, which normally verifies that the output directory does not exist yet, with an empty body:

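-----------
import java.io.IOException;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public final class OverwritingTextOutputFormat<K, V> extends TextOutputFormat<K, V> {
    @Override
    public void checkOutputSpecs(JobContext job) throws IOException {
        // Intentionally empty: skips the check that rejects an existing output directory.
    }
}
-----------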
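
Another option, if you would rather keep the check in place, is to remove the old output before each run. The following is only a minimal sketch of that idea using plain java.nio on a local directory, for example when the job runs in local mode. The class name OutputDirCleaner and its method are hypothetical and not part of Hadoop. On HDFS the corresponding step would go through Hadoop's FileSystem class (its delete(path, true) call) rather than java.nio.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper, not part of Hadoop: removes a stale local output directory.
public final class OutputDirCleaner {

    /** Deletes dir and everything below it; does nothing if dir is absent. */
    public static void deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        // Walk the tree and delete the deepest entries first, so that each
        // directory is already empty when its turn comes.
        try (Stream<Path> entries = Files.walk(dir)) {
            entries.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: OutputDirCleaner <output-dir>");
            return;
        }
        deleteRecursively(Paths.get(args[0]));
    }
}
```

Calling deleteRecursively on the output folder just before submitting the job gives every run a clean directory, so no files from an earlier run are left behind.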

Regards,
Christoph

-----Original Message-----
From: Fang Xin [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, 3 April 2012 11:35
To: mapreduce-user
Subject: how to overwrite output in HDFS?

Hi, all

I'm writing my own MapReduce code using Eclipse with the Hadoop plug-in.
I've specified the input and output directories in the project properties
(two folders, namely input and output).

My problem is that each time I make a change and try to
run it again, I have to manually delete the previous output in HDFS;
otherwise the job fails with an error.
Can anyone kindly suggest how to simply overwrite the result?

Best regards,
Xin