MapReduce >> mail # user >> Is there a way to keep all intermediate files there after the MapReduce Job run?


Is there a way to keep all intermediate files after a MapReduce job runs?
Dear all,
    To learn more about which files are created, and how large they are,
while a job is running, I want to keep all the intermediate files (job.xml,
spillN.out, file.out, file.index, map.out-N, etc.) after the job finishes.

My questions are:
1. Is there any configuration that can make this happen? Or could I modify
some Hadoop MapReduce code for this?

2. Since each job, each task, and each task attempt uses a different
directory to store its intermediate files, keeping the files instead of
deleting them should not hurt the MapReduce cluster, apart from taking up
some storage. Am I right?
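
To make the goal concrete: once the files are kept, something like the sketch below could be used to list them with their sizes. It is only an illustration in plain Python, not part of Hadoop. The function name list_intermediate_files and the root directory passed to it are hypothetical; the file-name patterns are simply the names listed above, and the real names and directory layout may differ between Hadoop versions.

```python
import fnmatch
import os

# Patterns taken from the file names mentioned in this message (an assumption;
# the actual names may differ depending on the Hadoop version).
PATTERNS = ["job.xml", "spill*.out", "file.out", "file.index", "map.out-*"]


def list_intermediate_files(root, patterns=PATTERNS):
    """Return sorted (relative_path, size_in_bytes) pairs for matching files under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Keep only files whose base name matches one of the patterns.
            if any(fnmatch.fnmatch(name, p) for p in patterns):
                full = os.path.join(dirpath, name)
                found.append((os.path.relpath(full, root), os.path.getsize(full)))
    return sorted(found)
```

Calling list_intermediate_files on a task's local directory (whatever that path is on a given node) returns one entry per matching file, so the output of two runs can be compared to see how spill and output file sizes change.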

Thanks

yours,
Ling Kun

--
http://www.lingcc.com
Michael Segel 2013-03-01, 13:23
Jean-Marc Spaggiari 2013-03-01, 13:49