Hive, mail # user - Does output directory remain in case of map/reduce task failures


Pradeep Kamath 2010-11-02, 22:26
Namit Jain 2010-11-02, 23:21
RE: Does output directory remain in case of map/reduce task failures
Pradeep Kamath 2010-11-03, 00:16
Just to confirm: is this true for both partitioned and non-partitioned tables?

________________________________
From: Namit Jain [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, November 02, 2010 4:22 PM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE: Does output directory remain in case of map/reduce task failures

Hive writes to a temporary directory first, and if the UDF fails, the temporary directory is removed.
The expected final directory is not touched.
-namit
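
[Editor's illustration, not part of the original message.] The stage-then-publish behavior described above can be sketched roughly as follows. This is not Hive's actual code: `write_partition`, the `_tmp.` prefix, and the `part-00000` file name are hypothetical names chosen for the example, and a local filesystem stands in for HDFS.

```python
import shutil
import tempfile
from pathlib import Path


def write_partition(final_dir: Path, rows, transform):
    """Stage output in a scratch directory; publish only if every row succeeds.

    Hypothetical helper for illustration; assumes final_dir does not exist yet.
    """
    final_dir.parent.mkdir(parents=True, exist_ok=True)
    # Scratch directory beside the target so the final rename stays on one filesystem.
    staging = Path(tempfile.mkdtemp(prefix="_tmp.", dir=final_dir.parent))
    try:
        with open(staging / "part-00000", "w") as out:
            for row in rows:
                # transform plays the role of the UDF and may raise.
                out.write(f"{transform(row)}\n")
    except Exception:
        # Failure: discard the scratch directory; final_dir is never touched.
        shutil.rmtree(staging, ignore_errors=True)
        raise
    # Success: move the staged directory into place.
    staging.rename(final_dir)
    return final_dir
```

Under these assumptions, a failing `transform` leaves no directory at the final location, so a retry after fixing the bug starts from a clean state.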
________________________________
From: Pradeep Kamath [[EMAIL PROTECTED]]
Sent: Tuesday, November 02, 2010 3:26 PM
To: [EMAIL PROTECTED]
Subject: Does output directory remain in case of map/reduce task failures
Hi,
  While inserting into partitions, if a map/reduce task fails (maybe due to a UDF bug), does Hive clean up the output directory corresponding to the partition? The behavior in Hadoop is to NOT clean up the output location in case of task failures (maybe to allow the user to debug). What is Hive's behavior? Specifically, if I have a table foo and I am writing to partition datestamp=20101102, then the write would go to /user/hive/warehouse/foo/datestamp=20101102. If the task(s) writing to this fail, does Hive remove this dir on exit? If it doesn't, a subsequent attempt to write (presumably after fixing the cause of the earlier failure) would also fail unless the dir is removed first.

Pointers appreciated.

Thanks,
Pradeep
Namit Jain 2010-11-03, 00:18