RE: Does output directory remain in case of map/reduce task failures
Pradeep Kamath 2010-11-03, 00:16
Just to confirm - is this true for both partitioned and non partitioned tables?
From: Namit Jain [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, November 02, 2010 4:22 PM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE: Does output directory remain in case of map/reduce task failures
Hive writes to a temporary directory first, and if the UDF fails, the temporary directory is removed.
The expected final directory is not touched.
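The write-then-publish behavior described above can be sketched on a local filesystem as follows. This is only an illustration of the pattern, not Hive's code: the function name write_partition, the "_tmp." prefix, and the output file name are hypothetical, a plain Python callable stands in for the UDF, and the final directory is assumed not to exist yet.

```python
import os
import shutil
import tempfile


def write_partition(final_dir, rows, transform):
    """Write rows into a staging directory and publish it only on success.

    Hypothetical helper: 'transform' stands in for a UDF that may raise.
    """
    parent = os.path.dirname(os.path.abspath(final_dir))
    os.makedirs(parent, exist_ok=True)

    # Stage the output next to the final location so the rename below
    # stays on the same filesystem.
    staging = tempfile.mkdtemp(prefix="_tmp.", dir=parent)
    try:
        with open(os.path.join(staging, "part-00000"), "w") as out:
            for row in rows:
                out.write(transform(row) + "\n")
        # Publish: the final directory appears only after every row succeeded.
        os.rename(staging, final_dir)
    except Exception:
        # On any failure, remove the staging directory and leave the
        # final location untouched.
        shutil.rmtree(staging, ignore_errors=True)
        raise
```

With this shape, a failed attempt leaves nothing behind at the final path, so a retry after fixing the transform can write to the same location without a manual cleanup step.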
From: Pradeep Kamath [[EMAIL PROTECTED]]
Sent: Tuesday, November 02, 2010 3:26 PM
To: [EMAIL PROTECTED]
Subject: Does output directory remain in case of map/reduce task failures
While doing an insert into partitions, if there is a failure in the map/reduce task (maybe due to a UDF bug), does Hive clean up the output directory corresponding to the partition? The behavior in Hadoop is to NOT clean up the output location in case of task failures (maybe to allow the user to debug). What is Hive's behavior? Specifically, if I have a table foo and I am writing to partition datestamp=20101102, then the write would go to /user/hive/warehouse/foo/datestamp=20101102. If the task(s) writing to this directory fail, does Hive remove the directory on exit? If it doesn't, a subsequent attempt to write (presumably after fixing the cause of the earlier failure) would also fail unless the directory is removed first.