While inserting into a partition, if a map/reduce task fails (for example because of a UDF bug), does Hive clean up the output directory corresponding to that partition? Hadoop's behavior is NOT to clean up the output location when tasks fail (perhaps to allow the user to debug). What is Hive's behavior? Specifically, if I have a table foo and I am writing to partition datestamp=20101102, the write goes to /user/hive/warehouse/foo/datestamp=20101102. If the task(s) writing to this directory fail, does Hive remove the directory on exit? If it doesn't, a subsequent attempt to write (presumably after fixing the cause of the earlier failure) would also fail unless the directory is removed first.
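
To make the manual workaround in the question concrete, here is a minimal sketch of removing a leftover partition directory before retrying the insert. It assumes the warehouse root /user/hive/warehouse from the example path; the helper names partition_path and cleanup_command are hypothetical, and the script only builds the command rather than running it. It says nothing about what Hive itself does on failure.

```python
import posixpath

# Assumption: warehouse root taken from the example path in the question.
WAREHOUSE_ROOT = "/user/hive/warehouse"


def partition_path(table, partition_spec, root=WAREHOUSE_ROOT):
    """Build the HDFS directory for a partition, e.g. .../foo/datestamp=20101102.

    partition_spec is an ordered list of (column, value) pairs.
    """
    parts = ["{}={}".format(key, value) for key, value in partition_spec]
    return posixpath.join(root, table, *parts)


def cleanup_command(table, partition_spec):
    """Return (without executing) the shell command that removes the directory.

    Uses `hadoop fs -rm -r`; older Hadoop releases spell this `hadoop fs -rmr`.
    """
    return ["hadoop", "fs", "-rm", "-r", partition_path(table, partition_spec)]


if __name__ == "__main__":
    # Print the command for the partition used in the question.
    print(" ".join(cleanup_command("foo", [("datestamp", "20101102")])))
```

The command is returned as a list so it can be reviewed first and then passed to subprocess.run if removing the directory turns out to be necessary.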
Namit Jain 2010-11-02, 23:21
Pradeep Kamath 2010-11-03, 00:16
Namit Jain 2010-11-03, 00:18