java.io.FileNotFoundException: File does not exist on modified data.
Bennie Schut 2010-09-20, 07:00
We sometimes get FileNotFoundException errors while running large queries on Hive. While these queries run, we also import data into the partitions being queried, which raises a question for us: how does Hive handle data that is being modified in the background?
We use INSERT OVERWRITE on the partitions, so I can imagine a long-running query being surprised by new files appearing and old files going missing.
If others are experiencing this, how do you work around it? Perhaps by partitioning on two keys, so that existing data is never overwritten?
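To make the two-key idea concrete, here is a rough sketch of what I have in mind. The table, column, and partition names (page_views, staging_page_views, dt, load_id) are made up for illustration and are not our real schema; I have not verified that this actually avoids the exception.

```sql
-- Hypothetical table: the second partition key (load_id) identifies one import run.
CREATE TABLE page_views (user_id BIGINT, url STRING)
PARTITIONED BY (dt STRING, load_id STRING);

-- Each import writes into a fresh (dt, load_id) partition, so the files of a
-- partition that a running query may be reading are left in place.
INSERT OVERWRITE TABLE page_views PARTITION (dt='2010-09-20', load_id='0002')
SELECT user_id, url FROM staging_page_views;
```

Queries would then have to select the load_id they want to read (for example the latest completed one), and old load_id partitions would need to be dropped once nothing is reading them anymore.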
Thanks for any pointers on this.