

java.io.FileNotFoundException: File does not exist on modified data.
Hi all,

We sometimes get java.io.FileNotFoundException ("File does not exist") while running large queries in Hive. While these queries run, we also import data into the partitions being queried, which raises a question for us: how does Hive handle data that is being modified in the background?
We use INSERT OVERWRITE on the partitions, so I can imagine a long-running query being surprised by new files appearing and old files going missing.
If others are experiencing this, how do you work around it? Perhaps by partitioning on two keys, so that an import never overwrites existing data?
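
To make clear what I suspect is happening, here is a toy Python simulation. It is not Hive code. It assumes the query fixes its list of input files up front and that an overwrite deletes the old files before writing new ones; both are my guesses about the behaviour. The directory names (dt=..., batch=...) and all helper functions are made up for illustration.

```python
import os
import tempfile

def write_batch(directory, batch_id, n_files=2):
    """Write n_files small data files for one import batch."""
    os.makedirs(directory, exist_ok=True)
    for i in range(n_files):
        with open(os.path.join(directory, f"{batch_id}_{i}.txt"), "w") as f:
            f.write(f"rows from {batch_id}\n")

def overwrite(directory, batch_id):
    """Mimic an overwrite (assumption): delete every old file, then write the new batch."""
    for name in os.listdir(directory):
        os.remove(os.path.join(directory, name))
    write_batch(directory, batch_id)

def plan(directory):
    """Mimic query planning (assumption): snapshot the input file list up front."""
    return [os.path.join(directory, n) for n in sorted(os.listdir(directory))]

def read_all(paths):
    """Mimic query execution: open each planned file, count the ones that vanished."""
    missing = 0
    for p in paths:
        try:
            with open(p) as f:
                f.read()
        except FileNotFoundError:
            missing += 1
    return missing

root = tempfile.mkdtemp()

# Case 1: one partition key; the import overwrites the directory being read.
part = os.path.join(root, "dt=2012-01-01")  # made-up partition name
write_batch(part, "old")
planned = plan(part)
overwrite(part, "new")  # import lands while the "query" is still running
missing_single = read_all(planned)

# Case 2: hypothetical second key; each import gets its own directory.
base = os.path.join(root, "dt=2012-01-02")
write_batch(os.path.join(base, "batch=1"), "old")
planned = plan(os.path.join(base, "batch=1"))
write_batch(os.path.join(base, "batch=2"), "new")  # old files are left untouched
missing_double = read_all(planned)

print(missing_single, missing_double)  # prints "2 0"
```

In the first case both planned files are gone by the time they are read; in the second case nothing is missing. The catch with the second layout is that queries would need some way to know which batch is current, and old batches would have to be cleaned up separately.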

Thanks for any pointers on this.
Bennie.