appending data to clustered tables.
Sukhendu Chakraborty 2014-02-19, 23:45
Is there a way to add data to a bucketed/clustered table in hive-0.11? I
have a clustered table with 32 buckets (no partitions) that already holds
some data. Can I append more data by running an "insert into <table>...."?
From http://osdir.com/ml/hive-user-hadoop-apache/2009-03/msg00094.html it
looks like the feature was not supported as of 2009.
When I experimented with it in hive-0.11, I saw that after the second
insert a new set of 32 files was created with the '000000_*.copy' notation,
so we had 64 files instead of the original 32. Is this expected behavior,
and does Hive know how to merge the 64 files back into 32, one per bucket,
before processing? How about sorted bucketed tables?
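
For concreteness, the scenario can be sketched roughly as follows. This is only an illustration, not the exact statements that were run: the table and column names (events_bucketed, staging_batch_1, staging_batch_2, user_id, payload) are hypothetical, and it assumes hive.enforce.bucketing is enabled so that each insert writes one file per bucket.

```sql
-- Hypothetical names throughout; a sketch of the experiment, not the original statements.
-- Assumption: bucketing is enforced so each insert produces one file per bucket.
SET hive.enforce.bucketing = true;

-- Clustered (bucketed) table with 32 buckets and no partitions.
CREATE TABLE events_bucketed (
  user_id BIGINT,
  payload STRING
)
CLUSTERED BY (user_id) INTO 32 BUCKETS;

-- First load: writes the initial 32 bucket files.
INSERT INTO TABLE events_bucketed
SELECT user_id, payload FROM staging_batch_1;

-- Second load: the append in question, after which a second set of
-- 32 files appears next to the originals.
INSERT INTO TABLE events_bucketed
SELECT user_id, payload FROM staging_batch_2;
```

For the sorted case, the same table definition would carry a SORTED BY (user_id) clause after CLUSTERED BY (user_id).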