Re: Behavior of Hive 2837: insert into external tables should not be allowed
Mark Grover 2012-06-01, 14:49
Thanks, Ashutosh and Ed.
Historically, I didn't have much reason to choose managed over external tables or vice versa, since the semantics were very similar. I chose external because it gave me a better handle on the table metadata. For example, if a new column got added to the file, I could just drop the external table and recreate it with the new schema. With managed tables, I could do the same using ALTER TABLE commands, but at that time not all of a table's metadata could be modified using ALTER TABLE, so I decided to go with external tables. I think a lot of people use external tables on HDFS in preference to managed tables.
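To illustrate the drop-and-recreate approach, here is a minimal HiveQL sketch. The table name, columns, delimiter, and path are hypothetical and chosen only for illustration; it assumes a tab-delimited text file layout.

```sql
-- Hypothetical table, columns, and path, for illustration only.
-- Dropping an external table removes only the metastore entry;
-- the files under LOCATION are left in place.
DROP TABLE IF EXISTS page_views;

-- Recreate the table over the same directory, now with the extra column.
CREATE EXTERNAL TABLE page_views (
  user_id  BIGINT,
  url      STRING,
  referrer STRING  -- the newly added column
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/page_views';
```

Doing the same with a managed table would delete the underlying data on DROP TABLE, which is why the shortcut only works for external tables.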
I did see the property hive.insert.into.external.tables, but it's an all-or-none switch. If I had an HBase external table and an HDFS external table, it might very well be the case that I want to be able to insert into the HDFS-backed external table but not the HBase one. So, to me, disallowing insert into all external tables doesn't seem like the right thing to do. As Ed suggested, it depends on the storage handler, not on the table being external. I could go ahead and use table locking in that case, but that kind of defeats the purpose of this feature and property.
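A rough sketch of what the all-or-none behavior looks like in a session follows. It assumes a Hive build that includes HIVE-2837; the table names (hdfs_events_ext, hbase_events_ext, staging_events) are hypothetical, and the exact error text is not shown because it may vary by version.

```sql
-- Assumes a Hive build that includes HIVE-2837.
-- All table names below are hypothetical.
SET hive.insert.into.external.tables=false;

-- With the switch off, both statements are expected to be rejected,
-- because the check looks only at whether the table is external,
-- not at what storage backs it.
INSERT INTO TABLE hdfs_events_ext  SELECT * FROM staging_events;
INSERT INTO TABLE hbase_events_ext SELECT * FROM staging_events;

-- Turning the switch back on allows both again; there is no
-- per-table or per-storage-handler setting in between.
SET hive.insert.into.external.tables=true;
```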
----- Original Message -----
From: "Ashutosh Chauhan" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Friday, June 1, 2012 10:24:24 AM
Subject: Re: Behavior of Hive 2837: insert into external tables should not be allowed
I understand your concern w.r.t. backward compatibility. But as Ed pointed out, there is a config variable, and by default the semantics are unchanged, so you can continue to insert into your external tables.
I have a question, though. Why are you creating all your tables as "external" tables? Why not regular tables?
On Thu, May 31, 2012 at 9:35 PM, Mark Grover < [EMAIL PROTECTED] > wrote:
I have a question regarding HIVE-2837
( https://issues.apache.org/jira/browse/HIVE-2837 ), which deals with
disallowing insert into queries on external tables.
From looking at the JIRA, it seems like it applies to external tables on
HDFS as well. Technically, insert into should be OK for external tables on
HDFS (and S3 as well). It seems like it should be up to the storage layer
(the file system) to specify whether insert into is supported and to implement it.
Historically, there hasn't been any real difference between creating an
external table on HDFS vs creating a managed one. However, if we disallow
insert into on external tables, that would mean that folks with external
tables on HDFS wouldn't be able to make use of insert into functionality
even though they should be able to. Do we want to allow insert into on HDFS
tables regardless of whether they are external or not?