Hive >> mail # user >> FROM INSERT after ADD COLUMN


yaboulna@... 2012-12-10, 00:35
Connell, Chuck 2012-12-10, 00:53
Shreepadma Venugopalan 2012-12-10, 00:53
yaboulna@... 2012-12-10, 01:02
Re: FROM INSERT after ADD COLUMN
I will reopen the subject a bit.

I don't know the details of the RCFile implementation in Hive, but if the
data were stored in a columnar layout like that, it would be theoretically
possible to add the column data without append and without rewriting the
whole file. Does anyone have more information on that matter?
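
To make the idea concrete, here is a toy sketch in Python, assuming a purely
hypothetical layout in which each column lives in its own storage unit. It is
not the RCFile format and uses no Hive API; `ColumnStore` and `add_column` are
names invented for illustration. It only shows why a column-per-unit layout
would let a new column be written without touching the existing ones.

```python
class ColumnStore:
    """Toy column-oriented table: one independent list per column (hypothetical layout)."""

    def __init__(self, columns):
        # columns: dict mapping column name -> list of values, all of equal length
        lengths = {len(v) for v in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same number of rows")
        self.columns = dict(columns)
        self.num_rows = lengths.pop() if lengths else 0

    def add_column(self, name, values):
        # Only the new column is written; existing column lists are left untouched.
        if name in self.columns:
            raise ValueError(f"column {name!r} already exists")
        if len(values) != self.num_rows:
            raise ValueError("new column must have one value per existing row")
        self.columns[name] = list(values)

    def row(self, i):
        # Reassemble one logical row by reading position i of every column.
        return {name: vals[i] for name, vals in self.columns.items()}


# Same shape as the example table: tab (a string, b bigint, c double), then add d binary.
store = ColumnStore({"a": ["x", "y"], "b": [1, 2], "c": [0.5, 1.5]})
store.add_column("d", [b"\x00", b"\x01"])
print(store.row(1))
```

Whether anything like this is possible in practice depends on how the real
file format groups rows and columns on disk, which is exactly the open question.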

Regards

Bertrand

On Mon, Dec 10, 2012 at 2:02 AM, <[EMAIL PROTECTED]> wrote:

> Hello Shreepadma,
>
> That's definitely very helpful. I suspected that this would be the case, but
> I was thinking that maybe there's a way to do it using a merge task. I will
> change my data structure to make it a bit like HBase, and I hope Hive will
> still be the right choice for me. It can be backed by HBase anyway :).
> Thank you very much, your quick reply saved me a lot of time!
>
> Sincerely,
> Younos
>
>
> Quoting Shreepadma Venugopalan <[EMAIL PROTECTED]>:
>
>> Hi Younos,
>>
>> Since HiveQL doesn't support an insert..values statement, you can't insert
>> values into a specific column. Let's assume your table had the following
>> structure before the alter table..add columns statement was executed,
>>
>> tab (a string, b bigint, c double)
>>
>> Furthermore, let's assume that it had 100 rows. Now, let's assume you did
>> an alter table tab add columns (d binary). The new table structure will
>> look like below,
>>
>> tab (a string, b bigint, c double, d binary)
>>
>> You can't insert binary data into the 100 rows that were present prior to
>> the alter table statement by executing a HiveQL statement. HiveQL doesn't
>> support an insert..values statement like most RDBMSs. However, you can
>> delete the existing files and add new files that contain records
>> corresponding to the new table structure. Alternatively, you can skip the
>> deletion step and just add new files that correspond to the new table
>> structure. When you execute a HiveQL query, null will be returned for
>> those columns for which the data doesn't exist.
>>
>> Hope this helps.
>>
>> Thanks.
>> Shreepadma
>>
>>
>> On Sun, Dec 9, 2012 at 4:35 PM, <[EMAIL PROTECTED]> wrote:
>>
>>> Hello,
>>>
>>> I couldn't find any example of how to populate columns that were added to
>>> a table. How would Hive tell which row each value of the newly added
>>> columns should be appended to? Does it do column name matching?
>>>
>>> Sincerely,
>>> Younos
>>>
>>>
>>>
>>>
>>>
>>
>
>
> Best regards,
> Younos Aboulnaga
>
> Masters candidate
> David Cheriton school of computer science
> University of Waterloo
> http://cs.uwaterloo.ca
>
> E-Mail: [EMAIL PROTECTED]
> Mobile: +1 (519) 497-5669
>
>
>
>
--
Bertrand Dechoux
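
Shreepadma's last point in the quoted reply, that rows written before the
alter table statement come back with null in the new column, can be
illustrated with a small Python simulation. This is a sketch of the
behaviour described there, not Hive code: `read_rows`, the tab-delimited row
format and the sample values are all assumptions made for the example.

```python
def read_rows(lines, schema, delimiter="\t"):
    """Parse delimited text rows against a schema, padding missing trailing columns with None."""
    rows = []
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        # Files written before the column was added have fewer fields than the schema.
        padded = fields + [None] * (len(schema) - len(fields))
        rows.append(dict(zip(schema, padded)))
    return rows


# Table structure after: alter table tab add columns (d binary)
schema = ["a", "b", "c", "d"]

old_file = ["foo\t1\t1.5"]          # written under tab (a, b, c)
new_file = ["bar\t2\t2.5\tblob"]    # written under tab (a, b, c, d)

for row in read_rows(old_file + new_file, schema):
    print(row)
```

The old row keeps its three original values and gets None (standing in for
null) for `d`, while the row from the new file carries a value for every column.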
Shreepadma Venugopalan 2012-12-10, 18:32
Shreepadma Venugopalan 2012-12-10, 18:36
yaboulna@... 2012-12-10, 20:01