Hive >> mail # user >> Change in serdeproperties does not update existing partitions


Re: Change in serdeproperties does not update existing partitions
Thanks, Ashutosh, for your answer. I actually use external tables so that I
don't lose my partitions' data.

This still seems like odd behavior to me, and I don't see why someone would
expect it. Whenever I need to add a column to a table (my table here represents
a log, and it is common to add fields to logs), I need to drop all partitions
and recreate them. How do people handle this in general?
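
For reference, the drop-and-recreate cycle described here can be sketched in HiveQL as follows. This is only a sketch: it reuses the test_table example quoted further down, assumes the table is external (so dropping a partition removes metadata only), and relies on the behavior Ashutosh describes, where a newly added partition inherits the table's current serdeproperties. The LOCATION path in the comment is a hypothetical placeholder.

```sql
-- Sketch only: table and partition names come from the example in this thread.
-- On an EXTERNAL table, dropping a partition removes its metadata;
-- the underlying files are left in place.
alter table test_table drop partition (day='2011-09-01');

-- Re-adding the partition creates fresh metadata, which (per the reply
-- quoted in this thread) inherits the table's current serdeproperties.
-- If the partition was originally added with an explicit location, the same
-- clause has to be supplied again, e.g. location '/hypothetical/path/day=2011-09-01'.
alter table test_table add partition (day='2011-09-01');

-- Inspect the partition metadata to check which 'input.regex' it now carries.
describe extended test_table partition (day='2011-09-01');
```

With many partitions, the same pair of statements has to be repeated once per partition value.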

Do you have a use case where people want to alter a table and not update
existing partitions? Is it so that if your file format evolves you don't
have to convert the whole history?

Best,
Maxime

On Tue, Sep 13, 2011 at 7:03 PM, Ashutosh Chauhan <[EMAIL PROTECTED]> wrote:

> Hey Maxime,
>
> Yeah, that's intended behavior. After you alter a table, all subsequent
> actions on the table and its partitions will inherit from it. If you want to
> modify properties of already existing partitions, you should be able to do
> something like 'alter table test_table partition (day='2011-09-02') set
> serdeproperties ('input.regex' = '(.*)')'. Unfortunately, this is not
> supported currently. Feel free to file a bug for that.
>
> A workaround (applicable only because you are using an external table) is to
> drop the partitions and then add them again. When you drop a partition from
> an external table, only the metadata gets wiped out; the data is not deleted.
> So when you add the partition again, it will inherit the table's serde
> properties and you will get what you are looking for. Use this workaround
> with care: you don't want to lose your data while recreating partitions.
>
> Hope it helps,
> Ashutosh
>
> On Tue, Sep 13, 2011 at 06:03, Maxime Brugidou <[EMAIL PROTECTED]> wrote:
>
>> Hello,
>>
>> I am using Hive 0.7 from Cloudera CDH3u0 and I am seeing strange
>> behavior when I update the serdeproperties of a table (for example, for the
>> RegexSerDe).
>>
>> If you have a simple partitioned table like
>>
>> create external table test_table (
>>     id int)
>> partitioned by (day string)
>> row format serde 'org.apache.hadoop.contrib.serde2.RegexSerDe'
>> with serdeproperties (
>>     'input.regex' = '.* ([^ ]*)'
>> );
>>
>> alter table test_table add partition (day='2011-09-01');
>>
>> alter table test_table set serdeproperties  (
>>     'input.regex' = '(.*)'
>> );
>>
>> alter table test_table add partition (day='2011-09-02');
>>
>>
>> The first partition will still use the older regex and the new one will
>> use the new regex. Is this intended behavior? Why?
>>
>> Thanks for your help,
>> Maxime
>>
>>
>
+
Ashutosh Chauhan 2011-09-14, 11:45