Maxime Brugidou 2011-09-13, 13:03
Ashutosh Chauhan 2011-09-13, 17:03
Re: Change in serdeproperties does not update existing partitions
Thanks Ashutosh for your answer. I actually use external tables so that I
don't drop my partition data.
This still seems like odd behavior to me, and I don't see why anyone would
expect it. Whenever I need to add a column to a table (my table here
represents a log, and it is common to add fields to logs), I need to drop all
partitions and recreate them. How do people generally handle this?
Do you have a use case where people want to alter a table and not update
existing partitions? Is it so that if your file format evolves you don't
have to convert the whole history?
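For concreteness, the drop-and-recreate cycle I am describing might look like
the sketch below for a single partition. It reuses the test_table and day
names from my example quoted further down, and it assumes the partition lives
in the table's default partition directory; a partition that was added with an
explicit LOCATION would need the same LOCATION clause when it is re-added.

```sql
-- Sketch only: assumes test_table is EXTERNAL, so dropping a partition
-- removes its metadata and leaves the underlying files in place.

-- Optional: inspect the partition's SerDe parameters before and after.
DESCRIBE EXTENDED test_table PARTITION (day='2011-09-01');

-- Drop the stale partition metadata ...
ALTER TABLE test_table DROP PARTITION (day='2011-09-01');

-- ... and re-add it so that it inherits the table's current serdeproperties.
-- Assumption: the data sits in the default partition directory; otherwise
-- a LOCATION clause pointing at the existing data is needed here.
ALTER TABLE test_table ADD PARTITION (day='2011-09-01');
```

This has to be repeated for every existing partition, which is why it becomes
tedious on a table with a long history.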
On Tue, Sep 13, 2011 at 7:03 PM, Ashutosh Chauhan <[EMAIL PROTECTED]> wrote:
> Hey Maxime,
> Yeah, that's intended behavior. After you alter a table, all subsequent
> actions on the table and its partitions will inherit from it. If you want to
> modify properties of already existing partitions, you should be able to do
> something like 'alter table test_table partition (day='2011-09-02') set
> serdeproperties ('input.regex' = '(.*)')'. Unfortunately, this is not
> supported currently. Feel free to file a bug for that.
> A workaround (applicable only because you are using an external table) is to
> drop the partitions and then add them again. When you drop a partition from
> an external table, only the metadata gets wiped out; the data is not deleted.
> So when you add the partition again, it will inherit the table's serde
> properties and you will get what you are looking for. Use this workaround
> with care: you don't want to lose your data while recreating partitions.
> Hope it helps,
> On Tue, Sep 13, 2011 at 06:03, Maxime Brugidou <[EMAIL PROTECTED]> wrote:
>> I am using Hive 0.7 from cloudera cdh3u0 and I encounter a strange
>> behavior when I update the serdeproperties of a table (for example for the
>> If you have a simple partitioned table like
>> create external table test_table (
>>   id int)
>> partitioned by (day string)
>> row format serde 'org.apache.hadoop.contrib.serde2.RegexSerDe'
>> with serdeproperties (
>>   'input.regex' = '.* ([^ ]*)'
>> );
>> alter table test_table add partition (day='2011-09-01');
>> alter table test_table set serdeproperties (
>>   'input.regex' = '(.*)'
>> );
>> alter table test_table add partition (day='2011-09-02');
>> The first partition will still use the older regex and the new one will
>> use the new regex. Is this intended behavior? Why?
>> Thanks for your help,
Ashutosh Chauhan 2011-09-14, 11:45