Hive >> mail # user >> Change in serdeproperties does not update existing partitions


Re: Change in serdeproperties does not update existing partitions
Hey Maxime,

Looks like there is some confusion here. You do not need to recreate
partitions every time you change something about the table. For example, if
you are adding new columns, you can just do alter table ... add columns ...
and then alter table ... add partition ...; you do not need to do anything
about the existing partitions in that case and things will work fine. What I
suggested was a workaround for the missing functionality of changing the
serdeproperties of an existing partition. Ideally that should be possible,
but currently the feature is not there.
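
A minimal sketch of that workaround, reusing test_table and the
day='2011-09-01' partition from your example. It assumes the table is
external and that the partition was added without an explicit location; if
it was added with one, the add partition statement would need the same
location clause again.

```sql
-- Workaround sketch: only for EXTERNAL tables, where dropping a partition
-- removes the metadata and leaves the data files in place.
alter table test_table drop partition (day='2011-09-01');

-- Re-adding the partition lets it inherit the table's current serdeproperties.
-- Assumption: the partition sits at its default location under the table directory.
alter table test_table add partition (day='2011-09-01');

-- Optional check: inspect the partition metadata, including its serde parameters.
describe extended test_table partition (day='2011-09-01');
```

It is worth trying this on a single partition first and confirming that a
select still returns the expected rows before repeating it for the rest.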

Hope it helps,
Ashutosh

On Tue, Sep 13, 2011 at 11:48, Maxime Brugidou <[EMAIL PROTECTED]> wrote:

> Thanks Ashutosh for your answer. I actually use external tables so that I
> don't drop my partitions' data.
>
> This is still odd behavior to me and I don't get why someone would
> expect it. Whenever I need to add a column to a table (my table here
> represents a log, and it is common to add fields to logs), I need to drop all
> partitions and recreate them. How do people generally handle this?
>
> Do you have a use case where people want to alter a table and not update
> existing partitions? Is it so that if your file format evolves you don't
> have to convert the whole history?
>
> Best,
> Maxime
>
> On Tue, Sep 13, 2011 at 7:03 PM, Ashutosh Chauhan <[EMAIL PROTECTED]> wrote:
>
>> Hey Maxime,
>>
>> Yeah, that's intended behavior. After you alter a table, all subsequent
>> actions on the table and its partitions will inherit from it. If you want
>> to modify properties of already existing partitions, you should be able to
>> do something like 'alter table test_table partition (day='2011-09-02') set
>> serdeproperties ('input.regex' = '(.*)')'. Unfortunately this is not
>> supported currently. Feel free to file a bug for that.
>>
>> A workaround (applicable only because you are using an external table) is
>> to drop the partitions and then add them again. When you drop a partition
>> from an external table, only the metadata gets wiped out; the data is not
>> deleted. So when you add the partition again, it will inherit the table's
>> serde properties and you will get what you are looking for. Use this
>> workaround with care: you don't want to lose your data while recreating
>> partitions.
>>
>> Hope it helps,
>> Ashutosh
>>
>> On Tue, Sep 13, 2011 at 06:03, Maxime Brugidou <[EMAIL PROTECTED]> wrote:
>>
>>> Hello,
>>>
>>> I am using Hive 0.7 from Cloudera CDH3u0 and I am seeing strange
>>> behavior when I update the serdeproperties of a table (for example for the
>>> RegexSerDe).
>>>
>>> If you have a simple partitioned table like
>>>
>>> create external table test_table (
>>>     id int)
>>> partitioned by (day string)
>>> row format serde 'org.apache.hadoop.contrib.serde2.RegexSerDe'
>>> with serdeproperties (
>>>     'input.regex' = '.* ([^ ]*)'
>>> );
>>>
>>> alter table test_table add partition (day='2011-09-01');
>>>
>>> alter table test_table set serdeproperties  (
>>>     'input.regex' = '(.*)'
>>> );
>>>
>>> alter table test_table add partition (day='2011-09-02');
>>>
>>>
>>> The first partition will still use the older regex and the new one will
>>> use the new regex. Is this intended behavior? Why?
>>>
>>> Thanks for your help,
>>> Maxime
>>>
>>>
>>
>