Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive, mail # user - updating RegexSerde on existing partitions


Copy link to this message
-
Re: updating RegexSerde on existing partitions
Viral Bajaria 2013-01-25, 20:21
Well the NPE was a mistake on my part. I was over-escaping the regex and
that was causing the NPE. The test that I ran was using the CLI, while the
update of regex was in code.

Regarding updating the regex for old partitions, I have not come across any
better way besides running it over all partitions. I was going to setup a
test instance and see if I could just update the metastore tables directly
since that will be faster than running ALTER TABLE one by one over 1000's
of partitions. Still debating though. I will let you know how it goes.

Thanks,
Viral

On Fri, Jan 25, 2013 at 11:38 AM, Mark Grover
<[EMAIL PROTECTED]>wrote:

> Viral,
> To me, I think the only option seems to be here to alter all the existing
> partitions.
>
> I would be interested in knowing what you ended up finding/doing.
>
> Thanks,
> Mark
>
>
> On Sun, Jan 20, 2013 at 3:58 PM, Viral Bajaria <[EMAIL PROTECTED]>wrote:
>
>> Hi,
>>
>> I have a hive table which has pre-defined schema and I use RegexSerde to
>> read data from the underlying files. I wanted to add a new column to this
>> table and so after running the ALTER TABLE command, I updated the
>> 'input.regex' property for SERDEPROPERTIES. This did not help in any way.
>> The newly added column always returned null data. I figured the issue must
>> be because the past partitions are each tagged with their own SerDe
>> definition in PARTITIONS table. So I went ahead and loaded a new file and
>> saw that the new file did pick up the updated regex definition but when I
>> run a query against this new partition I always get a NullPointerException
>> with no additional information.
>>
>> I was hoping I didn't have to run a ALTER TABLE on each partition to
>> update the regex property but maybe that's the only possible solution. But
>> before I go ahead and do that, I want to make sure my whole table will not
>> break with NPE's given that newly added partitions cannot be read with the
>> updated regex definition.
>>
>> I have already tested the regex is fine by creating a temp table with
>> that regex definition and loading the same file into it and I was able to
>> query with no issues.
>>
>> Anyone faced this issue before ? Any suggestions ? Or once defined it's
>> impossible to change RegexSerde tables ?
>>
>> Thanks,
>> Viral
>>
>
>