updating RegexSerde on existing partitions

I have a Hive table with a predefined schema, and I use RegexSerDe to read
data from the underlying files. I wanted to add a new column to this table,
so after running the ALTER TABLE command to add the column I updated the
'input.regex' property in SERDEPROPERTIES. This did not help: the newly
added column always returned NULL. I figured the cause must be that each
existing partition is tagged with its own SerDe definition in the
metastore's PARTITIONS table. So I loaded a new file into a new partition
and saw that it did pick up the updated regex definition, but when I run a
query against this new partition I always get a NullPointerException with
no additional information.

I was hoping I wouldn't have to run an ALTER TABLE on each partition to
update the regex property, but maybe that's the only possible solution.
Before I go ahead and do that, I want to make sure my whole table will not
break with NPEs, given that newly added partitions cannot be read with the
updated regex definition.
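
If a per-partition ALTER TABLE does turn out to be necessary, the
statements could be generated rather than typed by hand. Below is a minimal
Python sketch of that idea. It assumes partition names in the
key=value/key=value form printed by SHOW PARTITIONS, with no escaped
special characters in the values, and it quotes every partition value as a
string. The table name, partition names, and regex in the usage part are
hypothetical placeholders, not my actual definitions.

```python
def quote_hive_string(value):
    """Quote a value as a single-quoted Hive string literal.

    Assumption: backslash is the escape character inside the literal,
    so backslashes and single quotes are both escaped with a backslash.
    """
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def partition_spec(partition_name):
    """Convert 'dt=2012-01-01/hour=05' into "dt='2012-01-01', hour='05'"."""
    parts = []
    for piece in partition_name.strip().split("/"):
        key, _, value = piece.partition("=")
        parts.append(key + "=" + quote_hive_string(value))
    return ", ".join(parts)


def alter_statements(table, partition_names, regex):
    """Yield one ALTER TABLE ... SET SERDEPROPERTIES statement per partition."""
    for name in partition_names:
        if not name.strip():
            continue  # skip blank lines in the partition listing
        yield (
            "ALTER TABLE " + table
            + " PARTITION (" + partition_spec(name) + ")"
            + " SET SERDEPROPERTIES ('input.regex' = "
            + quote_hive_string(regex) + ");"
        )


if __name__ == "__main__":
    # Hypothetical table, partitions, and regex, for illustration only.
    partitions = ["dt=2012-01-01/hour=05", "dt=2012-01-01/hour=06"]
    for statement in alter_statements("my_logs", partitions, r"(\d+) (.*)"):
        print(statement)
```

The printed statements could be saved to a file and reviewed before being
run with hive -f, ideally against a single partition first.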

I have already verified that the regex itself is fine: I created a temp
table with that regex definition, loaded the same file into it, and was
able to query it with no issues.

Has anyone faced this issue before? Any suggestions? Or is it impossible
to change a RegexSerDe table once it has been defined?