Hive, mail # dev - Review Request 20096: HIVE-6835: Reading of partitioned Avro data fails if partition schema does not match table schema


Re: Review Request 20096: HIVE-6835: Reading of partitioned Avro data fails if partition schema does not match table schema
Anthony Hsu 2014-04-24, 17:43

This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20096/

(Updated April 24, 2014, 5:42 p.m.)
Review request for hive.
Changes

Removed PartitionDesc.getOverlayedProperties(). Created a new method SerDeUtils.createOverlayedProperties(). Changed behavior of SerDeUtils.initializeSerDe() and AbstractSerDe.initialize() to use the new SerDeUtils.createOverlayedProperties() method.
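As a rough illustration of what "overlaying" properties could mean here, the sketch below copies the table properties into a fresh Properties object and then lets any partition properties override them. This merge order is an assumption inferred from the method name, not taken from the patch; OverlaySketch and overlay() are hypothetical names, and the real SerDeUtils.createOverlayedProperties() may differ.

```java
import java.util.Properties;

// Hypothetical stand-in, NOT the actual Hive SerDeUtils code.
final class OverlaySketch {

    // Assumed semantics: start from the table properties, then let
    // partition properties (when present) override matching keys.
    static Properties overlay(Properties tableProps, Properties partitionProps) {
        Properties merged = new Properties();
        merged.putAll(tableProps);
        if (partitionProps != null) {
            merged.putAll(partitionProps);
        }
        return merged;
    }
}
```

Under this assumption a key set only on the table survives the merge, while a key set on both takes the partition's value.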
Repository: hive-git
Description

The problem occurs when you store "avro.schema.(literal|url)" in the SERDEPROPERTIES instead of the TBLPROPERTIES, add a partition, change the table's schema, and then try to read from the old partition.

I fixed this problem by adding a new initialize() method to AbstractSerDe that takes both table properties and partition properties. The default implementation of this new method uses the partition properties if they are not null and the table properties otherwise. I then overrode the new initialize() method in AvroSerDe so that it always uses the table properties. I also added a helper method that takes a Deserializer and calls the new initialize() method whenever the Deserializer is an instance of AbstractSerDe. I then had to change all calls to Deserializer.initialize() to use my helper method instead.
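The dispatch described above can be sketched roughly as follows. This is a minimal, self-contained illustration using simplified stand-in types, not the actual patch: the names SerDeInitSketch, GenericSerDe, AvroLikeSerDe, and initializeDeserializer are hypothetical, the method signatures are reduced to Properties arguments only, and the fallback for a Deserializer that is not an AbstractSerDe is an assumption.

```java
import java.util.Properties;

// Hypothetical stand-ins; these are NOT the real Hive classes.
final class SerDeInitSketch {

    // Stand-in for the Deserializer interface (single-argument initialize).
    interface Deserializer {
        void initialize(Properties props);
    }

    // Stand-in for AbstractSerDe with the new two-argument initialize().
    static abstract class AbstractSerDe implements Deserializer {
        // Default: use partition properties when non-null, else table properties.
        void initialize(Properties tableProps, Properties partitionProps) {
            initialize(partitionProps != null ? partitionProps : tableProps);
        }
    }

    // A SerDe that keeps the default behavior.
    static class GenericSerDe extends AbstractSerDe {
        Properties used;

        @Override
        public void initialize(Properties props) {
            used = props;
        }
    }

    // A SerDe that, like the AvroSerDe in the description, always uses
    // the table properties.
    static class AvroLikeSerDe extends AbstractSerDe {
        Properties used;

        @Override
        public void initialize(Properties props) {
            used = props;
        }

        @Override
        void initialize(Properties tableProps, Properties partitionProps) {
            initialize(tableProps);
        }
    }

    // Helper that call sites use instead of calling initialize() directly.
    static void initializeDeserializer(Deserializer d,
                                       Properties tableProps,
                                       Properties partitionProps) {
        if (d instanceof AbstractSerDe) {
            ((AbstractSerDe) d).initialize(tableProps, partitionProps);
        } else {
            // Assumption: other deserializers get the same default choice.
            d.initialize(partitionProps != null ? partitionProps : tableProps);
        }
    }
}
```

With this shape, a call such as SerDeInitSketch.initializeDeserializer(serde, tableProps, partitionProps) leaves the choice of properties to the SerDe itself, which is why only the Avro-like class needs an override.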
Diffs (updated)

  contrib/src/java/org/apache/hadoop/hive/contrib/serde2/s3/S3LogDeserializer.java 69b618b
  contrib/src/test/org/apache/hadoop/hive/contrib/serde2/TestRegexSerDe.java 394ce3f
  hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java 089a31a
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InternalUtil.java fb650dd
  hcatalog/core/src/test/java/org/apache/hive/hcatalog/data/TestHCatRecordSerDe.java e84b789
  hcatalog/core/src/test/java/org/apache/hive/hcatalog/data/TestJsonSerDe.java c1d170a
  hcatalog/core/src/test/java/org/apache/hive/hcatalog/rcfile/TestRCFileMapReduceInputFormat.java 9dde771
  hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/DelimitedInputWriter.java 7ba6bb8
  hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/StrictJsonWriter.java 9b26550
  jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveQueryResultSet.java 3215178
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 1bbe02e
  ql/src/java/org/apache/hadoop/hive/ql/exec/DefaultFetchFormatter.java 25385ba
  ql/src/java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java b0b0925
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 6daf199
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableDummyOperator.java e00b7d3
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java c8003f5
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 80ccf5a
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 055d13e
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java 2416948
  ql/src/java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java 1354b36
  ql/src/java/org/apache/hadoop/hive/ql/exec/SkewJoinHandler.java 3bf58f6
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java c52a093
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java 2ef79d4
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordProcessor.java 0e4bdff
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java 49b8da1
  ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java f339651
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 77305ff
  ql/src/java/org/apache/hadoop/hive/ql/plan/PTFDeserializer.java 3a258e4
  ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 43cef5c
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableDesc.java 6144303
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java 755d783
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestPTFRowContainer.java cea3529
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/Utilities.java 4fc613e
  ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizedRowBatchCtx.java 7f3cb15
  ql/src/test/org/apache/hadoop/hive/ql/io/TestRCFile.java 5edd265
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 5664f3f
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java be518b9
  ql/src/test/queries/clientpositive/avro_partitioned.q 6fe5117
  ql/src/test/results/clientpositive/avro_partitioned.q.out 644716d
  serde/src/java/org/apache/hadoop/hive/serde2/AbstractSerDe.java 1ab15a8
  serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java d226d21
  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java 55bfa2e
  serde/src/test/org/apache/hadoop/hive/serde2/TestStatsSerde.java 9aa3c45
  serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerde.java a5d494f
  serde/src/test/org/apache/hadoop/hive/serde2/binarysortable/TestBinarySortableSerDe.java e512f42
  serde/src/test/org/apache/hadoop/hive/serde2/columnar/TestLazyBinaryColumnarSerDe.java e8639ff
  serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyArrayMapStruct.java 714045b
  serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazySimpleSerDe.java 28eb868
  serde/src/test/org/apache/hadoop/hive/serde2/lazybinary/TestLazyBinarySerDe.java 69c891d
  serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestCrossMapEqualComparer.java a69fcb7
  serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestSimpleMapEqualComparer.java dd9610e
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 2a113d5

Diff: https://reviews.apache.org/r/20096/diff/
Testing

Added test cases
Thanks,

Anthony Hsu