Hive, mail # dev - Re: Review Request 11854: HIVE-3253- ArrayIndexOutOfBounds exception for deeply nested structs


Re: Review Request 11854: HIVE-3253- ArrayIndexOutOfBounds exception for deeply nested structs
Thejas Nair 2013-07-02, 23:15

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11854/
-----------------------------------------------------------

(Updated July 2, 2013, 11:15 p.m.)
Review request for hive.
Changes
-------

Updates q.out files for 0.23
Bugs: HIVE-3253
    https://issues.apache.org/jira/browse/HIVE-3253
Repository: hive-git
Description
-------

(Patch description, from the JIRA comment.)
It increases the number of control characters used as delimiters by LazySimpleSerDe, avoiding characters that are likely to be present in data. Using new control characters is not a backward-compatible change, so you need to set the serde property hive.serialization.extend.nesting.levels to enable it for a table that uses LazySimpleSerDe. If your input table has data that might contain these delimiter control characters, you should escape them and set the escape character using a serde property.
Example:
create table nestedcomplex (
simple_int int,
max_nested_array  array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<int>>>>>>>>>>>>>>>>>>>>>>>)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (  'hive.serialization.extend.nesting.levels'='true'
)
;
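For a table that already exists, the same property could presumably be set after the fact. The statement below is only a sketch: it reuses the nestedcomplex table from the example above, and it assumes that the escape serde property referred to in the description is LazySimpleSerDe's escape.delim and that both properties take effect when set through ALTER TABLE.

```sql
-- Sketch only: enable the extended nesting levels on an existing table and
-- declare backslash as the escape character (assumed property: escape.delim).
ALTER TABLE nestedcomplex SET SERDEPROPERTIES (
  'hive.serialization.extend.nesting.levels'='true',
  'escape.delim'='\\'
);
```

Data already written with unescaped delimiter characters would still need to be checked before relying on this.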
LazySimpleSerDe is used by FileSinkOperator, which is why it was limited by the number of nesting levels the serde supports. We should look at using LazyBinarySerDe here, as it would be more efficient and can go beyond this nesting-level restriction.
The LazySimpleSerDe used in FileSinkOperator has escaping enabled, so it is safe to extend the levels of nesting using the new serde property for that use case.
The patch also gives a better error message when the nesting depth exceeds the maximum supported level (no longer an ArrayIndexOutOfBounds exception).
Diffs (updated)
-----

  data/files/nested_complex.txt PRE-CREATION
  hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestLazyHBaseObject.java 3bd0919
  ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java 04921d5
  ql/src/test/queries/clientnegative/nested_complex_neg.q PRE-CREATION
  ql/src/test/queries/clientpositive/nested_complex.q PRE-CREATION
  ql/src/test/results/clientnegative/nested_complex_neg.q.out PRE-CREATION
  ql/src/test/results/clientpositive/alter_partition_coltype.q.out d9c48aa
  ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 492be3a
  ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out 7ed2448
  ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 5b49c35
  ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 1b585bf
  ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out c5315fb
  ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out a9ab616
  ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out 7c4558f
  ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out fc2ffc5
  ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 3df0ca8
  ql/src/test/results/clientpositive/bucket_map_join_1.q.out 56131b0
  ql/src/test/results/clientpositive/bucket_map_join_2.q.out 1e7bea5
  ql/src/test/results/clientpositive/bucketcontext_1.q.out 43e34ce
  ql/src/test/results/clientpositive/bucketcontext_2.q.out ab44de5
  ql/src/test/results/clientpositive/bucketcontext_3.q.out 592765a
  ql/src/test/results/clientpositive/bucketcontext_4.q.out 6fc94a7
  ql/src/test/results/clientpositive/bucketcontext_5.q.out 8eb9a71
  ql/src/test/results/clientpositive/bucketcontext_6.q.out 8271292
  ql/src/test/results/clientpositive/bucketcontext_7.q.out db9bb1d
  ql/src/test/results/clientpositive/bucketcontext_8.q.out 21b5dc5
  ql/src/test/results/clientpositive/bucketmapjoin1.q.out 4bbd35f
  ql/src/test/results/clientpositive/bucketmapjoin10.q.out 3466e6d
  ql/src/test/results/clientpositive/bucketmapjoin11.q.out 1c12c09
  ql/src/test/results/clientpositive/bucketmapjoin12.q.out abf9783
  ql/src/test/results/clientpositive/bucketmapjoin13.q.out 870cb35
  ql/src/test/results/clientpositive/bucketmapjoin7.q.out b8ba7c0
  ql/src/test/results/clientpositive/bucketmapjoin8.q.out 2a5a5d5
  ql/src/test/results/clientpositive/bucketmapjoin9.q.out c2db270
  ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out 2230fd1
  ql/src/test/results/clientpositive/columnstats_partlvl.q.out 2c32730
  ql/src/test/results/clientpositive/columnstats_tbllvl.q.out 007bc31
  ql/src/test/results/clientpositive/combine2.q.out 1d51def
  ql/src/test/results/clientpositive/combine2_hadoop20.q.out 1ef67f4
  ql/src/test/results/clientpositive/filter_join_breaktask.q.out 52bac6a
  ql/src/test/results/clientpositive/groupby_sort_1.q.out e6f3a7a
  ql/src/test/results/clientpositive/groupby_sort_skew_1.q.out b7ca0ee
  ql/src/test/results/clientpositive/input23.q.out f71a43f
  ql/src/test/results/clientpositive/input42.q.out 67679af
  ql/src/test/results/clientpositive/input_part7.q.out 538a742
  ql/src/test/results/clientpositive/input_part9.q.out 91d1794
  ql/src/test/results/clientpositive/join_filters_overlap.q.out 4f79d38
  ql/src/test/results/clientpositive/list_bucket_dml_1.q.out 7d15a6c
  ql/src/test/results/clientpositive/list_bucket_dml_11.q.out d631b14
  ql/src/test/results/clientpositive/list_bucket_dml_12.q.out 343798d
  ql/src/test/results/clientpositive/list_bucket_dml_13.q.out 3a896fd
  ql/src/test/results/clientpositive/list_bucket_dml_2.q.out e95e05f
  ql/src/test/results/clientpositive/list_bucket_dml_3.q.out a197c8f
  ql/src/test/results/clientpositive/list_bucket_dml_4.q.out 795e2fc
  ql/src/test/results/clientpositive/list_bucket_dml_5.q.out acf0b69
  ql/src/test/results/clientpositive/list_bucket_dml_6.q.out 3d547dd
  ql/src/test/results/clientpositive/list_bucket_dml_7.q.out 8f39c7e
  ql/src/test/results/clientpositive/list_bucket_dml_8.q.out 8f9c0b2
  q