ROW FORMAT for nested structure
Hello,
My data file has 3 fields with '\t' as the field delimiter. Here is a
sample record.

1\t100;k1=v1|k2=v2,200;k3=v3|k4=v4\ttextfield

Field 1: an integer field (value 1 in the example above).
Field 2: an array of structures; the array items are separated by ','.
         Each structure has 2 elements, an integer and a map, with ';' as
         the separator between them. The map is a list of key/value pairs,
         with '=' separating key and value and '|' separating the pairs.
Field 3: a string field ("textfield" in this example).

How should I define the ROW FORMAT in the CREATE TABLE statement?

CREATE TABLE test_t1 (
Fld1 INT,
Fld2 ARRAY <STRUCT <col1:INT, col2:map<STRING,STRING>>>,
Fld3 string)
ROW FORMAT ?????
STORED AS TEXTFILE

The expected values are:
   Fld1 = 1
   Fld2[0].col1 = 100, Fld2[0].col2 = {"k1"="v1", "k2"="v2"}
   Fld2[1].col1 = 200, Fld2[1].col2 = {"k3"="v3", "k4"="v4"}
   Fld3 = "textfield"
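
To make the intended nesting concrete, here is a minimal Python sketch that parses the sample record the way I expect it to be read. It is plain Python, not Hive, and says nothing about how a Hive SerDe would treat these delimiters. It assumes the second structure is written as 200;k3=v3|k4=v4, with ';' between the integer and the map; parse_record is just an illustrative name.

```python
def parse_record(line):
    """Split one record into (Fld1, Fld2, Fld3) using the delimiters described above."""
    # Top level: three tab-separated fields.
    fld1, fld2, fld3 = line.rstrip("\n").split("\t")

    structs = []
    for item in fld2.split(","):             # ',' separates the array items
        col1, col2 = item.split(";", 1)      # ';' separates the struct elements
        # '|' separates the map pairs, '=' separates key from value.
        pairs = dict(kv.split("=", 1) for kv in col2.split("|"))
        structs.append({"col1": int(col1), "col2": pairs})

    return int(fld1), structs, fld3


record = "1\t100;k1=v1|k2=v2,200;k3=v3|k4=v4\ttextfield"
fld1, fld2, fld3 = parse_record(record)
print(fld1, fld2, fld3)
```

The four delimiters (',', ';', '|', '=') sit at four different nesting levels below the tab, which is the part I do not know how to express in ROW FORMAT.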
Thanks,
Sadu