Hive >> mail # user >> ROW FORMAT for nested structure


ROW FORMAT for nested structure
Hello,
My data file has 3 fields with '\t' as the field delimiter. Here is a
sample record.

1\t100;k1=v1|k2=v2,200;k3=v3|k4=v4\ttextfield

Field 1: an integer (1 in the example above)
Field 2: an array of structs, with the array items separated by ','
         Each struct has 2 elements, an integer and a map, separated by ';'
         The map is a list of key/value pairs, with '=' separating each key
         from its value and '|' separating the pairs
Field 3: a string ("textfield" in this example)
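
To make the intended layout concrete, here is a minimal Python sketch of how the record would be split by hand. It is only an illustration of the delimiters described above, not Hive's own parsing logic. The function name `parse_record` is hypothetical, and the sample string assumes the second array item uses ';' between its integer and its map, as the field description states.

```python
def parse_record(line):
    """Split one tab-delimited record into (int, list of structs, str)."""
    fld1, fld2, fld3 = line.split("\t")          # top-level fields
    items = []
    for item in fld2.split(","):                 # array items
        col1, col2 = item.split(";")             # struct elements
        # map entries are separated by '|', key and value by '='
        pairs = dict(kv.split("=") for kv in col2.split("|"))
        items.append({"col1": int(col1), "col2": pairs})
    return int(fld1), items, fld3


record = "1\t100;k1=v1|k2=v2,200;k3=v3|k4=v4\ttextfield"
fld1, fld2, fld3 = parse_record(record)
```

Four delimiter levels are involved here: tab, ',', ';', and then '|' with '='.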

How should I define the ROW FORMAT in the CREATE TABLE statement?

CREATE TABLE test_t1 (
Fld1 INT,
Fld2 ARRAY <STRUCT <col1:INT, col2:map<STRING,STRING>>>,
Fld3 string)
ROW FORMAT ?????
STORED AS TEXTFILE

The expected values are:
   Fld1 = 1
   Fld2[0].col1 = 100, Fld2[0].col2 = {"k1"="v1", "k2"="v2"}, Fld2[1].col1 = 200, Fld2[1].col2 = {"k3"="v3", "k4"="v4"}
   Fld3 = "textfield"
Thanks,
Sadu