Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Plain View
Hive >> mail # user >> Any advice about complex Hive tables?


+
Connell, Chuck 2012-10-08, 20:37
+
Sadananda Hegde 2012-10-12, 03:46
+
Connell, Chuck 2012-10-12, 14:17
+
Sadananda Hegde 2012-10-14, 00:31
+
Connell, Chuck 2012-10-14, 00:50
Copy link to this message
-
Re: Any advice about complex Hive tables?
Thanks, Chuck.

On Sat, Oct 13, 2012 at 7:50 PM, Connell, Chuck <[EMAIL PROTECTED]>wrote:

>  You can get the jar from the "Downloads" link on the serde's home page.
> It is on the right side of the page.
>
>
>  ------------------------------
> *From:* Sadananda Hegde [[EMAIL PROTECTED]]
> *Sent:* Saturday, October 13, 2012 8:31 PM
> *To:* [EMAIL PROTECTED]
>
> *Subject:* Re: Any advice about complex Hive tables?
>
>   Thanks, Chuck.
>
>
> But I am getting the following error when I execute the CREATE TABLE statement.
>
>  'FAILED: Error in metadata: Cannot validate serde: org.openx.data.jsonserde.JsonSerDe"
>
>
> Am I supposed to down load / add any jar file to hive server?
>
> Thanks for your help.
>
> Sadu
>
>  On Fri, Oct 12, 2012 at 9:17 AM, Connell, Chuck <[EMAIL PROTECTED]
> > wrote:
>
>>  Sadu,****
>>
>> ** **
>>
>> I am using JSON as the input format, with the JSON SerDe from
>> https://github.com/rcongiu/Hive-JSON-Serde. ****
>>
>> ** **
>>
>> A sample JSON record is:  (in actual use each JSON record must be on one
>> line only).****
>>
>> ** **
>>
>> {****
>>
>> "field1":"hello",****
>>
>> "field2":123456,****
>>
>> "field3":1234.5678,****
>>
>> "field4":true,****
>>
>> "field5":{"field5a":"embedded 1", "field5b":44, "field5c":4.44,
>> "field5d":false, "field5e":[12,13,14]},****
>>
>> "field6":[2, 3, 4, 5, 6],****
>>
>> "field7":[4.33, 5.33, 6.33],****
>>
>> "field8":["one", "two", "three"],****
>>
>> "field9":[[1,2,3], [4,5], [6,7,8]],****
>>
>> "field10":[["smith","jones"], ["bob", "bill"]],****
>>
>> "field11":[{"f11a":"one", "f11b":"two"}, {"f11a":"three", "f11b":"four"}]
>> ****
>>
>> }****
>>
>> ** **
>>
>> My table definition is:  (ignore the fact that the fields are listed out
>> of order, this does not matter)****
>>
>> ** **
>>
>> CREATE TABLE tt1****
>>
>> (****
>>
>> field8 ARRAY<STRING>,****
>>
>> field9 ARRAY<ARRAY<INT>>,****
>>
>> field2 INT,****
>>
>> field3 DOUBLE,****
>>
>> field1 STRING,****
>>
>> field6 ARRAY<INT>,****
>>
>> field7 ARRAY<DOUBLE>,****
>>
>> field4 BOOLEAN,****
>>
>> field5 STRUCT****
>>
>> <** **
>>
>> field5d:BOOLEAN,****
>>
>> field5e:ARRAY<INT>,****
>>
>> field5a:STRING,****
>>
>> field5b:INT,****
>>
>> field5c:DOUBLE****
>>
>> >,****
>>
>> field10 ARRAY<ARRAY<STRING>>,****
>>
>> field11 ARRAY<STRUCT****
>>
>> <** **
>>
>> f11a:STRING,****
>>
>> f11b:STRING****
>>
>> >>** **
>>
>> )****
>>
>> ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'****
>>
>> WITH SERDEPROPERTIES ("ignore.malformed.json" = "true")****
>>
>> STORED AS TEXTFILE;****
>>
>> ** **
>>
>> This small sample actually works great!  The problem is when I try to
>> scale up to larger more complex tables.****
>>
>> ** **
>>
>> (Note the latest version of this SerDe has a bug related to number
>> formats. You should use 1.1.3 and use only INTs and DOUBLEs.)****
>>
>> ** **
>>
>> Chuck****
>>
>> ** **
>>
>> ** **
>>
>> *From:* Sadananda Hegde [mailto:[EMAIL PROTECTED]]
>> *Sent:* Thursday, October 11, 2012 11:47 PM
>> *To:* [EMAIL PROTECTED]
>> *Subject:* Re: Any advice about complex Hive tables?****
>>
>> ** **
>>
>> Hi Chuck,****
>>
>>  ****
>>
>> I have a similar complex hive tables with many fields and some are nested
>> like array of structs (but only upto 3 levels). How did you define you ROW
>> FORMAT as to separate the items? The COLLECTION ITEMS TERMINATED BY clause
>> works only for the first level.How did you handle level 2 , 3, etc? Is it
>> through SERDE FORMATs?  Could you
>>  share your CREATE TABLE statement?  I am having problem correctly
>> defining my DDL to load the data file correctly.****
>>
>>  ****
>>
>> Much appreciated.****
>>
>>  ****
>>
>> Thanks,****
>>
>> Sadu****
>>
>> On Mon, Oct 8, 2012 at 3:37 PM, Connell, Chuck <[EMAIL PROTECTED]>
>> wrote:****
>>
>> (Follow up to the thread below...)
>>
>> I have a complex Hive table -- many fields, many nested structs. Hive
>> fails to create the table at all. I can't even start to load data or query
>> the data. Anyone else run into this? It seems to be a showstopper to using
+
Connell, Chuck 2012-10-14, 00:48
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB