Hive, mail # user - Any advice about complex Hive tables?


Re: Any advice about complex Hive tables?
Sadananda Hegde 2012-10-14, 00:31
Thanks, Chuck.
But I am getting the following error when I execute the CREATE TABLE statement:

'FAILED: Error in metadata: Cannot validate serde:
org.openx.data.jsonserde.JsonSerDe'

Am I supposed to download / add any jar file to the Hive server?

Thanks for your help.

Sadu
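
For anyone hitting the same error: "Cannot validate serde" usually means Hive cannot find the SerDe class on its classpath, so the usual remedy is to register the SerDe jar in the session before running CREATE TABLE. A minimal sketch, assuming the jar has already been built or downloaded from the GitHub project named in the quoted message; the path and jar file name are hypothetical placeholders:

```sql
-- Hypothetical path and file name: substitute the actual location of the
-- JSON SerDe jar on the machine where the Hive CLI runs.
ADD JAR /path/to/json-serde-1.1.3-jar-with-dependencies.jar;

-- Confirm the jar is registered for this session.
LIST JARS;
```

ADD JAR applies only to the current session, so it has to be repeated in each new session (or placed in a startup file such as .hiverc).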

On Fri, Oct 12, 2012 at 9:17 AM, Connell, Chuck <[EMAIL PROTECTED]> wrote:

> Sadu,
>
> I am using JSON as the input format, with the JSON SerDe from
> https://github.com/rcongiu/Hive-JSON-Serde.
>
> A sample JSON record follows. (In actual use, each JSON record must be on
> a single line.)
>
> {
> "field1":"hello",
> "field2":123456,
> "field3":1234.5678,
> "field4":true,
> "field5":{"field5a":"embedded 1", "field5b":44, "field5c":4.44, "field5d":false, "field5e":[12,13,14]},
> "field6":[2, 3, 4, 5, 6],
> "field7":[4.33, 5.33, 6.33],
> "field8":["one", "two", "three"],
> "field9":[[1,2,3], [4,5], [6,7,8]],
> "field10":[["smith","jones"], ["bob", "bill"]],
> "field11":[{"f11a":"one", "f11b":"two"}, {"f11a":"three", "f11b":"four"}]
> }
>
> My table definition follows. (Ignore the fact that the fields are listed
> out of order; this does not matter.)
>
> CREATE TABLE tt1
> (
>   field8 ARRAY<STRING>,
>   field9 ARRAY<ARRAY<INT>>,
>   field2 INT,
>   field3 DOUBLE,
>   field1 STRING,
>   field6 ARRAY<INT>,
>   field7 ARRAY<DOUBLE>,
>   field4 BOOLEAN,
>   field5 STRUCT<
>     field5d:BOOLEAN,
>     field5e:ARRAY<INT>,
>     field5a:STRING,
>     field5b:INT,
>     field5c:DOUBLE
>   >,
>   field10 ARRAY<ARRAY<STRING>>,
>   field11 ARRAY<STRUCT<
>     f11a:STRING,
>     f11b:STRING
>   >>
> )
> ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
> WITH SERDEPROPERTIES ("ignore.malformed.json" = "true")
> STORED AS TEXTFILE;
>
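
As a sketch of how the nested columns of a table like tt1 can be read back: the queries below use dot notation for struct members, zero-based brackets for array elements, and LATERAL VIEW explode() to turn the array of structs into rows. The input path is a hypothetical placeholder; the column names come from the table definition above.

```sql
-- Hypothetical local path; the file holds one JSON record per line.
LOAD DATA LOCAL INPATH '/tmp/sample.json' INTO TABLE tt1;

-- Struct members use dot notation; array elements use zero-based indexes.
SELECT field1,
       field5.field5a,      -- member of a struct
       field5.field5e[0],   -- element of an array inside a struct
       field9[0][1],        -- element of a nested array
       field11[0].f11a      -- struct member inside an array of structs
FROM tt1;

-- One output row per element of field11.
SELECT t.field1, item.f11a, item.f11b
FROM tt1 t
LATERAL VIEW explode(t.field11) exploded AS item;
```
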
> This small sample actually works great! The problem is when I try to
> scale up to larger, more complex tables.
>
> (Note: the latest version of this SerDe has a bug related to number
> formats. You should use 1.1.3 and use only INTs and DOUBLEs.)
>
> Chuck
>
> From: Sadananda Hegde [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, October 11, 2012 11:47 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Any advice about complex Hive tables?
>
> Hi Chuck,
>
> I have similar complex Hive tables with many fields, some of them nested,
> such as arrays of structs (but only up to 3 levels). How did you define your
> ROW FORMAT to separate the items? The COLLECTION ITEMS TERMINATED BY clause
> works only for the first level. How did you handle levels 2, 3, etc.? Is it
> through SERDE FORMATs? Could you share your CREATE TABLE statement? I am
> having a problem defining my DDL so that the data file loads correctly.
>
> Much appreciated.
>
> Thanks,
> Sadu
>
> On Mon, Oct 8, 2012 at 3:37 PM, Connell, Chuck <[EMAIL PROTECTED]>
> wrote:
>
> (Follow up to the thread below...)
>
> I have a complex Hive table -- many fields, many nested structs. Hive
> fails to create the table at all. I can't even start to load data or query
> the data. Anyone else run into this? It seems to be a showstopper to using
> Hive for us, which is a shame since everything else about Hive is just what
> we want.
>
> Any suggestions? Workarounds?
>
> Thanks,
> Chuck
>
>
> ________________________________________
> From: Connell, Chuck
> Sent: Thursday, October 04, 2012 4:31 PM
> To: [EMAIL PROTECTED]
> Subject: RE: Limit to columns or nesting of Hive table?
>
> The issue apparently is not just the number of levels of nesting. I just
> created a Hive table with 20 levels of structs within each other. It
> created fine. This is more levels than the table that was failing for me.