Hive user mailing list: Question regarding nested complex data type


neha 2013-06-20, 08:42
Stephen Sprague 2013-06-20, 14:32
neha 2013-06-20, 14:45
Re: Question regarding nested complex data type
You only get three delimiters in the DDL: the field separator, the array
element separator (aka collection delimiter), and the map key/value
separator (aka map key delimiter).

When you nest deeper than that you have to fall back on the defaults '^D',
'^E', etc. for each additional level. At least that's been my experience,
and I've found it works.
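
For instance, with the second DDL you posted below, the loaded line would
presumably just become (swapping the default '^B' for your declared '|'):

1,1^Cstring1|2^Cstring2

And with one more level of nesting, the declared '|' still separates the
outer array elements while everything deeper falls back to the defaults.
A purely illustrative sketch (table and column names made up for this
example, and I haven't run this exact DDL):

*DDL*:
hive> create table example2(col1 int, col2
array<struct<st1:int,st2:array<string>>>) row format delimited fields
terminated by ',' collection items terminated by '|';

*Sample data* (fields split on ',', outer array elements on '|', struct
members on the default '^C', inner array elements on the default '^D'):
1,10^Ca^Db|20^Cc^Dd^De

which should come back from select * as something like:
1    [{"st1":10,"st2":["a","b"]},{"st1":20,"st2":["c","d","e"]}]

Declaring 'map keys terminated by' would let you pick the '^C' level too,
but anything past those three is fixed at the defaults.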
On Thu, Jun 20, 2013 at 7:45 AM, neha <[EMAIL PROTECTED]> wrote:

> Thanks a lot for your reply, Stephen.
> To answer your question - I was not aware that we could use a delimiter
> (in my example, '|') for the first level of nesting. I tried it now and it
> worked fine.
>
> My next question - is there any way to provide a delimiter in the DDL for
> the second level of nesting?
> Thanks again!!
>
>
> On Thu, Jun 20, 2013 at 8:02 PM, Stephen Sprague <[EMAIL PROTECTED]> wrote:
>
>> It's all there in the documentation under "create table", and it seems
>> you got everything right too except one little thing - in your second
>> example, for 'sample data loaded', change the '^B' to '|' and you should
>> be good. That's the delimiter that separates your two array elements,
>> i.e. the collection.
>>
>> I guess the real question for me is: when you say 'since there is no way
>> to use the given delimiter "|"', what did you mean by that?
>>
>>
>>
>> On Thu, Jun 20, 2013 at 1:42 AM, neha <[EMAIL PROTECTED]> wrote:
>>
>>> Hi All,
>>>
>>> I have 2 questions about nested complex data types.
>>>
>>> 1 >> I did not find a way to provide delimiter information in the DDL if
>>> one or more columns have a nested array/struct. In that case the default
>>> delimiters have to be used for the complex type column.
>>> Please let me know if this is a limitation as of now or whether I am
>>> missing something.
>>>
>>> e.g.:
>>> *DDL*:
>>> hive> create table example(col1 int, col2
>>> array<struct<st1:int,st2:string>>) row format delimited fields terminated
>>> by ',';
>>> OK
>>> Time taken: 0.226 seconds
>>>
>>> *Sample data loaded:*
>>> 1,1^Cstring1^B2^Cstring2
>>>
>>> *O/P:*
>>> hive> select * from example;
>>> OK
>>> 1    [{"st1":1,"st2":"string1"},{"st1":2,"st2":"string2"}]
>>> Time taken: 0.288 seconds
>>>
>>> 2 >> For the same DDL given above, if we add the clause *collection
>>> items terminated by '|'* but the data still uses the default delimiters
>>> (since there is no way to use the given delimiter '|'), then the select
>>> query shows incorrect data.
>>> Please let me know if this is expected.
>>>
>>> e.g.
>>> *DDL*:
>>> hive> create table example(col1 int, col2
>>> array<struct<st1:int,st2:string>>) row format delimited fields terminated
>>> by ',' collection items terminated by '|';
>>> OK
>>> Time taken: 0.175 seconds
>>>
>>> *Sample data loaded:*
>>> 1,1^Cstring1^B2^Cstring2
>>>
>>> *O/P:*
>>> hive> select * from example;
>>> OK
>>> 1    [{"st1":1,"st2":"string1\u00022"}]
>>> Time taken: 0.141 seconds
>>> Thanks & Regards.
>>>
>>
>>
>
Dean Wampler 2013-06-21, 02:00
Stephen Sprague 2013-06-21, 02:34
Dean Wampler 2013-06-21, 18:23