Hive >> mail # user >> Map<string, Map<string, string>> for TEXTFILE


RE: Map<string, Map<string, string>> for TEXTFILE
Let me clarify. By default (i.e. if you don't specify "row format ..." in your DDL), the delimiters for 1st-level maps are \002 (between entries) and \003 (between a key and its value), the delimiters for 2nd-level maps are \004 and \005 in the same roles, etc., up to some hard-coded limit (in LazySimpleSerDe). Only the 1st-level delimiters can be changed by specifying "row format ..." (see HIVE-365).

You can do this:

CREATE EXTERNAL TABLE map_test
(id int, m MAP<string, MAP<string, string>>)
location '/user/hadoop/map_test';

And have this in your data file (replace ^A with the \001 character, ^B with \002, etc.):

100^Aphysical^Cheight^E60^Dweight^E120.5^Binner^Coutgoing^E5.2
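If typing the control characters by hand is awkward, the line above can also be generated. This is only a minimal Python sketch, assuming the default delimiters described earlier; the helper names (encode_inner, encode_row, write_row) and the local file name are made up for illustration and are not part of Hive.

```python
# Sketch: build one row of the map_test data file with Hive's default
# text delimiters. All helper names here are hypothetical.
FIELD = "\x01"       # ^A: separates columns
OUTER_ITEM = "\x02"  # ^B: separates entries of the outer map
OUTER_KV = "\x03"    # ^C: separates an outer key from its value
INNER_ITEM = "\x04"  # ^D: separates entries of an inner map
INNER_KV = "\x05"    # ^E: separates an inner key from its value


def encode_inner(inner):
    """Serialize one inner map using the 2nd-level delimiters."""
    return INNER_ITEM.join(f"{k}{INNER_KV}{v}" for k, v in inner.items())


def encode_row(row_id, nested):
    """Serialize the id column and the nested map into one text line."""
    outer = OUTER_ITEM.join(
        f"{key}{OUTER_KV}{encode_inner(inner)}" for key, inner in nested.items()
    )
    return f"{row_id}{FIELD}{outer}"


def write_row(path, row):
    """Write a single row to a local file, one row per line."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(row + "\n")


row = encode_row(
    100,
    {
        "physical": {"height": "60", "weight": "120.5"},
        "inner": {"outgoing": "5.2"},
    },
)
```

Calling write_row("map_test.txt", row) would write the line to a local file (the name is an assumption), which can then be copied into the table's location.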

Then:

hive> select * from map_test;
100 {"physical":{"height":"60","weight":"120.5"},"inner":{"outgoing":"5.2"}}
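To see why the delimiter level matters, here is a rough Python model of level-based splitting. It is a simplified sketch, not the LazySimpleSerDe code: it ignores escaping, the function name decode_map is hypothetical, and it assumes an entry with no key/value delimiter becomes a key with a null value. The second input reuses \003 inside the inner maps, and the result resembles the null-valued output reported in the original question.

```python
# Simplified model of splitting a map column by nesting level.
# Not the LazySimpleSerDe implementation; escaping is ignored.
def decode_map(text, level_delims):
    """Decode text into a dict.

    level_delims is a list of (item_sep, kv_sep) pairs, outermost first.
    """
    (item_sep, kv_sep), deeper = level_delims[0], level_delims[1:]
    result = {}
    for entry in text.split(item_sep):
        key, found, value = entry.partition(kv_sep)
        if not found:
            # No key/value delimiter at this level: whole entry is the key.
            result[key] = None
        elif deeper:
            result[key] = decode_map(value, deeper)
        else:
            result[key] = value
    return result


# Default delimiters for the 1st and 2nd map levels.
DEFAULTS = [("\x02", "\x03"), ("\x04", "\x05")]

# Map column from the data file above (inner maps use \004 and \005).
good = "physical\x03height\x0560\x04weight\x05120.5\x02inner\x03outgoing\x055.2"
# Same data, but with \003 reused as the inner key/value delimiter.
bad = "physical\x03height\x0360\x02weight\x03120.5\x02inner\x03outgoing\x035.2"

decoded_good = decode_map(good, DEFAULTS)
decoded_bad = decode_map(bad, DEFAULTS)
```

In this model decoded_good is the expected nested map, while decoded_bad has inner keys such as "height\x0360" mapped to None, because no \005 is found at the 2nd level.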

Hope this helps.

Steven
-----Original Message-----
From: Sammy Yu [mailto:[EMAIL PROTECTED]]
Sent: Friday, September 10, 2010 12:05 PM
To: [EMAIL PROTECTED]
Subject: Re: Map<string, Map<string, string>> for TEXTFILE

Hi Steven,
  Thanks for the response, but is there a way to specify \004 and \005
as inner map delimiters?  I read up on HIVE-337, but I couldn't really
see how the inner map value can be escaped.  I tried escaping \003 as
\004\003 and \002 as \004\002, but that didn't seem to work.

Thanks,
Sammy

On Fri, Sep 10, 2010 at 11:27 AM, Steven Wong <[EMAIL PROTECTED]> wrote:
> I think you have to use \004 and \005 as delimiters for your inner maps.
>
>
> -----Original Message-----
> From: Sammy Yu [mailto:[EMAIL PROTECTED]]
> Sent: Friday, September 10, 2010 11:13 AM
> To: [EMAIL PROTECTED]
> Subject: Map<string, Map<string, string>> for TEXTFILE
>
> Hi guys,
>   I'm trying to create a textfile for a table with a Map<string,
> Map<string, string>> column:
>
> CREATE EXTERNAL TABLE map_test
> (id int,
> m MAP<string, MAP<string, string>>)
> row format delimited fields terminated by '\001'
>  escaped by '\004'
>  collection items terminated by '\002'
>  map keys terminated by '\003'
> location '/user/hadoop/map_test';
>
> I can't seem to figure out how to construct the serialized textfile
> form for the value of the inner map.  I've created a file with the
> following content:
> 100<\001>physical<\003>height<\003>60<\002>weight<\003>120.5<\002>inner<\003>outgoing<\003>5.2
> to represent id=100, m={"physical":{"height":"60",
> "weight":"120.5"},"inner":{"outgoing":"5.2"}}
>
> however when I do a query I get:
> hive> select * from map_test;
> 100     {"physical":{"height\u000360":null},"weight":{"120.5":null},"inner":{"outgoing\u00035.2":null}}
>
> I think I'm supposed to escape the value of the outer map's key.  Any
> help would be greatly appreciated!
>
> Thanks,
> Sammy
>
>

--
Chief Architect, BrightEdge
email: [EMAIL PROTECTED]   |   mobile: 650.539.4867  |   fax:
650.521.9678  |  address: 1850 Gateway Dr Suite 400, San Mateo, CA
94404