Hive >> mail # user >> Map<string, Map<string, string>> for TEXTFILE


RE: Map<string, Map<string, string>> for TEXTFILE
Let me clarify. By default (i.e. if you don't specify "row format ..." in your DDL), the delimiters for 1st-level maps are \002 and \003, the delimiters for 2nd-level maps are \004 and \005, etc., up to some hard-coded limit (in LazySimpleSerDe). Only the 1st level delimiters can be changed by specifying "row format ..." (see HIVE-365).
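
To make the pattern concrete, here is a minimal Python sketch that maps a map's nesting level to its default delimiter pair. The function name map_delimiters is made up for illustration, and extending the pattern past the 2nd level is my assumption based on the "etc." above; the hard-coded limit in LazySimpleSerDe is not modeled.

```python
# Default text delimiters per map nesting level, following the pattern
# described above: level 1 -> \002 and \003, level 2 -> \004 and \005.
# Assumption: deeper levels continue the same pattern. The hard-coded
# upper limit in LazySimpleSerDe is not checked here.

FIELD_DELIMITER = "\x01"  # separates top-level columns (^A)


def map_delimiters(level):
    """Return (entry_separator, key_value_separator) for a map at the
    given 1-based nesting level."""
    if level < 1:
        raise ValueError("nesting level starts at 1")
    return chr(2 * level), chr(2 * level + 1)


if __name__ == "__main__":
    for level in (1, 2):
        entry_sep, kv_sep = map_delimiters(level)
        print(level, repr(entry_sep), repr(kv_sep))
```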

You can do this:

CREATE EXTERNAL TABLE map_test
(id int, m MAP<string, MAP<string, string>>)
location '/user/hadoop/map_test';

And have this in your data file (replace ^A with the \001 character, ^B with \002, etc.):

100^Aphysical^Cheight^E60^Dweight^E120.5^Binner^Coutgoing^E5.2
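
Typing control characters by hand is error-prone, so a small script can generate the line instead. The following is only a sketch in Python: serialize_row, write_rows and the local file name map_test.txt are hypothetical names, and it assumes Python 3.7+ so that dicts keep their insertion order.

```python
# Build one text row for: id int, m MAP<string, MAP<string, string>>
# using the default delimiters described in this message.

FIELD = "\x01"                          # ^A between columns
OUTER_ENTRY, OUTER_KV = "\x02", "\x03"  # ^B between outer entries, ^C after outer key
INNER_ENTRY, INNER_KV = "\x04", "\x05"  # ^D between inner entries, ^E after inner key


def serialize_inner(inner):
    """Serialize an inner map of string -> string."""
    return INNER_ENTRY.join(k + INNER_KV + v for k, v in inner.items())


def serialize_row(row_id, outer):
    """Serialize the id column and the nested map column into one line."""
    m = OUTER_ENTRY.join(
        k + OUTER_KV + serialize_inner(v) for k, v in outer.items()
    )
    return str(row_id) + FIELD + m


def write_rows(path, rows):
    """Write (row_id, nested_map) pairs, one row per line."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for row_id, outer in rows:
            f.write(serialize_row(row_id, outer) + "\n")


if __name__ == "__main__":
    row = (100, {"physical": {"height": "60", "weight": "120.5"},
                 "inner": {"outgoing": "5.2"}})
    write_rows("map_test.txt", [row])  # hypothetical local file name
```

The resulting file then has to be copied into the table's location ('/user/hadoop/map_test' in the DDL above) before querying.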

Then:

hive> select * from map_test;
100 {"physical":{"height":"60","weight":"120.5"},"inner":{"outgoing":"5.2"}}
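
If it helps to see why the delimiters matter, below is a simplified Python model of the read side. It is not the actual LazySimpleSerDe code; parse_map and parse_row are hypothetical helpers, and treating an entry without a key separator as a key with a null value is an assumption chosen to match the query output seen in this thread.

```python
import json


def parse_map(text, entry_sep, kv_sep, parse_value=lambda v: v):
    """Split text into entries, then split each entry at the first kv_sep.
    An entry with no kv_sep becomes a key with a None value."""
    result = {}
    for entry in text.split(entry_sep):
        key, found, value = entry.partition(kv_sep)
        result[key] = parse_value(value) if found else None
    return result


def parse_row(line):
    """Parse 'id^Amap' where the outer map uses \\002/\\003 and the
    inner maps use \\004/\\005."""
    row_id, _, m = line.partition("\x01")

    def inner(value):
        return parse_map(value, "\x04", "\x05")

    return int(row_id), parse_map(m, "\x02", "\x03", inner)


if __name__ == "__main__":
    good = "100\x01physical\x03height\x0560\x04weight\x05120.5\x02inner\x03outgoing\x055.2"
    # Same data, but with \003 reused for the inner key/value pairs and
    # \002 between all entries, as in the first attempt quoted below.
    bad = "100\x01physical\x03height\x0360\x02weight\x03120.5\x02inner\x03outgoing\x035.2"
    for line in (good, bad):
        row_id, m = parse_row(line)
        print(row_id, json.dumps(m, separators=(",", ":")))
```

With the inner maps delimited by \004 and \005, the model returns the nested structure shown above. With \003 reused inside the inner maps, each inner value collapses into a single key with a null value, which is the shape of the result in the original question.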

Hope this helps.

Steven
-----Original Message-----
From: Sammy Yu [mailto:[EMAIL PROTECTED]]
Sent: Friday, September 10, 2010 12:05 PM
To: [EMAIL PROTECTED]
Subject: Re: Map<string, Map<string, string>> for TEXTFILE

Hi Steven,
  Thanks for the response, but is there a way to specify \004 and \005
as inner map delimiters?  I read up on HIVE-337, but I couldn't really
see how the inner map value can be escaped.  I tried escaping \003 as
\004\003 and \002 as \004\002, but that didn't seem to work.

Thanks,
Sammy

On Fri, Sep 10, 2010 at 11:27 AM, Steven Wong <[EMAIL PROTECTED]> wrote:
> I think you have to use \004 and \005 as delimiters for your inner maps.
>
>
> -----Original Message-----
> From: Sammy Yu [mailto:[EMAIL PROTECTED]]
> Sent: Friday, September 10, 2010 11:13 AM
> To: [EMAIL PROTECTED]
> Subject: Map<string, Map<string, string>> for TEXTFILE
>
> Hi guys,
>   I'm trying to create a textfile for a table with a Map<string,
> Map<string, string>> column:
>
> CREATE EXTERNAL TABLE map_test
> (id int,
> m MAP<string, MAP<string, string>>)
> row format delimited fields terminated by '\001'
>  escaped by '\004'
>  collection items terminated by '\002'
> map keys terminated by '\003'
> location '/user/hadoop/map_test';
>
> I can't seem to figure out how to construct the serialized textfile
> form for the value of the inner map.  I've created a file with the
> following content:
> 100<\001>physical<\003>height<\003>60<\002>weight<\003>120.5<\002>inner<\003>outgoing<\003>5.2
> to represent id=100, m={"physical":{"height":"60",
> "weight":"120.5"},"inner":{"outgoing":"5.2"}}
>
> however when I do a query I get:
> hive> select * from map_test;
> 100     {"physical":{"height\u000360":null},"weight":{"120.5":null},"inner":{"outgoing\u00035.2":null}}
>
> I think I'm supposed to escape the value of the outer map's key.  Any
> help would be greatly appreciated!
>
> Thanks,
> Sammy
>
>

--
Chief Architect, BrightEdge
email: [EMAIL PROTECTED]   |   mobile: 650.539.4867  |   fax:
650.521.9678  |  address: 1850 Gateway Dr Suite 400, San Mateo, CA
94404