Pig >> mail # user >> RE: [Bulk] pig 0.10.0 JsonLoader and nested list


David Parks 2013-05-22, 10:26
Re: [Bulk] pig 0.10.0 JsonLoader and nested list
It seems that the JsonLoader schema is not well documented. Could someone give
us more examples?
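
Until someone posts a confirmed schema, here is one possible workaround, offered as a sketch rather than a verified answer. It assumes (please check against your Pig version) that the built-in JsonLoader reads each tuple inside a bag as a JSON object rather than as a nested array. Under that assumption, the "energy" pairs could be reshaped into objects before loading. The helper names and the "key"/"value" field names are made up for illustration, and the sample record leaves out the fields elided as "..." in the original message.

```python
import json


def reshape_energy(record):
    """Return a copy of the record with each [key, value] pair in
    "energy" rewritten as an object with hypothetical field names."""
    out = dict(record)
    out["energy"] = [{"key": k, "value": v} for k, v in record["energy"]]
    return out


def reshape_line(line):
    """Convert one JSON record (one line of input) to the reshaped form."""
    return json.dumps(reshape_energy(json.loads(line)))


# Sample record based on the format in the question; the elided fields
# ("...") are simply left out here.
sample = (
    '{"area": "ABC", "date_day": 1, "date_hour": 0, '
    '"energy": [["17-16", 1], ["18-17", 2]]}'
)

print(reshape_line(sample))
```

With records in that shape, a schema fragment along the lines of energy:{t:(key:chararray, value:int)} might then describe the bag, but I have not tested this against 0.10.0.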
On Wed, May 22, 2013 at 5:26 AM, David Parks <[EMAIL PROTECTED]> wrote:

> I'm quite new to Pig, so perhaps my input is off base here, but if you input
> one such record without defining the schema I believe the JsonLoader will
> define the schema for you, no?  If so, just import one such record and
> 'describe' the variable to see the schema. Well... if it's not that easy,
> then I'm not the right one to answer, but maybe it gets you something quick.
> David
>
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On
> Behalf Of Marian Steinbach
> Sent: Wednesday, May 22, 2013 4:13 PM
> To: [EMAIL PROTECTED]
> Subject: [Bulk] pig 0.10.0 JsonLoader and nested list
>
> I would like to load a JSON file containing records of the following
> format:
>
> {
>    "area": "ABC",
>    "date_day": 1,
>    "date_hour": 0,
>    ...
>    "energy": [["17-16", 1], ["18-17", 2]]
> }
>
> The "energy" property represents a sparse matrix. It's a list with an
> arbitrary number of key-value-pairs (minimum 1). The first element (string)
> is the matrix unit key, the second element is the value.
>
> I need both key and value in order to summarize values with matching keys
> in my pig job. I understand that it should be possible to import this as a
> bag. Correct?
>
> Can anybody tell me what the schema definition passed to the built-in
> JsonLoader function should look like?
>
> Thanks in advance!
>
> Marian
>
>
--
Wayne Zhu
847-282-0596 (Google Voice)