Pig user mailing list: Json and split into multiple files


Re: Json and split into multiple files
It looks like I can use the outputSchema(Schema input) call to do this, but
the examples I've seen only cover a single tuple. If I'm reading it right, I
need a tuple for each dimension, and hence a schema for each: one user tuple
and one product tuple, for instance.

How can I do this using outputSchema so that the result looks like the
example below, where each tuple and field is accessible by name? Thanks for
your help.

 A = load 'inputfile' using JsonLoader() as (user: tuple(id: int, name:
chararray), product: tuple(id: int, name: chararray));

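One way to declare such a nested schema is to build it programmatically in the UDF's outputSchema. The sketch below assumes a custom EvalFunc-style UDF; the class name JsonDimensions and the stubbed exec body are illustrative, not from the thread:

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.FrontendException;
import org.apache.pig.impl.logicalLayer.schema.Schema;

// Sketch: a UDF whose output schema declares two named inner tuples,
// matching (user: tuple(id, name), product: tuple(id, name)).
public class JsonDimensions extends EvalFunc<Tuple> {

    @Override
    public Tuple exec(Tuple input) throws IOException {
        // ... parse the JSON and build the (user, product) tuple here ...
        return null;
    }

    @Override
    public Schema outputSchema(Schema input) {
        try {
            // Schema shared by both dimensions: (id: int, name: chararray)
            Schema inner = new Schema();
            inner.add(new Schema.FieldSchema("id", DataType.INTEGER));
            inner.add(new Schema.FieldSchema("name", DataType.CHARARRAY));

            // Wrap that inner schema in one named tuple field per dimension.
            Schema result = new Schema();
            result.add(new Schema.FieldSchema("user", inner, DataType.TUPLE));
            result.add(new Schema.FieldSchema("product", inner, DataType.TUPLE));
            return result;
        } catch (FrontendException e) {
            throw new RuntimeException(e);
        }
    }
}
```

With a schema like this in place, downstream script statements can refer to user.id, user.name, product.id, and product.name by name.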
On Tue, Sep 4, 2012 at 8:37 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:

> I have a JSON document something like:
>
> {
>   "user": {
>     "id": 1,
>     "name": "user1"
>   },
>   "product": {
>     "id": 1,
>     "name": "product1"
>   }
> }
>
> I want to be able to read this file and create 2 files as follows:
>
> user file:
> key,1,user1
>
> product file:
> key,1,product1
>
> I know I need to call exec, but the method will return a Bag for each of
> these dimensions. Since it's all unordered, how do I split it further to
> write them to separate files?
>
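Once the two tuples are available as named fields, the split into two files can happen in the Pig script itself rather than inside the UDF. A sketch, assuming the loader line from earlier in the thread; the output paths 'user_out' and 'product_out' are illustrative:

```pig
A = load 'inputfile' using JsonLoader()
    as (user: tuple(id: int, name: chararray),
        product: tuple(id: int, name: chararray));

-- Project each named tuple into its own relation, flattening to plain fields.
users    = foreach A generate flatten(user)    as (id: int, name: chararray);
products = foreach A generate flatten(product) as (id: int, name: chararray);

-- Store each relation into its own output directory.
store users    into 'user_out'    using PigStorage(',');
store products into 'product_out' using PigStorage(',');
```

Each store writes to a separate HDFS directory, which gives the two per-dimension files the original question asks for without any ordering concerns.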