Pig >> mail # user >> Json and split into multiple files


Mohit Anchlia 2012-09-05, 03:37
Mohit Anchlia 2012-09-05, 19:04
Re: Json and split into multiple files
Loading the JSON below should give you a Pig record like:
(user: tuple(id: int, name: chararray), product: tuple(id: int, name:chararray))

In that case your Pig Latin would look like:

A = load 'inputfile' using JsonLoader() as (user: tuple(id: int, name: chararray), product: tuple(id: int, name: chararray));
B = foreach A generate user.id, user.name;
store B into 'userfile';
C = foreach A generate product.id, product.name;
store C into 'productfile';

I'm not sure what "key" refers to in your expected output, so I can't tell whether the above is what you have in mind.
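
A variant of the script above, offered as a sketch rather than a tested recipe. It passes the schema to the built-in JsonLoader as a constructor string and writes comma-separated output with PigStorage(','), to match the comma-delimited lines in the question. It assumes the input holds one valid JSON object per line with the fields in the order shown, and it leaves out the "key" column because its meaning is unclear. The paths 'inputfile', 'userfile' and 'productfile' are placeholders.

```pig
-- Assumed input: one JSON object per line, for example
-- {"user":{"id":1,"name":"user1"},"product":{"id":1,"name":"product1"}}
A = LOAD 'inputfile' USING JsonLoader(
    'user:tuple(id:int,name:chararray),product:tuple(id:int,name:chararray)');

-- Project the fields of the user tuple and store them comma-separated.
B = FOREACH A GENERATE user.id, user.name;
STORE B INTO 'userfile' USING PigStorage(',');

-- Same for the product tuple, written to a separate location.
C = FOREACH A GENERATE product.id, product.name;
STORE C INTO 'productfile' USING PigStorage(',');
```

Each STORE writes to its own output directory, so the two relations end up in separate files without any further splitting of bags.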

Alan.

On Sep 5, 2012, at 12:04 PM, Mohit Anchlia wrote:

> Any pointers would be appreciated
>
> On Tue, Sep 4, 2012 at 8:37 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
>
>> I have JSON that looks something like this:
>>
>> {
>>   "user": {
>>     "id": 1,
>>     "name": "user1"
>>   },
>>   "product": {
>>     "id": 1,
>>     "name": "product1"
>>   }
>> }
>>
>> I want to be able to read this file and create 2 files as follows:
>>
>> user file:
>> key,1,user1
>>
>> product file:
>> key,1,product1
>>
>> I know I need to call exec but the method will return Bags for each of
>> these dimensions.  But since it's all unordered how do I split it further
>> to write them to separate files?
>>
Mohit Anchlia 2012-09-06, 16:32
Mohit Anchlia 2012-09-07, 17:21
Alan Gates 2012-09-13, 02:51
Mohit Anchlia 2012-09-13, 14:01