Pig >> mail # user >> (PigJsonLoader) how to read/load json with Pig?


Re: (PigJsonLoader) how to read/load json with Pig?
For some reason, I always thought there was a JSONLoader in Piggybank.
Seems like there is none. Kim, it would be great if you could contribute
yours.

Ashutosh
On Tue, Sep 28, 2010 at 09:45, Kim Vogt <[EMAIL PROTECTED]> wrote:
> Here's mine:
>
> http://gist.github.com/601331
>
> Pretty much the same as the LZO one minus the LZO stuff.  Works with pig
> 0.7.
>
> -Kim
>
> On Mon, Sep 27, 2010 at 9:59 PM, Benny Sadeh <[EMAIL PROTECTED]> wrote:
>
>> loading/reading json for Pig processing sounds like a common useful
>> functionality.
>>
>> however, I have not found any implementation for such.
>>
>> (and yes, I know of Elephant Bird, which reads LZO-compressed json (but not
>> regular json))
>>
>>
>> but I did see a reference in the "Hadoop Training: Introduction to Pig" (
>> http://www.cloudera.com/videos/introduction_to_pig)
>>
>> within the downloadable IntroToPig.pdf, where  there is a mention of
>> PigJsonLoader
>>
>> however, there is no such UDF within the piggybank source of
>> the cloudera distributed vm, or within any other piggybank jar out there
>> that I have seen.
>>
>> so I wonder, where can I find a pig json reader/loader that can accomplish
>> the equivalent of: A = LOAD 'data.json' USING PigJsonLoader();
>>
>> ???
>>
>>
>> any pointers would be greatly appreciated ...
>>
>
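
For readers unfamiliar with what a loader like this does, here is a minimal sketch of the per-line logic, written in Python rather than as a Pig LoadFunc. It is not the code from the gist above. It assumes one JSON object per input line and a flat string-to-string map per record, which is only one reading of "pretty much the same as the LZO one". The names parse_json_line and load_json_lines are hypothetical.

```python
import json


def parse_json_line(line):
    """Turn one line of text into a flat dict of string values.

    Returns None for blank lines, malformed JSON, and JSON values
    that are not objects, so the caller can skip them.
    """
    line = line.strip()
    if not line:
        return None
    try:
        record = json.loads(line)
    except ValueError:
        # Assumption: a bad line is skipped rather than failing the job.
        return None
    if not isinstance(record, dict):
        return None

    flat = {}
    for key, value in record.items():
        if value is None or isinstance(value, str):
            flat[str(key)] = value
        else:
            # Numbers, booleans, and nested values are kept as JSON text.
            flat[str(key)] = json.dumps(value)
    return flat


def load_json_lines(lines):
    """Yield one flat dict per usable input line."""
    for line in lines:
        record = parse_json_line(line)
        if record is not None:
            yield record
```

Calling list(load_json_lines(open("data.json"))) would give one dict per record. In a real loader each dict would become a single map field in the output tuple, and fields would be pulled out of that map in the Pig script.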