Pig >> mail # user >> Reading json file.


Re: Reading json file.
Hi,

There are several JSON loaders available, but none of them worked for me
when I had to deal with JSON. I ended up loading the file as a text file,
reading one line at a time, and parsing the JSON inside my UDF with a Java
JSON library.
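
A minimal sketch of that line-at-a-time approach, written in Python with the
standard json module rather than a Java JSON library, purely to illustrate the
parsing step. The function names parse_record and parse_lines are hypothetical,
the field names (_id, categories, h1, name2, rep) are taken from the sample
record quoted further down, and it assumes every record sits on a single line.

```python
import json


def parse_record(line):
    """Parse one JSON line into (id, categories, hosts, names, rep)."""
    doc = json.loads(line)
    if not isinstance(doc, dict):
        raise ValueError("expected a JSON object per line")
    # "h1" maps arbitrary keys to {"first": ..., "last": ...};
    # flatten it into a sorted list of (key, first, last) tuples.
    hosts = [
        (key, span.get("first"), span.get("last"))
        for key, span in sorted((doc.get("h1") or {}).items())
    ]
    return (
        doc.get("_id"),
        doc.get("categories") or [],  # may be empty or missing
        hosts,
        doc.get("name2") or [],
        doc.get("rep"),               # may be null, which becomes None
    )


def parse_lines(lines):
    """Yield one parsed record per non-empty line, skipping malformed lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield parse_record(line)
        except ValueError:
            # json.loads raises ValueError (JSONDecodeError) on invalid JSON
            continue
```

On the Pig side, the raw lines could be read with the built-in TextLoader,
for example as (line:chararray), and each line handed to a UDF that does the
equivalent of parse_record. Silently skipping malformed lines is a design
choice made here for brevity; counting or logging them may be preferable.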

Best Regards,
Ruslan
On Fri, Aug 30, 2013 at 2:53 AM, jamal sasha <[EMAIL PROTECTED]> wrote:

> Umm.. I am trying, but somehow I am not able to get my head around this:
> a = load 'sample_json.json' using
> JsonLoader('id:chararray,categories:[chararray], hostt:{ (variable_a:
> {(first:int,last:int)})}, ns:[chararray],rep:chararray  ');
>
> But i get this error:
> org.codehaus.jackson.JsonParseException: Unexpected character ('D' (code
> 68)): expected a valid value (number, String, array, object, 'true',
> 'false' or 'null')
>  at [Source: java.io.ByteArrayInputStream@4795b8e9; line: 1, column: 50]
> at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1291)
> at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:385)
> at org.codehaus.jackson.impl.JsonParserMinimalBase._reportUnexpectedChar(JsonParserMinimalBase.java:306)
> at org.codehaus.jackson.impl.Utf8StreamParser._handleUnexpectedValue(Utf8StreamParser.java:1582)
> at org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:386)
> at org.apache.pig.builtin.JsonLoader.readField(JsonLoader.java:173)
> at org.apache.pig.builtin.JsonLoader.getNext(JsonLoader.java:157)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
> at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
> at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>
>
>
> On Thu, Aug 29, 2013 at 3:22 PM, Shahab Yunus <[EMAIL PROTECTED]> wrote:
>
> > Have you seen these?
> >
> >
> > http://pig.apache.org/docs/r0.11.0/api/org/apache/pig/builtin/JsonStorage.html
> >
> > http://hortonworks.com/blog/jsonize-anything-in-pig-with-tojson/
> >
> > Regards,
> > Shahab
> >
> >
> > On Thu, Aug 29, 2013 at 6:19 PM, jamal sasha <[EMAIL PROTECTED]> wrote:
> >
> > > Hi,
> > >
> > > I have a JSON file in the following format:
> > > { "_id" : "foo.com", "categories" : [], "h1" : { "bar==" : { "first" :
> > > 1281916800, "last" : 1316995200 }, "foo==" : { "first" : 1281916800,
> > > "last" : 1316995200 } }, "name2" : [ "foobarl.com", "foobar2.com" ],
> > > "rep" : null }
> > > So, how do I parse this JSON in Pig?
> > >
> > > Also, categories and rep can contain characters and might not always be
> > > empty.
> > >
> > > Thanks
> > >
> >
>