|
Eli Finkelshteyn
2013-02-28, 18:44
Dmitriy Ryaboy
2013-02-28, 21:34
Harsha
2013-02-28, 21:44
Eli Finkelshteyn
2013-03-01, 19:40
Eli Finkelshteyn
2013-03-01, 22:37
Harsha
2013-03-02, 05:10
Harsha
2013-03-02, 05:17
Harsha
2013-03-02, 05:51
Eli Finkelshteyn
2013-03-04, 11:22
harsha ch
2013-03-04, 16:01
|
-
Parsing a Complex JSON String? Eli Finkelshteyn 2013-02-28, 18:44
Hi Folks,
I want to parse a string of complex JSON in Pig. Specifically, I want Pig to understand my JSON array as a bag instead of as a single chararray. When using JsonLoader, I can do this easily by specifying the schema, as in this question (http://stackoverflow.com/questions/14094768/parsing-complex-json-with-pig).

Is there any way either to have Pig figure out my schema for me, or to specify it when Pig is parsing a string? I've been using JsonStringToMap, but can't find a way to specify a schema, or to have it properly understand that my JSON array is an array and not a single chararray. I looked at the code in JsonStringToMap, and it looks like it always specifies the schema as just a map of chararrays, which won't work for anything but the simplest JSON of a form like {string: string…}.

Any ideas?
Eli
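For readers following along, the contrast described in this message might look roughly like the sketch below. It assumes the JsonLoader in question is Pig's built-in one (available from Pig 0.10), and the field names `name` and `tags`, the sample input line, and the file paths are hypothetical. The jar paths, the DEFINE, and the column layout are the ones Eli posts later in this thread.

```pig
-- Case 1: loading a JSON file. The schema string tells Pig that "tags" is a bag.
-- Hypothetical input line: {"name":"a","tags":[{"tag":"x"},{"tag":"y"}]}
from_file = LOAD '/path/to/data.json'
            USING JsonLoader('name:chararray, tags:{(tag:chararray)}');

-- Case 2: the JSON is one chararray column inside a larger record.
REGISTER '/path/to/elephant-bird-pig-3.0.3-SNAPSHOT.jar';
REGISTER '/path/to/json-simple-1.1.1.jar';
DEFINE JsonStringToMap com.twitter.elephantbird.pig.piggybank.JsonStringToMap();

loaded = LOAD '/path/to/test-files/*'
         AS (date:chararray, source:chararray, json_string:chararray);

-- Per the message above, the result is a map of chararrays, so a nested
-- array arrives as one string rather than as a bag.
as_map = FOREACH loaded GENERATE JsonStringToMap(json_string) AS json, date, source;
```

The first case has somewhere to put a schema; the second does not, which is the gap the question is about.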
-
Re: Parsing a Complex JSON String? Dmitriy Ryaboy 2013-02-28, 21:34
Does the EB (Elephant Bird) JSON loader with
elephantbird.jsonloader.nestedLoad = true
work?
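A minimal, untested sketch of what this suggestion could look like in a script. The property name is the one given above. The loader class name `com.twitter.elephantbird.pig.load.JsonLoader`, the use of SET to pass the property, and the input path are assumptions, and the jar paths are placeholders taken from the session posted later in this thread (other Elephant Bird jars may also be needed).

```pig
-- Placeholder jar paths; adjust to the local installation.
REGISTER '/path/to/elephant-bird-pig-3.0.3-SNAPSHOT.jar';
REGISTER '/path/to/json-simple-1.1.1.jar';

-- Property name as given in the message above.
SET elephantbird.jsonloader.nestedLoad 'true';

-- Assumption: this is the Elephant Bird JSON loader class being referred to.
records = LOAD '/path/to/json-files/*'
          USING com.twitter.elephantbird.pig.load.JsonLoader()
          AS (json:map[]);
```

Note that this applies to loading whole JSON files, not to parsing a JSON string held in one column of a larger record.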
-
Re: Parsing a Complex JSON String? Harsha 2013-02-28, 21:44
Hi Eli,
Take a look at these: https://github.com/mozilla-metrics/akela/tree/master/src/main/java/com/mozilla/pig/eval/json
We use them to parse complex JSON objects.
Thanks,
Harsha
-
Re: Parsing a Complex JSON String? Eli Finkelshteyn 2013-03-01, 19:40
The JsonLoader works, but the problem is that I'm not loading a JSON file; I'm just trying to parse a JSON string as part of a bigger data set. That's why I needed to use JsonStringToMap.
-
Re: Parsing a Complex JSON String? Eli Finkelshteyn 2013-03-01, 22:37
Hi Harsha,
Those functions look potentially awesome, but there doesn't seem to be much documentation on which to use for what. I've tried to parse my JSON with both JsonTupleMap and JsonMap, and I get a com/fasterxml/jackson/core/JsonParseException with both. I was just running:

grunt> REGISTER '/path/to/elephant-bird-pig-3.0.3-SNAPSHOT.jar';
grunt> REGISTER '/path/to/json-simple-1.1.1.jar';
grunt> REGISTER '/path/to/piggybank.jar';
grunt> REGISTER '/path/to/joda-time-2.1.jar';
grunt> REGISTER '/path/to/akela-0.5-SNAPSHOT.jar';
grunt> DEFINE JsonStringToMap com.twitter.elephantbird.pig.piggybank.JsonStringToMap();
grunt> DEFINE JsonTupleMap com.mozilla.pig.eval.json.JsonTupleMap();
grunt>
grunt> loaded = LOAD '/path/to/test-files/*' AS (date:chararray, source:chararray, json_string:chararray);
grunt> jsonified = FOREACH loaded GENERATE JsonTupleMap(json_string) AS json, date, source;
2013-03-01 14:28:29,485 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. com/fasterxml/jackson/core/JsonParseException

Any ideas?
Eli
-
Re: Parsing a Complex JSON String? Harsha 2013-03-02, 05:10
Hi Eli,
I haven't encountered that issue with JsonMap or JsonTupleMap. We are using Pig 0.9.2. Here are some example scripts: https://github.com/mozilla-metrics/telemetry-toolbox/blob/master/src/main/pig/telemetry_aggregates.pig
You can look under the pig directory for further examples. Can you load just akela-0.5-SNAPSHOT.jar, without any additional jars? I am wondering whether any of the other jars are loading conflicting Jackson versions.
Thanks,
Harsha
-
Re: Parsing a Complex JSON String? Harsha 2013-03-02, 05:17
Hi Eli,
I just ran a script with the latest code, and it does throw the Jackson error. I'll be fixing it soon; meanwhile, you can pull an older version of the code.
Thanks,
Harsha
-
Re: Parsing a Complex JSON String? Harsha 2013-03-02, 05:51
Hi Eli,
It looks like your script is missing the Jackson dependencies. Add the following jars:

register 'jackson-core-2.0.6.jar';
register 'jackson-databind-2.0.6.jar';
register 'jackson-annotations-2.0.6.jar';

Thanks,
Harsha
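Combining this suggestion with the session posted earlier in the thread, the full script might look like the sketch below. It is untested here. The `/path/to/` prefixes are placeholders, and registering only the Jackson and Akela jars (rather than everything from the original session) is an assumption based on the earlier concern about conflicting Jackson versions.

```pig
-- Jackson jars that the Akela JSON UDFs need at runtime (versions as given above).
REGISTER '/path/to/jackson-core-2.0.6.jar';
REGISTER '/path/to/jackson-databind-2.0.6.jar';
REGISTER '/path/to/jackson-annotations-2.0.6.jar';

-- The Akela jar itself, which contains the UDF.
REGISTER '/path/to/akela-0.5-SNAPSHOT.jar';

DEFINE JsonTupleMap com.mozilla.pig.eval.json.JsonTupleMap();

-- Same input layout as the original session: the JSON is the third column.
loaded = LOAD '/path/to/test-files/*'
         AS (date:chararray, source:chararray, json_string:chararray);

jsonified = FOREACH loaded GENERATE JsonTupleMap(json_string) AS json, date, source;
```

If other UDFs from the original session are still needed (for example JsonStringToMap), their jars would have to be registered as well.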
-
Re: Parsing a Complex JSON String? Eli Finkelshteyn 2013-03-04, 11:22
Hi Harsha,
I added those jars and everything works awesomely! Thanks! I'm still not sure why they're required, though. The Akela pom.xml already declares Jackson as a dependency, so I figured everything needed would already be included in the Akela jar. Why do those other jars have to be registered separately?
Thanks again!
Eli
-
Re: Parsing a Complex JSON String? harsha ch 2013-03-04, 16:01
Hi Eli,
With Maven, dependencies are not copied into the target jar by default, so the Akela jar does not contain the Jackson classes and they have to be registered separately.
|