Re: Json Loader - Array of objects - Loading results in empty data set
I haven't worked with JsonLoader much, so I'm not sure what the problem is.
But your schema looks correct for your JSON structure now.

DataBSets is an array (a bag in Pig) of objects (tuples). Each object
(tuple) inside the array has one key, which maps to another object (tuple)
with two keys. This is exactly what you would want the structure to look
like in Pig.

```
grunt> describe b;
b: {DataASet: (A1: int,A2: int,DataBSets: {tuple_0: (DataBSet: (B1: chararray,B2: chararray))})}
grunt> dump b;
()
grunt>
```
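
To make the mapping concrete, here is a small Python sketch (not Pig, and
not how JsonLoader works internally) of how a record with that shape would
nest if objects become tuples and arrays become bags. The sample values and
the helper name `to_pig_shape` are made up for illustration; only the key
names and the nesting come from the `describe b` output above.

```python
import json

# Hypothetical sample record: the values are invented, only the key names
# and the nesting follow the schema printed by `describe b`.
sample = '''
{"DataASet": {"A1": 1, "A2": 2,
              "DataBSets": [{"DataBSet": {"B1": "x", "B2": "y"}},
                            {"DataBSet": {"B1": "z", "B2": "w"}}]}}
'''

def to_pig_shape(value):
    """Map JSON objects to tuples and JSON arrays to bags (Python lists)."""
    if isinstance(value, dict):
        # One tuple field per key, in the order the keys appear.
        return tuple(to_pig_shape(v) for v in value.values())
    if isinstance(value, list):
        # Each object in the array becomes one tuple in the bag
        # (the `tuple_0` level in the schema).
        return [to_pig_shape(v) for v in value]
    return value

record = to_pig_shape(json.loads(sample))
print(record)
```

If a dump of `b` showed a row nested like `record`, the schema and the data
would agree; an empty `()` means the loader did not produce the fields at
all.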

I know that lots of people have had problems with JsonLoader. Off-hand, I
can recall several emails on this mailing list over the past year
complaining about the loader. Most of the recommendations, as far as I
remember, were to use the elephant-bird version of the loader.

I'm not sure what version conflict you're seeing with CDH + elephant-bird,
but I'd recommend compiling elephant-bird against the versions of Hadoop
and Pig that you're using and deploying it to your Maven repo. I do this
myself so that I know all the code is compiled against the versions we're
running in house.

I'm going to look into this problem a little bit more and see if I can get
it to work without elephant-bird.
On Fri, Aug 8, 2014 at 8:44 AM, Klüber, Ralf <[EMAIL PROTECTED]>
wrote: