Re: How do I load JSON in Pig?
Try

com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad')
This should allow access to nested objects as nested maps ($0#'level1'#'level2'#'level3' …)

David
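
To illustrate the chained `#` lookups suggested above, here is a minimal Java sketch that models a record as maps inside maps and walks a chain of keys. It is only an analogy under stated assumptions, not elephant-bird or Pig code: the class `NestedMapLookup`, the helper `deref`, and the sample values are hypothetical, and only the key names (`event`, `uid`, `data`, `expired_reason`) come from the query quoted below.

```java
import java.util.HashMap;
import java.util.Map;

public class NestedMapLookup {
    // Hypothetical helper: follows a chain of keys through nested maps and
    // returns null as soon as a level is missing or is not a map.
    static Object deref(Object value, String... keys) {
        Object current = value;
        for (String key : keys) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    // Builds a made-up record shaped like the one queried in this thread.
    static Map<String, Object> sampleRecord() {
        Map<String, Object> data = new HashMap<>();
        data.put("expired_reason", "timeout");

        Map<String, Object> json = new HashMap<>();
        json.put("event", "session_end");
        json.put("uid", "42");
        json.put("data", data);
        return json;
    }

    public static void main(String[] args) {
        Map<String, Object> json = sampleRecord();
        // Roughly json#'event'
        System.out.println(deref(json, "event"));
        // Roughly json#'data'#'expired_reason'
        System.out.println(deref(json, "data", "expired_reason"));
        // "uid" holds a plain string, so a further lookup yields null
        System.out.println(deref(json, "uid", "expired_reason"));
    }
}
```

The second lookup only works because `data` is itself a map. If the nested object were kept as a plain string, the chain would stop there, which is the situation the nested-load option is meant to avoid.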

On Nov 21, 2012, at 12:56 AM, Saxifrage Cucvara <[EMAIL PROTECTED]> wrote:

> I'm also experiencing problems working with JSON objects in Pig.
>
> I have managed to load a log file in JSON format but can only query the
> top-level objects.  Whenever I try to access anything that is nested, it fails.
>
> -- Register JARS
> register elephant-bird-2.2.3.jar;
> register json-simple-1.1.jar;
>
> -- Load data
> nestobject = LOAD '/Users/Path/GoogleDrive/test.json'
>        USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad=true')
>        AS (json:map[]);
> DUMP nestobject;
>
> -- Example query
> tester = FOREACH nestobject GENERATE json#'event', json#'uid',
>        json#'data'#'expired_reason' AS reason;
> DUMP tester;
>
> The above fails ...
>
> Does anyone have any ideas?
>
> Thanks
>
> Sax
>
> On 20 November 2012 07:22, Deepak Tiwari <[EMAIL PROTECTED]> wrote:
>
>> I also ran into the same dilemma. Here is something that I found easier and
>> that works for me. I compiled some sources from http://www.json.org/java/
>>
>>
>> import java.io.IOException;
>> import java.io.UnsupportedEncodingException;
>> import java.util.List;
>>
>> import org.apache.pig.EvalFunc;
>> import org.apache.pig.data.Tuple;
>> import org.apache.pig.data.TupleFactory;
>> import org.json.JSONArray;
>> import org.json.JSONException;
>> import org.json.JSONObject;
>>
>>
>> public class JsonParser extends EvalFunc<Tuple> {
>>    @Override
>>    public Tuple exec(Tuple input) throws IOException {
>>        TupleFactory tf = TupleFactory.getInstance();
>>        Tuple t = tf.newTuple();
>>
>>
>>        if ( input.get(0) != null ){
>>            String inString = (String) input.get(0);
>>            try {
>>                JSONObject jsn = new JSONObject(inString);
>>                t.append(getJsonArr(jsn));
>>            } catch (JSONException e) {
>>                e.printStackTrace();
>>            }
>>        }
>>        return t;
>>    }
>>
>>    private String getJsonArr(JSONObject jsn) {
>>        String jsnArrVal = "";
>>
>>        try {
>>            if (!jsn.has("jsonKey"))
>>                return null;
>>            JSONArray jTagArray = jsn.getJSONArray("jsonKey");
>>            for (int i=0; i<jTagArray.length(); i++){
>>                JSONObject hst = jTagArray.getJSONObject(i);
>>                // Prepend this element's "text" value to the accumulated string.
>>                jsnArrVal = hst.getString("text") + jsnArrVal;
>>            }
>>        } catch (JSONException e) {
>>            // TODO Auto-generated catch block
>>            e.printStackTrace();
>>        }
>>        return jsnArrVal;
>>    }
>> }
>>
>>
>> On Mon, Nov 19, 2012 at 11:35 AM, Russell Jurney <[EMAIL PROTECTED]> wrote:
>>
>>> Ok, it's even worse. My data is a big array.
>>>
>>> Am I being negative in saying that JSON and Pig is like a nightmare?
>>>
>>>
>>> On Mon, Nov 19, 2012 at 2:33 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
>>>
>>>> Wait... com.twitter.elephantbird.pig.load.JsonLoader() does not infer the
>>>> schema from a record. This is what I was looking for. Looks like I have to
>>>> write that myself.
>>>>
>>>> And yes, I understand the tradeoffs in doing so. Assuming a sample is the
>>>> overall schema is a big assumption.
>>>>
>>>>
>>>>
>>>> On Mon, Nov 19, 2012 at 2:30 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Talking to myself... never mind, guava and json-simple are included with
>>>>> Pig.
>>>>>
>>>>>
>>>>> On Mon, Nov 19, 2012 at 2:27 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Got it building. Are google collections and json-simple external deps?
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 19, 2012 at 11:23 AM, Russell Jurney <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>> It seems that everyone can build elephant-bird but me: