Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Loading json files into hive table is giving NULL as output(data is in s3 bucket)


Copy link to this message
-
Re: Loading json files into hive table is giving NULL as output(data is in s3 bucket)
I don't know what changed. I didn't work on this SerDe myself. I'm glad
it's working now for you, though!

dean

On Mon, Feb 18, 2013 at 9:31 AM, Chunky Gupta <[EMAIL PROTECTED]>wrote:

> Hi Dean,
>
> I was using *hive-json-serde-0.2.jar* earlier. Now I tried *hive-json-serde-0.3.jar
> *as you suggested and it is working fine, I am getting the output as
> expected.
>
> Can you please tell me that what code change from 0.2 to 0.3 could have
> solved this problem ?
>
>
> Thanks,
> Chunky.
>
>
> On Mon, Feb 18, 2013 at 8:47 PM, Chunky Gupta <[EMAIL PROTECTED]>wrote:
>
>> Hi Dean,
>>
>> I tried with removing underscore too, and getting the same output which
>> means problem is not with underscore. Yes, it was an example.
>>
>> Actual json file is like :-
>>
>>
>> {"colnamec":"ColNametest","colnamets":"2013-01-14","colnameip":"10.10.10.10","colnameid":"10","colnameid2":"100","colnamep":0,"colnamecp":0,"colnamep":1,"colnameed":"31509","colnamesw":0,"colnamesu2":3,"colnameqq":"0","colnameppaa":0,"colnameqwe1":0,"colnamerty2":0,"colnameiop":"1000","colnamebnm":"23425253RFDSE","colnamefgh":2,"colnameagl":"","colnameyhgb":["1234","12345","2345","56789"],"colnamepoix":["12","4567","123","5678"],"colnamedswer":["100","567","123","678"],"colnamewerui":["10","10","10","10"]}
>>
>> I tried extracting one column only as I mentioned in last mail.
>>
>> There are values not in double quotes, some are null and some keys are
>> having multiple values.
>> Dean, is this json file correct for HIVE to handle it ?
>>
>> Thanks,
>> Chunky.
>>
>>
>>
>>
>>
>> On Mon, Feb 18, 2013 at 6:23 PM, Dean Wampler <
>> [EMAIL PROTECTED]> wrote:
>>
>>> The "uname="$._u" is the correct form. We also hacked on this SerDe at
>>> Think Big Analytics. I don't know if you'll see an improvement though.
>>>
>>> https://github.com/thinkbiganalytics/hive-json-serde
>>>
>>> I wonder if there's a problem handling the leading underscore?
>>>
>>> Also, I know it's just an example, but in case it was taken from a real
>>> situation, the dates in your example are for January.
>>>
>>> dean
>>>
>>> On Mon, Feb 18, 2013 at 6:43 AM, Chunky Gupta <[EMAIL PROTECTED]>wrote:
>>>
>>>> Hi,
>>>>
>>>> I have data in s3 bucket, which is in json format and is a zip file. I
>>>> have added this jar file in hive console :-
>>>>
>>>> http://code.google.com/p/hive-json-serde/downloads/detail?name=hive-json-serde-0.2.jar&can=2&q>>>>
>>>> I tried the following steps to create table and load data :-
>>>>
>>>> 1. CREATE EXTERNAL TABLE table_test ( uname STRING ) PARTITIONED BY (dt
>>>> STRING ) ROW FORMAT SERDE "org.apache.hadoop.hive.contrib.serde2.JsonSerde"
>>>> WITH SERDEPROPERTIES ( "uname"="$._u" ) LOCATION
>>>> 's3://BUCKET_NAME/test_data/'
>>>>
>>>>            I tried this also :-
>>>>
>>>> CREATE EXTERNAL TABLE table_test ( uname STRING ) PARTITIONED BY (dt
>>>> STRING ) ROW FORMAT SERDE "org.apache.hadoop.hive.contrib.serde2.JsonSerde"
>>>> WITH SERDEPROPERTIES ( "uname"="_u" ) LOCATION
>>>> 's3://BUCKET_NAME/test_data/'
>>>>
>>>>
>>>>
>>>> 2. alter table table_test add partition (dt='13Feb2012') location
>>>> 's3n://BUCKET_NAME/test_data/13Feb2012';
>>>>
>>>> and json file is like this :-
>>>> -------------------------------------
>>>> {"_u":"test_name1","_ts":"2012-01-13","_ip":"IP1"}
>>>> {"_u":"test_name2","_ts":"2012-01-13","_ip":"IP2"}
>>>> {"_u":"test_name3","_ts":"2012-01-13","_ip":"IP3"}
>>>>
>>>>
>>>> When I query :-
>>>> select uname from table_test;
>>>>
>>>> Output :-
>>>> NULL 13Feb2012
>>>> NULL 13Feb2012
>>>> NULL 13Feb2012
>>>>
>>>>
>>>> Please help me and let me know how to add json data in a table.
>>>>
>>>> Thanks,
>>>> Chunky.
>>>>
>>>
>>>
>>>
>>> --
>>> *Dean Wampler, Ph.D.*
>>> thinkbiganalytics.com
>>> +1-312-339-1330
>>>
>>>
>>
>
--
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330