Hive >> mail # user >> Loading json files into hive table is giving NULL as output(data is in s3 bucket)


Re: Loading json files into hive table is giving NULL as output(data is in s3 bucket)
Hi Dean,

I was using *hive-json-serde-0.2.jar* earlier. I have now tried
*hive-json-serde-0.3.jar*, as you suggested, and it is working fine; I am
getting the expected output.

Can you please tell me which code change between 0.2 and 0.3 could have
solved this problem?
Thanks,
Chunky.
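
For anyone hitting the same NULL output, one way to tell whether the data or the SerDe is at fault is to bypass the SerDe entirely: map each line to a single STRING column and pull the field out with Hive's built-in get_json_object. What follows is only a rough sketch. The table name raw_json_test is made up for illustration, the bucket paths are the placeholders from the original post, and it assumes Hive can read the files as plain text with one JSON object per line.

```sql
-- Hypothetical diagnostic table: no JSON SerDe involved.
-- With the default row format, a line containing no Ctrl-A delimiter
-- lands whole in the single STRING column.
CREATE EXTERNAL TABLE raw_json_test ( line STRING )
PARTITIONED BY ( dt STRING )
LOCATION 's3://BUCKET_NAME/test_data/';

-- Register the same partition directory used in the original post.
ALTER TABLE raw_json_test ADD PARTITION (dt='13Feb2012')
LOCATION 's3n://BUCKET_NAME/test_data/13Feb2012';

-- Extract the "_u" key from each raw JSON line.
SELECT get_json_object(line, '$._u') AS uname, dt
FROM raw_json_test;
```

If this query returns the names while the SerDe-backed table still returns NULL, the files themselves are readable and the problem lies in the SerDe or its property mapping.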

On Mon, Feb 18, 2013 at 8:47 PM, Chunky Gupta <[EMAIL PROTECTED]> wrote:

> Hi Dean,
>
> I tried removing the underscore too and got the same output, which
> means the problem is not with the underscore. Yes, it was an example.
>
> The actual json file looks like this :-
>
>
> {"colnamec":"ColNametest","colnamets":"2013-01-14","colnameip":"10.10.10.10","colnameid":"10","colnameid2":"100","colnamep":0,"colnamecp":0,"colnamep":1,"colnameed":"31509","colnamesw":0,"colnamesu2":3,"colnameqq":"0","colnameppaa":0,"colnameqwe1":0,"colnamerty2":0,"colnameiop":"1000","colnamebnm":"23425253RFDSE","colnamefgh":2,"colnameagl":"","colnameyhgb":["1234","12345","2345","56789"],"colnamepoix":["12","4567","123","5678"],"colnamedswer":["100","567","123","678"],"colnamewerui":["10","10","10","10"]}
>
> I tried extracting only one column, as I mentioned in my last mail.
>
> Some values are not in double quotes, some are null, and some keys
> have multiple values.
> Dean, is this json file valid for Hive to handle?
>
> Thanks,
> Chunky.
>
>
>
>
>
> On Mon, Feb 18, 2013 at 6:23 PM, Dean Wampler <
> [EMAIL PROTECTED]> wrote:
>
>> "uname"="$._u" is the correct form. We also hacked on this SerDe at
>> Think Big Analytics. I don't know if you'll see an improvement though.
>>
>> https://github.com/thinkbiganalytics/hive-json-serde
>>
>> I wonder if there's a problem handling the leading underscore?
>>
>> Also, I know it's just an example, but in case it was taken from a real
>> situation, the dates in your example data are for January, while the
>> partition is named 13Feb2012.
>>
>> dean
>>
>> On Mon, Feb 18, 2013 at 6:43 AM, Chunky Gupta <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> I have data in an s3 bucket; it is in json format and is a zip file. I
>>> have added this jar file in the hive console :-
>>>
>>> http://code.google.com/p/hive-json-serde/downloads/detail?name=hive-json-serde-0.2.jar&can=2&q
>>>
>>> I tried the following steps to create the table and load the data :-
>>>
>>> 1. CREATE EXTERNAL TABLE table_test ( uname STRING )
>>>    PARTITIONED BY ( dt STRING )
>>>    ROW FORMAT SERDE "org.apache.hadoop.hive.contrib.serde2.JsonSerde"
>>>    WITH SERDEPROPERTIES ( "uname"="$._u" )
>>>    LOCATION 's3://BUCKET_NAME/test_data/';
>>>
>>>    I also tried this :-
>>>
>>>    CREATE EXTERNAL TABLE table_test ( uname STRING )
>>>    PARTITIONED BY ( dt STRING )
>>>    ROW FORMAT SERDE "org.apache.hadoop.hive.contrib.serde2.JsonSerde"
>>>    WITH SERDEPROPERTIES ( "uname"="_u" )
>>>    LOCATION 's3://BUCKET_NAME/test_data/';
>>>
>>>
>>>
>>> 2. alter table table_test add partition (dt='13Feb2012') location
>>> 's3n://BUCKET_NAME/test_data/13Feb2012';
>>>
>>> and json file is like this :-
>>> -------------------------------------
>>> {"_u":"test_name1","_ts":"2012-01-13","_ip":"IP1"}
>>> {"_u":"test_name2","_ts":"2012-01-13","_ip":"IP2"}
>>> {"_u":"test_name3","_ts":"2012-01-13","_ip":"IP3"}
>>>
>>>
>>> When I query :-
>>> select uname from table_test;
>>>
>>> Output :-
>>> NULL 13Feb2012
>>> NULL 13Feb2012
>>> NULL 13Feb2012
>>>
>>>
>>> Please help me and let me know how to add json data in a table.
>>>
>>> Thanks,
>>> Chunky.
>>>
>>
>>
>>
>> --
>> *Dean Wampler, Ph.D.*
>> thinkbiganalytics.com
>> +1-312-339-1330
>>
>>
>