Hive user mailing list: Hive select shows null after successful data load


Sunita Arvind 2013-06-18, 18:24
Nitin Pawar 2013-06-18, 18:26
Sunita Arvind 2013-06-18, 20:29
Nitin Pawar 2013-06-18, 20:35
Sunita Arvind 2013-06-18, 22:58
Stephen Sprague 2013-06-19, 00:38
Sunita Arvind 2013-06-19, 01:35
Sunita Arvind 2013-06-19, 01:58
Richa Sharma 2013-06-19, 06:17
Sunita Arvind 2013-06-19, 11:34
Stephen Sprague 2013-06-19, 14:38
Sunita Arvind 2013-06-19, 15:24
Ramki Palle 2013-06-19, 16:11
Sunita Arvind 2013-06-19, 17:00
Re: Hive select shows null after successful data load
try_parsed_json is not trivial imho :)

Start with the very, very basic case, for example { "jobs" : "foo" }, and get
that to work first. :)  When that works, add a level of nesting and see
what happens.  Keep building on it until you either break it (and then you
know that the last thing you added broke it and can concentrate on that) or
you have worked out all the bugs and your final example works.
Nothing fancy here, just old-school trial and error.
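
A minimal sketch of that build-up, in Python. The field names echo the queries
quoted below; the values, the file names and the helper names are made up for
illustration. It also assumes the SerDe reads each line of the file as one
record, so every sample is written as a single line of JSON.

```python
import json

# Hypothetical samples: each one adds a single level of nesting to the last.
# Field names mirror the queries in this thread; the values are invented.
STEPS = [
    {"jobs": "foo"},
    {"jobs": {"total": 1}},
    {"jobs": {"values": [{"id": 1}]}},
    {"jobs": {"values": [{"id": 1, "position": {"title": "engineer"}}]}},
]


def to_record(obj):
    """Serialize one object as a single line of compact JSON."""
    return json.dumps(obj, separators=(",", ":"))


def write_steps(prefix="step"):
    """Write each sample to its own file, one JSON object per line."""
    paths = []
    for i, obj in enumerate(STEPS):
        path = f"{prefix}{i}.json"
        with open(path, "w") as fh:
            fh.write(to_record(obj) + "\n")
        paths.append(path)
    return paths


if __name__ == "__main__":
    for path in write_steps():
        print(path)
```

Load the files one at a time into a table whose schema matches that step; the
first file that comes back NULL points at the piece of structure to look at.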

An alternative I keep bringing up when native semantics don't go one's way
is the transform() function.  Use Python, Perl, Ruby or whatever to parse
the JSON and go nuts with the rich features of said language.  Just write
your output to stdout as a delimited serialization of what you want to
store and that's it.  That would be another way to get your scalars, arrays
and structs to work.
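
A rough sketch of such a script, in Python. It assumes each input line is one
complete JSON document shaped like the data queried below (jobs.values[] with
an id and a position.title); the function names are made up, and the output is
tab-delimited because that is what Hive expects back from a transform script
by default, with \N standing in for NULL.

```python
import json
import sys


def rows(line):
    """Return a list of (id, title) tuples for one JSON record."""
    try:
        doc = json.loads(line)
    except ValueError:
        return []  # not valid JSON: emit nothing for this line
    jobs = doc.get("jobs") if isinstance(doc, dict) else None
    values = jobs.get("values") if isinstance(jobs, dict) else None
    if not isinstance(values, list):
        return []
    out = []
    for job in values:
        if not isinstance(job, dict):
            continue
        position = job.get("position")
        title = position.get("title") if isinstance(position, dict) else None
        out.append((job.get("id"), title))
    return out


def fmt(value):
    """Render one cell; \\N is read back by Hive as NULL."""
    if value is None:
        return r"\N"
    # Tabs inside a value would be taken as column breaks, so flatten them.
    return str(value).replace("\t", " ")


def main():
    for line in sys.stdin:
        for row in rows(line):
            print("\t".join(fmt(v) for v in row))


if __name__ == "__main__":
    main()
```

On the Hive side this would be wired up with something along the lines of
ADD FILE parse_jobs.py; SELECT TRANSFORM(line) USING 'python parse_jobs.py'
AS (id, title) FROM raw_jobs; where parse_jobs.py, raw_jobs and line are
hypothetical names for the script, a table holding the raw text, and its
single string column.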

Don't give up on the JsonSerde yet, though! :)  It's probably something very
easy that we just can't see.
On Wed, Jun 19, 2013 at 10:00 AM, Sunita Arvind <[EMAIL PROTECTED]> wrote:

> Thanks for looking into it Ramki.
> Yes I had tried these options. Here is what I get (renamed the table to
> have a meaningful name):
>
> hive> select jobs.values[1].id from linkedinjobsearch;
> ......mapreduce task details....
> OK
> NULL
> Time taken: 9.586 seconds
>
>
> hive> select jobs.values[0].position.title from linkedinjobsearch;
>  Total MapReduce jobs = 1
> Launching Job 1 out of 1
>
> OK
> NULL
> Time taken: 9.617 seconds
>
>
> I am trying to connect btrace to the process to be able to trace the code
> but can't get it to respond. Here is what I tried:
>
> [sunita@node01 ~]$ hive --debug, recursive=y, port=7000,mainSuspend=y,
> childSuspend=y
> ERROR: Cannot load this JVM TI agent twice, check your java command line
> for duplicate jdwp options.
> Error occurred during initialization of VM
> agent library failed to init: jdwp
>
> I tried changing the port also. Any ideas regarding debuggers that can be
> used? I also tried EXPLAIN on the query, and that does not show any issues either.
>
> regards
> Sunita
>
> On Wed, Jun 19, 2013 at 12:11 PM, Ramki Palle <[EMAIL PROTECTED]> wrote:
>
>> Can you run some other queries against the jobs1 table and see if any query
>> returns some data?
>>
>> I am guessing your query "select jobs.values.position.title from jobs1;"
>> may have some issue. Maybe it should be
>>
>> select jobs.values[0].position.title from jobs1;
>>
>>
>> Regards,
>> Ramki.
>>
>>
>> On Wed, Jun 19, 2013 at 8:24 AM, Sunita Arvind <[EMAIL PROTECTED]> wrote:
>>
>>> Thanks Stephen,
>>>
>>> That's just what I tried with the try_parsed table. It is exactly the same
>>> data with less nesting in the structure and fewer entries.
>>> Do you mean to say that highly nested JSON can lead to issues? What are the
>>> typical solutions to such issues? Write UDFs in Hive or parse the JSON into
>>> a delimited file?
>>> I have heard of custom SerDes also. Not sure if UDFs and custom SerDes
>>> are one and the same.
>>>
>>> regards
>>> Sunita
>>>
>>>
>>> On Wed, Jun 19, 2013 at 10:38 AM, Stephen Sprague <[EMAIL PROTECTED]> wrote:
>>>
>>>> I think you might have to start small here instead of going for the
>>>> home run on the first swing.  When all else fails, start with a trivial JSON
>>>> object and then build up from there and see what additional step breaks
>>>> it.  That way, if the trivial example fails, you know it is something
>>>> fundamental and not the complexity of your JSON object that's throwing
>>>> things off.
>>>>
>>>>
>>>> On Wed, Jun 19, 2013 at 4:34 AM, Sunita Arvind <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Thanks for sharing your experience Richa.
>>>>> I do have timestamps but in the format of year : INT, day : INT, month
>>>>> : INT.
>>>>> As per your suggestion, I changed them all to string, but still get
>>>>> null as the output.
>>>>>
>>>>> regards
>>>>> Sunita
>>>>>
>>>>>
>>>>> On Wed, Jun 19, 2013 at 2:17 AM, Richa Sharma <
>>>>> [EMAIL PROTECTED]> wrote:
>>>>>
>
Sunita Arvind 2013-06-19, 19:29
Sunita Arvind 2013-06-20, 02:54
Stephen Sprague 2013-06-20, 14:19
Sunita Arvind 2013-06-21, 08:42