Hive >> mail # user >> Using JSON Data with Hive


Re: Using JSON Data with Hive
Hi Michael,
There are lots of possibilities with Hive and JSON (and with HBase too,
I'm sure).

Since you are just starting out, by the sounds of it, I would recommend
(and this is only my opinion) storing your JSON objects, however deeply
nested, one per line in a Hive table.
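
Since the files in question each hold one JSON array rather than one object per line, a small preprocessing step can produce the one-per-line layout. The following is only a minimal sketch under stated assumptions: the function name `array_file_to_lines` and the file paths are hypothetical, and it assumes each input file is small enough to load into memory and contains a single top-level array.

```python
import json
from pathlib import Path


def array_file_to_lines(src: Path, dst: Path) -> int:
    """Rewrite a file holding one JSON array as one JSON object per line.

    Hypothetical helper for illustration; returns the number of events written.
    """
    events = json.loads(src.read_text(encoding="utf-8"))
    if not isinstance(events, list):
        raise ValueError(f"{src} does not contain a top-level JSON array")
    with dst.open("w", encoding="utf-8") as out:
        for event in events:
            # json.dumps escapes embedded newlines, so each event stays on
            # exactly one physical line regardless of nesting depth.
            out.write(json.dumps(event, separators=(",", ":")) + "\n")
    return len(events)
```

Calling something like `array_file_to_lines(Path("upload.json"), Path("upload.jsonl"))` (file names assumed) would give a file whose lines can each be loaded as a single string value in a Hive table.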

Then look at the json_tuple() and get_json_object() Hive UDFs to
pull values out and do what you want with them.

After you've gotten used to the Hive paradigm, you might want to look
at something called a "JsonSerde" to map the JSON directly to columns
in a table, but like I said, try the approach I described above first
to familiarize yourself with Hive.

Ya gotta crawl first before you run.
On Fri, Jun 7, 2013 at 12:24 AM, Michael Duergner | Pockets United GmbH <
[EMAIL PROTECTED]> wrote:

> Hi there,
>
> I'm looking into whether we can use Hive to run our usage analytics. Our
> system currently collects data from our clients in JSON format, which
> results in multiple files per client (a new file is created every time
> analytics events are uploaded to the server). Each file holds one JSON
> array with multiple JSON objects representing the actual analytics
> events.
>
> From what I understood from the docs so far, Hive should be able to work
> with JSON data. The only difference between our data and the data I saw
> in several examples is that the actual entries are inside an array instead
> of being single lines in the file.
>
> Can I process them directly or do I need to write some custom code to
> transform the input data?
>
> Thanks
> Michael
> ___________________________
> Michael Dürgner
> Founder & CTO
> Pockets United GmbH
>
> email      [EMAIL PROTECTED]
> phone      +49 89 2155 6166-1
> mobile    +49 151 42 31 46 40 (time: CET/UTC+1h)
> mail        Dachauerstr. 241, 80637 Munich, Germany
> office      Wayra Akademie, Kaufingerstr. 15, 80331 Munich, Germany
>
> www.pocketsunited.com
>
> Split Costs, Share Fun!
>
> Managing Directors: Michael Duergner, Matthias Schicker and Markus Stiefel
> Location and Municipal Court: Munich HRB 192066
> VAT: DE277893196
>
>