Hi Dalia
A sample DDL would be something like this.
CREATE EXTERNAL TABLE employee (key STRING, no STRING, name STRING, address MAP<STRING,STRING>)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,no:NUM,name:FIRST,address:")
TBLPROPERTIES ("hbase.table.name" = "employee");
Points to note:
- If you map an entire HBase column family to a Hive column, the Hive column's data type must be a map.
- If you map a single HBase CF:Qualifier, the Hive column can use a non-collection data type such as STRING.
- The entries in hbase.columns.mapping must appear in the same order as the Hive columns.
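Once the table is mounted, it can be queried like any other Hive table. A minimal sketch, assuming the employee DDL above has been run; the row key value 'emp001' and the 'city' qualifier inside the address column family are made up for illustration and may not exist in your data:

```sql
-- Hypothetical lookup against the HBase-backed "employee" table.
-- 'emp001' (row key) and 'city' (qualifier in the "address" family) are assumed values.
SELECT key,
       no,
       name,
       address['city'] AS city   -- map lookup by HBase qualifier name
FROM employee
WHERE key = 'emp001';
```

Because address is mapped to the whole column family, each qualifier in that family shows up as a key in the map.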
For complete reference:
https://cwiki.apache.org/Hive/hbaseintegration.html
Hope it helps!
Regards
Bejoy.K.S
________________________________
From: Dalia Sobhy <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Thursday, February 2, 2012 3:17 PM
Subject: RE: Important Question
Hi Bejoy,
Can you provide me with a simple example of how to mount an HBase table as a Hive table?
Thanks,
________________________________
Date: Wed, 25 Jan 2012 08:58:26 -0800
From: [EMAIL PROTECTED]
Subject: Re: Important Question
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Hi Dalia
If by complex queries you mean joins across multiple tables and so on, HBase doesn't support joins. To get the effect of a join that involved multiple tables in an RDBMS, you should, based on your requirements, choose suitable column families and qualifiers in a single HBase table to accommodate those RDBMS tables. I haven't played much with HBQL, but if you are developing an API you can depend on the HBase Java API internally for storage and retrieval of records. In HBase, query (retrieval) time largely depends on how you design the row key and column families (HBase stores each column family's data together, and keeps row keys sorted and distributed across regions). If you want SQL-like querying over an HBase table, you have to mount it as a corresponding Hive table.
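As a rough illustration of folding several RDBMS tables into one HBase table, here is a sketch of a Hive mapping over a single HBase table with two column families. Everything in it is hypothetical: the table name patient, the column families info and admission, and the qualifiers used in the query are assumptions for illustration, not a recommended schema.

```sql
-- Hypothetical: one HBase table "patient" replaces separate RDBMS
-- "patient" and "admission" tables; each becomes a column family.
-- EXTERNAL means the HBase table must already exist.
CREATE EXTERNAL TABLE patient (
  key       STRING,               -- HBase row key, e.g. a patient id
  info      MAP<STRING,STRING>,   -- whole "info" column family
  admission MAP<STRING,STRING>    -- whole "admission" column family
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:,admission:")
TBLPROPERTIES ("hbase.table.name" = "patient");

-- What would have been a join in the RDBMS becomes a single-row read.
-- 'name' and 'ward' are assumed qualifiers.
SELECT key, info['name'], admission['ward']
FROM patient;
```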
In my personal experience, I have used HBase tables for real-time data storage and retrieval in a Hadoop enterprise application. Scheduled MapReduce jobs ran during off-peak hours and dumped the required data (formatted and filtered) from the HBase table into HDFS, and from there Hive consumed the data for analytical purposes. We had a good number of analytical jobs and didn't want to choke the HBase servers during peak hours, so the mining and analytics part was moved completely to Hive.
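The jobs described above were MapReduce jobs. A similar off-peak dump can also be sketched in Hive itself by copying from an HBase-backed table into a native Hive table. This is a swapped-in alternative, not the setup described above. It assumes the employee mapping from the DDL at the top of this thread, and the name employee_snapshot is hypothetical:

```sql
-- Hypothetical native (HDFS-backed) Hive table that holds the snapshot.
CREATE TABLE IF NOT EXISTS employee_snapshot (
  key     STRING,
  no      STRING,
  name    STRING,
  address MAP<STRING,STRING>
);

-- Run during off-peak hours: scans HBase once and rewrites the snapshot.
INSERT OVERWRITE TABLE employee_snapshot
SELECT key, no, name, address
FROM employee;
```

Analytical queries can then run against employee_snapshot without touching the HBase region servers.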
Regards
Bejoy.K.S
________________________________
From: Dalia Sobhy <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Wednesday, January 25, 2012 10:00 PM
Subject: Re: Important Question
So what about HBQL?
And if I had complex queries, would I get stuck with HBase?
Also, can anyone provide me with examples of an RDBMS table transformed into HBase, with real-time queries and analytical processing?
On 2012-01-25, at 6:15 PM, [EMAIL PROTECTED] wrote:
> Real time? Definitely not Hive. Go for HBase, but don't expect HBase to be as flexible as an RDBMS. You need to choose your row key and column families wisely, according to your requirements.
> For data mining and analytics, you can mount a Hive table over the corresponding HBase table and run SQL-like queries on it.
>
>
>
> Regards
> Bejoy K S
>
> -----Original Message-----
> From: Dalia Sobhy <[EMAIL PROTECTED]>
> Date: Wed, 25 Jan 2012 17:01:08
> To: <us [EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> Subject: Important Question
>
>
> Dear all,
> I am developing an API for medical use, i.e. hospital admissions and everything about patients, so transactions, queries, and real-time data are important here.
> Therefore both real-time and analytical processing are a must.