Hive, mail # user - Mapping HBase table in Hive

Re: Mapping HBase table in Hive
bejoy_ks@... 2013-01-08, 18:35
Hi Ibrahim

The Hive-HBase integration depends entirely on the schema of the HBase table, not on the schema of the source table in MySQL.

You need to provide the ColumnFamily:Qualifier mapping in the Hive table definition.

Get the HBase table's schema from the hbase shell.

suppose you have the schema as

You need to match each of these ColumnFamily:Qualifier pairs to the corresponding column in Hive.

So in hbase.columns.mapping you need to list these CF:QL entries in the same order as the Hive columns they map to.
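
As a rough sketch only, here is what the mapping could look like for the orders table from the question. It assumes that the import used the MySQL primary key as the HBase row key and stored the other three columns as qualifiers named value, date_lastchange and date_inserted under the column family "id". Those qualifier names are a guess, so verify them in the hbase shell before using this. The entries are positional (the first entry maps to the first Hive column, and so on), and the mapping string is written without whitespace between entries.

```sql
-- Assumption: row key = MySQL id; qualifiers value, date_lastchange and
-- date_inserted live under column family "id". Verify in the hbase shell with
--   describe 'orders'
--   scan 'orders', {LIMIT => 1}
CREATE EXTERNAL TABLE hbase_orders (
  id              bigint,   -- mapped to :key (the HBase row key)
  value           bigint,   -- mapped to id:value
  date_lastchange string,   -- mapped to id:date_lastchange
  date_inserted   string    -- mapped to id:date_inserted
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,id:value,id:date_lastchange,id:date_inserted"
)
TBLPROPERTIES ("hbase.table.name" = "orders");
```

If the scan shows different qualifier names, only the part after "id:" in each entry needs to change.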

If you need to map an entire CF to a single Hive column, the data type of that Hive column must be a Map.
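
A minimal sketch of the whole-CF case, again assuming the column family is "id" and the values are stored as strings. The table name hbase_orders_cf and the column name fields are made up for illustration. A mapping entry with a column family and an empty qualifier ("id:") maps every qualifier in that family into one Map column, keyed by qualifier name.

```sql
-- Assumption: column family "id" holds string-encoded values.
-- hbase_orders_cf and fields are hypothetical names.
CREATE EXTERNAL TABLE hbase_orders_cf (
  id     bigint,               -- mapped to :key (the HBase row key)
  fields map<string,string>    -- mapped to the whole "id" column family
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,id:"
)
TBLPROPERTIES ("hbase.table.name" = "orders");

-- Individual qualifiers are then read by key from the map.
SELECT id, fields['value'] FROM hbase_orders_cf LIMIT 10;
```

This form is handy when rows do not all carry the same qualifiers, at the cost of losing per-column types in Hive.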

You can find the detailed HBase-Hive integration documentation on the Hive wiki.
Bejoy KS

Sent from remote device, Please excuse typos

-----Original Message-----
From: Ibrahim Yakti <[EMAIL PROTECTED]>
Date: Tue, 8 Jan 2013 15:45:32
Subject: Mapping HBase table in Hive


Suppose I have the following table (orders) in MySQL:

*************************** 1. row ***************************
  Field: id
   Type: int(10) unsigned
   Null: NO
    Key: PRI
Default: NULL
  Extra: auto_increment
*************************** 2. row ***************************
  Field: value
   Type: int(10) unsigned
   Null: NO
Default: NULL
*************************** 3. row ***************************
  Field: date_lastchange
   Type: timestamp
   Null: NO
  Extra: on update CURRENT_TIMESTAMP
*************************** 4. row ***************************
  Field: date_inserted
   Type: timestamp
   Null: NO
Default: 0000-00-00 00:00:00

I imported it into HBase with the column family "id".

I want to create an external table in Hive to query the HBase table, but I
cannot work out the mapping parameter (*hbase.columns.mapping*); I find it
confusing. Could anybody explain it to me, please? I used the following:

CREATE EXTERNAL TABLE hbase_orders(id bigint, value bigint, date_lastchange
string, date_inserted string) STORED BY
'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES
("hbase.columns.mapping" = " ? ? ? ? ? ?") TBLPROPERTIES ("hbase.table.name"
= "orders");

Is there any way to build the Hive tables automatically, or do I have to
repeat the same process for each table?
Thanks in advance.