Hive, mail # user - Mapping existing HBase table with many columns to Hive.


David Koch 2012-12-06, 18:56
Re: Mapping existing HBase table with many columns to Hive.
kulkarni.swarnim@...) 2012-12-06, 20:10
Hi David,

First of all, your columns are not "long". They are binary as well.
As Hive currently stands, there is no support for binary qualifiers.
However, I recently submitted a patch for that[1]. Feel free to give it a
shot and let me know if you see any issues. With that patch, you can
give your qualifiers to Hive directly, exactly as they appear here
(\x00\x00\x01;2\xE6Q\x06).
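
To illustrate why the qualifier shown above is binary rather than a readable
"long", here is a minimal Python sketch that decodes it. It assumes the
qualifier was written with Bytes.toBytes(long) as David describes, i.e. as 8
big-endian bytes; the helper name qualifier_to_long is hypothetical and only
used for this illustration.

```python
import struct

def qualifier_to_long(raw: bytes) -> int:
    """Decode 8 big-endian bytes into a signed 64-bit integer.

    Assumption: this is the layout produced by Bytes.toBytes(long).
    """
    if len(raw) != 8:
        raise ValueError(f"expected 8 bytes, got {len(raw)}")
    return struct.unpack(">q", raw)[0]

# Qualifier copied from the HBase shell output in the quoted message.
# The shell prints printable bytes as ASCII (';', '2', 'Q') and the rest as \xNN.
qualifier = b"\x00\x00\x01;2\xE6Q\x06"

millis = qualifier_to_long(qualifier)
print(len(qualifier), millis)
```

Read as text, the same 8 bytes are mostly unprintable, which is why a
string-typed mapping in Hive does not show them as numbers.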

Until then, your only option is to use a Hive map that holds all of the
columns under the column family "t". For example:
CREATE EXTERNAL TABLE hbase_table_1(key int, value map<string,string>)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,t:")
TBLPROPERTIES("hbase.table.name" = "some_existing_table");
As for your row key, it is a composite key. There is an existing patch
that adds support for that as well[2].
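
As a rough sketch of what "composite" means here, the 16-byte row key from
the quoted scan output can be split back into its two parts in Python. This
assumes the two longs were written big-endian, high half first, which the
original message does not state explicitly; split_row_key is a hypothetical
helper name.

```python
import struct
import uuid

def split_row_key(raw: bytes) -> tuple:
    """Split a 16-byte row key into two signed 64-bit integers.

    Assumption: both halves are big-endian and stored high half first.
    """
    if len(raw) != 16:
        raise ValueError(f"expected 16 bytes, got {len(raw)}")
    return struct.unpack(">qq", raw)

# Row key copied from the HBase shell output in the quoted message.
row_key = b"\x00\x00\x06\xB1H\x89N\xC3\xA5\x83\x0F\xDD\x1E\xAE&\xDC"

high, low = split_row_key(row_key)
print(high, low)

# Under the same byte-order assumption, the 16 bytes read back as a UUID.
print(uuid.UUID(bytes=row_key))
```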
Hope that helps.
[1] https://issues.apache.org/jira/browse/HIVE-3553
[2] https://issues.apache.org/jira/browse/HIVE-2599
On Thu, Dec 6, 2012 at 12:56 PM, David Koch <[EMAIL PROTECTED]> wrote:

> Hello,
>
> How can I map an HBase table with the following layout to Hive using the
> "CREATE EXTERNAL TABLE" command from shell (or another programmatic way):
>
> The HBase table's layout is as follows:
> Rowkey: 16 bytes, a UUID with the "-" removed and the 32 hex chars
> converted into two 8-byte longs.
> Columns (qualifiers): timestamps, i.e. the bytes of a long which were
> converted using HBase's Bytes.toBytes(long). There can be many of those in
> a single row.
> Values: the bytes of a Java string.
>
> I am unsure of which datatypes to use. I am pretty sure there is no way I
> can sensibly map the row key to anything other than "binary", but maybe the
> columns (which are longs) and the values (which are strings) can be mapped
> to their corresponding Hive datatypes.
>
> I include an extract of what a row looks like in HBase shell below:
>
> Thank you,
>
> /David
>
> hbase(main):009:0> scan "hits"
> ROW                                                      COLUMN+CELL
>  \x00\x00\x06\xB1H\x89N\xC3\xA5\x83\x0F\xDD\x1E\xAE&\xDC  column=t:\x00\x00\x01;2\xE6Q\x06, timestamp=1267737987733, value=blahaha
>  \x00\x00\x06\xB1H\x89N\xC3\xA5\x83\x0F\xDD\x1E\xAE&\xDC  column=t:\x00\x00\x01;2\xE6\xFB@, timestamp=1354012104967, value=testtest
>

--
Swarnim
David Koch 2012-12-06, 20:23
David Koch 2012-12-10, 00:03
kulkarni.swarnim@...) 2012-12-10, 00:52