Hive >> mail # user >> Composite Key Handling in Hbase + Hive Integration


Re: Composite Key Handling in Hbase + Hive Integration
Try something like this:

CREATE EXTERNAL TABLE hbase_table_1(key struct<a:string,b:string,c:string>,
value string)
ROW FORMAT DELIMITED
COLLECTION ITEMS TERMINATED BY '~'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,test-family:test-qual")
TBLPROPERTIES ("hbase.table.name" = "SIMPLE_TABLE");

Basically, what you are doing here is modeling the composite key as a struct
and telling Hive that the components of the key are separated by a "~". After
that, to GROUP BY any component of your composite key, you simply run a query
like:

SELECT key.a, COUNT(*) FROM hbase_table_1 GROUP BY key.a;

This should give you your desired result.
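
To make the struct mapping a bit more concrete, here is a rough sketch of a few queries against the table defined above. It assumes that hbase_table_1 was created with the DDL above, that the row keys in SIMPLE_TABLE really look like A~B~C, and that the column test-family:test-qual exists. The literal 'A' and the alias row_count are only placeholders for illustration, not anything from your data.

```sql
-- Confirm that Hive exposes the row key as a struct with fields a, b and c.
DESCRIBE hbase_table_1;

-- Count the rows per first key component (the "A" part of A~B~C).
SELECT key.a, COUNT(*) AS row_count
FROM hbase_table_1
GROUP BY key.a;

-- Read the individual key components next to the mapped value column.
-- 'A' is a placeholder; substitute a real first component from your table.
SELECT key.a, key.b, key.c, value
FROM hbase_table_1
WHERE key.a = 'A';
```

Note that in a GROUP BY query every selected column has to be either a grouping expression or an aggregate, which is why the queries select key.a and COUNT(*) rather than *.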

Let me know if this works for you. We can then add this as a workaround on
that JIRA issue (HIVE-2599).

On Tue, Jul 24, 2012 at 2:14 AM, ankit kinra <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I have a use case in HBase + Hive integration where the HBase primary key is
> a composite key whose components we separate with a custom delimiter. So
> basically it is Key = A~B~C.
> Now, I want to run a query on this HBase table using Hive and group by
> "A" (and not the complete primary key). I went through the following
> presentation:
>
> https://docs.google.com/viewer?a=v&q=cache:GHg9GMFOZVwJ:assets.en.oreilly.com/1/event/61/HBase%2520and%2520Hive%2520at%2520StumbleUpon%2520Presentation.ppt+hbase+composite+key+hive&hl=en&gl=us&pid=bl&srcid=ADGEEShTyoUXyvXptTu4pMjje_FkaN_j1OK9wG0lclWWsKNjGreLTkk3IDqT16xO8ClqIfzhM69aeU7Gph4kZPxTS-PXvLiWPSRvgS2WEjnvViPJhpM0ItsLaTWq1DRuUgOzKhjSzIlx&sig=AHIEtbT4scO3IdtvLYG3RtLoKN5gG1udPg
>
> It says that this was implemented at StumbleUpon. Does anybody know whether
> it can be used by others?
>
> Also, there is this issue in JIRA:
> https://issues.apache.org/jira/browse/HIVE-2599 which talks about a similar
> feature.
>
> So it would be very helpful if anyone could give me some idea regarding this.
>
> Regards,
> Ankit Kinra
>
>
--
Swarnim