Hive >> mail # user >> Very poor read performance with composite keys in hbase


Re: Very poor read performance with composite keys in hbase
Can you show your query that is taking 700 seconds?
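
To make the full-scan hypothesis from the quoted message concrete, here is a rough sketch in plain Python (not Hive or HBase code). It assumes the three struct fields are stored as one '~'-delimited row key string, as the table definition suggests. The sample keys and the helper names make_row_key, build_keys, point_lookup and scan_with_field_filter are made up for illustration. Whether Hive actually behaves like the second function for the query in question is not established here.

```python
SEP = "~"  # assumed delimiter, taken from COLLECTION ITEMS TERMINATED BY '~'


def make_row_key(name, date_created, uid):
    """Join the three key components into a single row key string."""
    return SEP.join((name, date_created, uid))


def build_keys(days=30, users=1000):
    """Hypothetical sample data: a sorted list of composite row keys."""
    return sorted(
        make_row_key("login", "2013-04-%02d" % day, "u%04d" % user)
        for day in range(1, days + 1)
        for user in range(users)
    )


def point_lookup(sorted_keys, row_key):
    """Binary search on the full key; returns (match_or_None, keys_examined)."""
    lo, hi, examined = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        examined += 1
        if sorted_keys[mid] < row_key:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_keys) and sorted_keys[lo] == row_key
    return (sorted_keys[lo] if found else None), examined


def scan_with_field_filter(sorted_keys, name, date_created, uid):
    """Split every key and compare fields; deliberately visits every row."""
    match, examined = None, 0
    for key in sorted_keys:
        examined += 1
        if key.split(SEP) == [name, date_created, uid]:
            match = key
    return match, examined


if __name__ == "__main__":
    keys = build_keys()
    target = ("login", "2013-04-15", "u0500")
    print(point_lookup(keys, make_row_key(*target)))
    print(scan_with_field_filter(keys, *target))
```

Both functions find the same row, but the lookup on the full key examines roughly log2(N) keys while the field filter examines all N. If the slow query filters on the struct fields individually, that would be consistent with the second pattern.
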
On Tue, Apr 30, 2013 at 12:48 PM, Rupinder Singh <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I have an HBase cluster with a table that uses a composite key. I map
> this table to a Hive external table, which I use to insert data into and
> select data from the HBase table:
>
> CREATE EXTERNAL TABLE event(key
> struct<name:string,dateCreated:string,uid:string>, {more columns here})
> ROW FORMAT DELIMITED
> COLLECTION ITEMS TERMINATED BY '~'
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, other columns ")
> TBLPROPERTIES ("hbase.table.name" = "event");
>
> The table has about 10 million rows. When I do a select * using all 3
> components of the key, essentially selecting just 1 row, the response time
> is almost 700 sec, which seems pretty bad.
>
> For comparison, I created another table with a simple string key and
> otherwise the same columns. The key is a string UUID. The table has the
> same number of column families and the same number of rows.
>
> CREATE EXTERNAL TABLE test_event(key string, blah blah…..
> TBLPROPERTIES ("hbase.table.name" = "test_event");
>
> When I select a single row from this table by doing select * where
> key='something', the response time is 35 sec.
>
> This seems to indicate that with composite keys a full table scan is
> happening, which seems weird.
>
> What am I missing here? Is there something special I need to do to get
> good read performance if I am using composite keys?
>
> Insert performance in both cases is comparable and as expected.
>
> Any help is appreciated.
>
> Here is the env spec:
>
> Amazon EMR
>
> HBase cluster: 3 core nodes with 7.5 GB RAM each, 2 CPUs of 2.2 GHz each.
> Master: 7.5 GB RAM, 2 CPUs of 2.2 GHz each
>
> Hive cluster: 3 core nodes with 3.75 GB RAM each, 1 CPU of 1.8 GHz each.
> Master: 3.75 GB RAM, 1 CPU of 1.8 GHz
>
> Thanks
>
> Rupinder
>
--
Swarnim