Hive >> mail # user >> Very poor read performance with composite keys in hbase

Re: Very poor read performance with composite keys in hbase
Can you show your query that is taking 700 seconds?
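
For reference, a lookup of the shape described in the quoted message might look like the sketch below. The literal values are hypothetical placeholders; only the table names and struct field names are taken from the DDL quoted below. Running both statements under EXPLAIN is one way to compare how Hive plans the composite-key lookup against the simple-key lookup, assuming the actual query resembles this one.

```sql
-- Composite-key lookup: all three struct fields constrained.
-- 'login', '2013-04-30' and 'u-123' are made-up example values.
EXPLAIN
SELECT *
FROM event
WHERE key.name = 'login'
  AND key.dateCreated = '2013-04-30'
  AND key.uid = 'u-123';

-- Simple string-key lookup against the comparison table.
EXPLAIN
SELECT *
FROM test_event
WHERE key = 'something';
```

If the first plan applies the predicate only as a filter after the table scan, that would be consistent with the full-scan behaviour described in the quoted message.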
On Tue, Apr 30, 2013 at 12:48 PM, Rupinder Singh <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I have an HBase cluster with a table that uses a composite key. I map
> this table to a Hive external table, which I use to insert data into and
> select data from it:
>
> CREATE EXTERNAL TABLE event(key
> struct<name:string,dateCreated:string,uid:string>, {more columns here})
> ROW FORMAT DELIMITED
> COLLECTION ITEMS TERMINATED BY '~'
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, other columns ")
> TBLPROPERTIES ("hbase.table.name" = "event");
>
> The table has about 10 million rows. When I do a select * using all 3
> components of the key, essentially selecting just 1 row, the response time
> is almost 700 sec, which seems pretty bad.
>
> For comparison, I created another table with a simple string key and
> otherwise identical columns. The key is a string UUID. The table has the
> same number of column families and the same number of rows.
>
> CREATE EXTERNAL TABLE test_event(key string, blah blah…..
>
> TBLPROPERTIES ("hbase.table.name" = "test_event");
>
> When I select a single row from this table by doing select * where
> key='something', the response time is 35 sec.
>
> This seems to indicate that with composite keys, a full table scan is
> happening. This seems weird.
>
> What am I missing here? Is there something special I need to do to get
> good read performance when using composite keys?
>
> Insert performance in both cases is comparable and as expected.
>
> Any help is appreciated.
>
> Here is the env spec:
>
> Amazon EMR
>
> HBase cluster: 3 core nodes with 7.5 GB RAM each, 2 CPUs of 2.2 GHz each.
> Master: 7.5 GB RAM, 2 CPUs of 2.2 GHz each
>
> Hive cluster: 3 core nodes with 3.75 GB RAM each, 1 CPU of 1.8 GHz.
> Master: 3.75 GB RAM, 1 CPU of 1.8 GHz
>
> Thanks
>
> Rupinder
>
--
Swarnim