Hive, mail # user - hive to hbase mapping


Mario Casola 2013-06-14, 15:54
Sanjay Subramanian 2013-06-14, 16:21
Mario Casola 2013-06-14, 16:54
kulkarni.swarnim@... 2013-06-15, 01:43
Mario Casola 2013-06-17, 13:00
Re: hive to hbase mapping
Sanjay Subramanian 2013-06-18, 06:54
How about having two streams from your data generation source: one to HBase and one to Hive?

Moving data out of HBase may not be trivial, especially if the data sizes are large...
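
A minimal sketch of the two-stream idea, assuming the data source can hand each record to more than one writer. The `fan_out` helper and the two list-backed sinks are hypothetical stand-ins; a real setup would replace them with an HBase counter update and an append to storage that a Hive table reads.

```python
from typing import Callable, Iterable, List

Record = dict
Sink = Callable[[Record], None]


def fan_out(records: Iterable[Record], sinks: List[Sink]) -> int:
    """Send every record to every sink and return the number of records seen."""
    count = 0
    for record in records:
        for sink in sinks:
            sink(record)
        count += 1
    return count


# Hypothetical stand-ins for the real writers: one list plays the role of
# the HBase stream, the other the role of the Hive stream.
hbase_rows: List[Record] = []
hive_rows: List[Record] = []

fan_out(
    [{"key": "a", "hits": 1}, {"key": "b", "hits": 2}],
    [hbase_rows.append, hive_rows.append],
)
```

With both streams fed at the source, nothing has to be exported from HBase later.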
From: Mario Casola <[EMAIL PROTECTED]>
Reply-To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Date: Friday, June 14, 2013 9:54 AM
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Subject: Re: hive to hbase mapping

Hi Sanjay,

thanks for the response.

I need HBase because it is perfect for aggregating data through counters, and its write performance is great.
Now the problem is: what is the best way to periodically load HBase data into a Hive table (every hour, for example)?

Mario

2013/6/14 Sanjay Subramanian <[EMAIL PROTECTED]>
Six months back I was tasked with building a data platform for logs, and I benchmarked two options:
- HBase + Hive (queries were 8X slower)
- Hive only

So I decided on the Hive-only option and am deploying that solution to production.

A couple of things you can think about while you design, if you really want to go HBase+Hive (also look at http://hadoopstack.com/hive-on-hbase-part-1/):
- Query only today's data in a Hive+HBase architecture
- For data older than one day, query Hive only
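
The split above could be sketched as a small routing helper. This is only an illustration: the table names and the `table_for` function are hypothetical, and it assumes queries can be classified by a single calendar day.

```python
from datetime import date

# Hypothetical table names; the thread does not name any tables.
HBASE_BACKED_TABLE = "logs_hbase"  # Hive external table mapped onto HBase
HIVE_ONLY_TABLE = "logs_hive"      # native Hive table


def table_for(query_day: date, today: date) -> str:
    """Pick the table to query: today's data via HBase, older data via Hive."""
    if query_day >= today:
        return HBASE_BACKED_TABLE
    return HIVE_ONLY_TABLE
```

For example, `table_for(date(2013, 6, 13), date(2013, 6, 14))` picks the Hive-only table, so only the current day's queries pay the HBase cost.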

Hope I am not diverting from your question and problem

sanjay

From: Mario Casola <[EMAIL PROTECTED]>
Reply-To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Date: Friday, June 14, 2013 8:54 AM
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Subject: hive to hbase mapping

Hi,

I have a performance issue when I query HBase from Hive.
My idea is to build the scenario below:
1. Collect data in HBase for aggregation purposes
2. Create an external table that maps Hive to HBase
3. Create a real Hive table
4. Periodically transfer data from HBase to Hive through "INSERT INTO <real hive table> SELECT * FROM <external table> WHERE time = 201305212909"
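
Steps 2 and 4 above might look roughly like the sketch below, which builds the HiveQL as strings. The table names, the column family `d`, the columns `rowkey` and `hits`, and the `yyyyMMddHH` format for `time` are all assumptions made for illustration; only the `HBaseStorageHandler` class and the `hbase.columns.mapping` and `hbase.table.name` properties come from Hive's HBase integration.

```python
from datetime import datetime

# Hypothetical names: none of these appear in the thread.
HBASE_TABLE = "events"
EXTERNAL_TABLE = "events_hbase"
REAL_TABLE = "events_hive"


def external_table_ddl() -> str:
    """Step 2: HiveQL that maps a Hive external table onto an HBase table."""
    return (
        f"CREATE EXTERNAL TABLE {EXTERNAL_TABLE} "
        "(rowkey STRING, time BIGINT, hits BIGINT)\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        "WITH SERDEPROPERTIES "
        "('hbase.columns.mapping' = ':key,d:time,d:hits')\n"
        f"TBLPROPERTIES ('hbase.table.name' = '{HBASE_TABLE}')"
    )


def hourly_insert(hour: datetime) -> str:
    """Step 4: HiveQL that copies one hour of rows into the real Hive table."""
    # Assumed format for the time column: yyyyMMddHH stored as a number.
    stamp = hour.strftime("%Y%m%d%H")
    return (
        f"INSERT INTO TABLE {REAL_TABLE}\n"
        f"SELECT * FROM {EXTERNAL_TABLE} WHERE time = {stamp}"
    )
```

A scheduler could call `hourly_insert` once per hour and pass the resulting statement to the Hive shell.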

Currently I'm testing on an HBase table that has 70,000,000 rows, and I'm trying to query this table with a single column value filter, like the query above.
If I run this type of query directly in HBase, the response time is around 80 seconds.
If I run the query in the Hive shell, after 30 minutes all the tasks (9 in my case) are still 0.00% complete.

What could be the problem?

thanks
Mario

CONFIDENTIALITY NOTICE
======================
This email message and any attachments are for the exclusive use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message along with any attachments, from your computer system. If you are the intended recipient, please be advised that the content of this message is subject to access, review and disclosure by the sender's Email System Administrator.