Re: How to design a data warehouse in HBase?
You need to spend a bit of time on Schema design.
You need to flatten your Schema...
Implement some secondary indexing to improve join performance...

Depends on what you want to do... There are other options too...
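
To make the flattening and secondary-indexing suggestions above a bit more concrete, here is a minimal sketch in plain Python. It does not use the HBase client API; a small sorted in-memory table stands in for an HBase table, and the class name, the `|` key separator, the rowkey layouts, and the helper functions are all assumptions made up for illustration.

```python
from bisect import bisect_left

SEP = "|"  # hypothetical separator for composite rowkeys


class SortedTable:
    """Toy stand-in for an HBase table: rows kept in sorted rowkey order."""

    def __init__(self):
        self._keys = []   # rowkeys in sorted order
        self._rows = {}   # rowkey -> {column: value}

    def put(self, rowkey, columns):
        if rowkey not in self._rows:
            self._keys.insert(bisect_left(self._keys, rowkey), rowkey)
            self._rows[rowkey] = {}
        self._rows[rowkey].update(columns)

    def get(self, rowkey):
        return self._rows.get(rowkey)

    def prefix_scan(self, prefix):
        """Yield (rowkey, columns) for every rowkey starting with prefix."""
        i = bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i], self._rows[self._keys[i]]
            i += 1


facts = SortedTable()       # rowkey: dimA|dimB|fact_id (flattened fact rows)
index_by_b = SortedTable()  # rowkey: dimB|dimA|fact_id (secondary index)


def put_fact(dim_a, dim_b, fact_id, amount):
    fact_key = SEP.join([dim_a, dim_b, fact_id])
    # Flattened: dimension values are stored in the fact row itself,
    # so reading a fact needs no join against dimension tables.
    facts.put(fact_key, {"dim_a": dim_a, "dim_b": dim_b, "amount": amount})
    # The index row stores only a pointer back to the fact row,
    # not a second copy of the data.
    index_by_b.put(SEP.join([dim_b, dim_a, fact_id]), {"fact_key": fact_key})


def query_by_a(dim_a):
    # DimensionA leads the main rowkey, so this is a plain prefix scan.
    return [row for _, row in facts.prefix_scan(dim_a + SEP)]


def query_by_b(dim_b):
    # DimensionB is not a rowkey prefix: scan the index, then fetch each fact.
    return [facts.get(ix["fact_key"])
            for _, ix in index_by_b.prefix_scan(dim_b + SEP)]
```

In this sketch a query on the second dimension costs one index scan plus one lookup per matching fact, but only the keys are duplicated, not the fact data.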

Sent from a remote device. Please excuse any typos...

Mike Segel

On Dec 13, 2012, at 7:09 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> For OLAP-type queries you will generally be better off with a truly column-oriented database.
> You can probably shoehorn HBase into this, but it wasn't really designed with raw scan performance along single columns in mind.
>
>
>
> ________________________________
> From: bigdata <[EMAIL PROTECTED]>
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Sent: Wednesday, December 12, 2012 9:57 PM
> Subject: How to design a data warehouse in HBase?
>
> Dear all,
> We have a traditional star-schema data warehouse in an RDBMS, and now we want to move it to HBase. After studying HBase, I have learned that it is normally queried by rowkey, in one of three ways:
> 1. full rowkey (fastest)
> 2. rowkey filter (fast)
> 3. column family/qualifier filter (slow)
> How can I design the HBase tables to implement the warehouse functions, such as:
> 1. Query by DimensionA
> 2. Query by DimensionA and DimensionB
> 3. Sum, count, distinct ...
> In my opinion, I would have to create several HBase tables, using every combination of the dimensions as the rowkey. This solution would lead to huge data duplication. Are there any good suggestions for solving this?
> Thanks a lot!
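
For the three query types listed in the question, the following is a rough sketch of how a single composite rowkey might serve them without one table per dimension combination. It is plain Python over an in-memory sorted list, not HBase client code; the `dimA|dimB|id` key layout, the sample rows, and the `aggregate` helper are hypothetical.

```python
from bisect import bisect_left

# Hypothetical flattened fact rows with composite rowkey "dimA|dimB|id",
# kept sorted by rowkey the way HBase orders rows.
rows = sorted(
    [
        ("east|books|001", {"amount": 10, "customer": "c1"}),
        ("east|books|002", {"amount": 5, "customer": "c2"}),
        ("east|toys|003", {"amount": 7, "customer": "c1"}),
        ("west|books|004", {"amount": 3, "customer": "c3"}),
    ],
    key=lambda r: r[0],
)
keys = [k for k, _ in rows]


def prefix_scan(prefix):
    """Yield (rowkey, columns) for all rows whose key starts with prefix."""
    i = bisect_left(keys, prefix)
    while i < len(keys) and keys[i].startswith(prefix):
        yield rows[i]
        i += 1


def aggregate(prefix):
    """Sum, count, and distinct-customer count over one prefix scan."""
    total, count, customers = 0, 0, set()
    for _, cols in prefix_scan(prefix):
        total += cols["amount"]
        count += 1
        customers.add(cols["customer"])
    return {"sum": total, "count": count, "distinct": len(customers)}


by_a = aggregate("east|")              # query by DimensionA
by_a_and_b = aggregate("east|books|")  # query by DimensionA and DimensionB
```

Both queries read one contiguous key range. A query on DimensionB alone is not a prefix of this key, which is where a secondary index or a second key ordering would come in.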