Re: column based or row based storage for HBase?
On Sun, Aug 5, 2012 at 6:04 AM, Lin Ma <[EMAIL PROTECTED]> wrote:

> Hi guys,
> I am wondering whether HBase is using column based storage or row based
> storage?
>    - I read some technical documents and mentioned advantages of HBase is
>    using column based storage to store similar data together to foster
>    compression. So it means same columns of different rows are stored
>    together;
What you read was probably in the context of column families. HBase has the
concept of a column family, similar to Google's Bigtable, and the store files
on disk are kept per column family. All columns of a given column family go
into the same store files, and columns of a different column family go into
different files.
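To make the column family point concrete, here is a toy sketch in Python. It is not HBase code: the sample cells, the family names, and the group_by_family helper are all made up for illustration, and a plain sorted list stands in for a store file.

```python
from collections import defaultdict

# Hypothetical cells: (row key, column family, qualifier, value).
cells = [
    ("row1", "info", "name", "a"),
    ("row1", "stats", "visits", "3"),
    ("row2", "info", "name", "b"),
    ("row2", "info", "email", "b@example.com"),
    ("row2", "stats", "visits", "7"),
]

def group_by_family(cells):
    """Toy stand-in for per-family storage: one sorted cell list per family."""
    stores = defaultdict(list)
    for row, family, qualifier, value in cells:
        stores[family].append((row, qualifier, value))
    # Inside each family, cells are ordered by row key, then qualifier.
    return {family: sorted(entries) for family, entries in stores.items()}

stores = group_by_family(cells)
for family, entries in sorted(stores.items()):
    print(family, entries)
```

The split happens at the column family level, not per individual column: "name" and "email" of the same family end up in the same list, while "visits" lands in a separate one.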
>    - But I also learned HBase is a sorted key-value map in underlying
>    HFile. It uses key to address all related columns for that key (row),
>    so it seems to be a row based storage?
Within a column family's store files, HBase keeps all the cells of a row
together, sorted by row key. Each column value is stored as a KeyValue, which
is also called a cell in HBase.
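As a rough illustration of that sorted layout, the Python sketch below models a cell as a small tuple and sorts by row, family, and qualifier ascending, then timestamp descending. The KeyValue class here is a simplified stand-in with illustrative field names, not the real HBase class, and the sample data is invented.

```python
from typing import NamedTuple

class KeyValue(NamedTuple):
    # Simplified model of a cell; not the actual HBase KeyValue class.
    row: str
    family: str
    qualifier: str
    timestamp: int
    value: str

def sort_key(kv):
    # Row, family, qualifier ascending; timestamp descending (newest first).
    return (kv.row, kv.family, kv.qualifier, -kv.timestamp)

# Hypothetical cells of one column family, in arbitrary arrival order.
cells = [
    KeyValue("row2", "info", "name", 100, "b"),
    KeyValue("row1", "info", "name", 100, "a-old"),
    KeyValue("row1", "info", "email", 100, "a@example.com"),
    KeyValue("row1", "info", "name", 200, "a-new"),
]

sorted_cells = sorted(cells, key=sort_key)

# Positions of row1's cells in the sorted sequence: they form one contiguous run.
row1_positions = [i for i, kv in enumerate(sorted_cells) if kv.row == "row1"]

for kv in sorted_cells:
    print(kv.row, kv.family, kv.qualifier, kv.timestamp, kv.value)
```

So the file is a sorted run of cells, and a row is simply the contiguous stretch of cells that share a row key, which is why lookups by key find all of a row's columns in that family next to each other.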
> It is appreciated if anyone could clarify my confusions. Any related
> documents or code for more details are welcome.
> thanks in advance,
> Lin