HBase, mail # user - Regarding Rowkey and Column Family


Re: Regarding Rowkey and Column Family
Jean-Marc Spaggiari 2012-12-24, 14:24
Hi Rams,

How are you going to access your data?

HBase will store one cell (which means rowkey+timestamp+...+data) for
each field you write as a separate column.
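
To make the per-cell cost concrete, here is a rough sketch that estimates the serialized size of one cell. It assumes the classic KeyValue layout (key length, value length, row length, row, family length, family, qualifier, timestamp, key type, value) and ignores tags, sequence ids, block encoding and compression, so treat the numbers as an approximation. The row key and values are made-up sample data.

```python
def keyvalue_size(row: bytes, family: bytes, qualifier: bytes, value: bytes) -> int:
    """Approximate serialized size of one HBase KeyValue, in bytes.

    Assumption: classic KeyValue layout, no tags, no compression.
    """
    key_len = (
        2 + len(row)        # row length (2-byte short) + row key
        + 1 + len(family)   # family length (1 byte) + family name
        + len(qualifier)    # column qualifier
        + 8                 # timestamp (8-byte long)
        + 1                 # key type (Put, Delete, ...)
    )
    # 4-byte key length + 4-byte value length + key + value
    return 4 + 4 + key_len + len(value)


size = keyvalue_size(b"CUST000123", b"CF2", b"City", b"Pune")
print(size)  # 41 bytes stored for a 4-byte value
```

With thirteen small fields per customer, most of the stored bytes would be repeated row keys, family names and timestamps rather than data.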

Are you really going to sometime access Address Line1 without
accessing Address Line2?

Are you really going to access the City without accessing the State?

If not, why not just put a JSON object with all this data in a single cell?

So at the end your table will look like:

*Table Name : Customer*
*Field Name         Column Family*
Customer Information CF1
Address              CF2

In Customer Information you bundle:
Customer Number      CF1
DOB                  CF1
FName                CF1
MName                CF1
LName                CF1

And in Address you bundle:
Address Type         CF2
Address Line1        CF2
Address Line2        CF2
Address Line3        CF2
Address Line4        CF2
State                CF2
City                 CF2
Country              CF2
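
As an illustration of the bundling idea, the sketch below serializes each group of fields into one JSON value, so a customer row ends up with two cells instead of thirteen. The qualifier names `info` and `address`, the `bundle` and `read_field` helpers, and all sample values are hypothetical; a plain dict stands in for the HBase row, and no HBase client is involved.

```python
import json


def bundle(fields: dict) -> bytes:
    """Serialize a group of fields into one compact JSON cell value."""
    return json.dumps(fields, separators=(",", ":")).encode("utf-8")


# Made-up sample data for one customer.
customer_info = {
    "CustomerNumber": "1000123",
    "DOB": "1980-01-31",
    "FName": "Asha",
    "MName": "K",
    "LName": "Rao",
}
address = {
    "AddressType": "HOME",
    "AddressLine1": "12 Main Street",
    "AddressLine2": "",
    "AddressLine3": "",
    "AddressLine4": "",
    "State": "Tamil Nadu",
    "City": "Chennai",
    "Country": "India",
}

# One row, two cells: (column family, qualifier) -> value.
cells = {
    ("CF1", "info"): bundle(customer_info),
    ("CF2", "address"): bundle(address),
}


def read_field(row_cells: dict, family: str, qualifier: str, field: str) -> str:
    """Decode one JSON cell and return a single field from it."""
    return json.loads(row_cells[(family, qualifier)].decode("utf-8"))[field]


print(read_field(cells, "CF2", "address", "City"))  # Chennai
```

The trade-off is that reading or updating a single field means decoding or rewriting the whole JSON value, which is why this only pays off when the fields are accessed together.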

But if you always access the address when you access the customer
information, then the best way might be to just put all those fields in
a single JSON object, and have just one CF and one column in your table...

Regarding the key, if your customer number is sequential and you insert
based on this field, all new writes will land in the same region, so you
will hotspot one server at a time... If the number is "random", then it's OK.
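
If the customer number does turn out to be sequential, one common workaround is to prefix the key with a small hash-derived bucket so that consecutive numbers spread across regions. The sketch below is only an illustration of that idea: the function name `salted_rowkey`, the bucket count of 16 and the `NN-` prefix format are assumptions, not something from this thread.

```python
import hashlib


def salted_rowkey(customer_number: str, buckets: int = 16) -> bytes:
    """Prefix the customer number with a deterministic bucket id.

    Assumption: the bucket is derived from an MD5 hash of the customer
    number, so the same customer always maps to the same prefix.
    """
    digest = hashlib.md5(customer_number.encode("utf-8")).digest()
    bucket = digest[0] % buckets
    return f"{bucket:02d}-{customer_number}".encode("utf-8")


# Sequential customer numbers no longer sort next to each other.
for n in range(1000000, 1000005):
    print(salted_rowkey(str(n)))
```

A point lookup can recompute the prefix from the customer number, but a range scan over customer numbers has to be issued once per bucket, so this only makes sense if sequential inserts are the dominant pattern.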

HTH.

JM

2012/12/24, Mohammad Tariq <[EMAIL PROTECTED]>:
> it is. but why do you want to do that? you will run into issues once your
> data starts growing. each cell, along with the actual value, stores a few
> additional things: the *row*, the *column* and the *version*. as a result
> you will lose space if you do that.
>
> Best Regards,
> Tariq
> +91-9741563634
> https://mtariq.jux.com/
>
>
> On Mon, Dec 24, 2012 at 5:00 PM, Ramasubramanian Narayanan <
> [EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> Is it OK to have the same column in different column families?
>>
>> regards,
>> Rams
>>
>> On Mon, Dec 24, 2012 at 4:06 PM, Mohammad Tariq <[EMAIL PROTECTED]>
>> wrote:
>>
>> > you are creating 2 different rows here. a cf defines how columns are
>> > grouped together as a single entity, which is represented by that cf.
>> > but here you are creating 2 different rows having one cf each, CF1 and
>> > CF2 respectively. if you want to have 1 row with 2 cfs, you have to use
>> > the same rowkey for both cfs.
>> >
>> >
>> >
>> > Best Regards,
>> > Tariq
>> > +91-9741563634
>> > https://mtariq.jux.com/
>> >
>> >
>> > On Mon, Dec 24, 2012 at 3:41 PM, Ramasubramanian Narayanan <
>> > [EMAIL PROTECTED]> wrote:
>> >
>> > > Hi,
>> > >
>> > > *Table Name : Customer*
>> > >
>> > > *Field Name         Column Family*
>> > > Customer Number      CF1
>> > > DOB                  CF1
>> > > FName                CF1
>> > > MName                CF1
>> > > LName                CF1
>> > > Address Type         CF2
>> > > Address Line1        CF2
>> > > Address Line2        CF2
>> > > Address Line3        CF2
>> > > Address Line4        CF2
>> > > State                CF2
>> > > City                 CF2
>> > > Country              CF2
>> > >
>> > > Is it good to have rowkey as follows for the same table?
>> > >
>> > > Rowkey Design:
>> > > --------------
>> > > For CF1 : Customer Number + YYYYMMDD (business date)
>> > > For CF2 : Customer Number + Address Type
>> > >
>> > > Note :
>> > > Address Type can be any of HOME/OFFICE/OTHERS
>> > >
>> > > regards,
>> > > Rams
>> > >
>> >
>>
>