HBase, mail # user - schema design: rows vs wide columns


Re: schema design: rows vs wide columns
Michael Segel 2013-04-08, 11:17
StAck,

Just because FB does something doesn't mean it's necessarily a good idea for others to do the same.  FB designs specifically for their needs, and their use cases may not match those of others.

To your point though, I agree that Ted's number of 3 is more of a rule of thumb and not a hard and fast number. I think that the wording in that section should be changed.  (I may take a stab at it later today...)

In our HBase course, I teach an example of an order entry system with 4 column families (Order, Pick, Ship, Invoice). To your point, in the use cases, each CF is usually read and written on its own. When I do a pick slip, I don't need to constantly reference the order, except when I initially create the Pick Slip(s).
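
To make the access pattern concrete, here is a toy sketch in plain Python. It is not the HBase API and not the course material. It assumes only that each column family is kept in its own store, so a read that names one family does not touch the others. The class `ToyTable`, the row key, and the qualifiers are hypothetical names made up for illustration.

```python
from collections import defaultdict

# Family names taken from the order entry example; everything else is hypothetical.
FAMILIES = ("order", "pick", "ship", "invoice")


class ToyTable:
    """Toy model with one separate in-memory store per column family."""

    def __init__(self, families):
        # family -> row key -> {qualifier: value}
        self.stores = {cf: defaultdict(dict) for cf in families}
        # Log of which family stores each read had to touch.
        self.stores_read = []

    def put(self, row, family, qualifier, value):
        self.stores[family][row][qualifier] = value

    def get(self, row, families):
        """Read one row, visiting only the stores of the requested families."""
        result = {}
        for cf in families:
            self.stores_read.append(cf)
            result[cf] = dict(self.stores[cf].get(row, {}))
        return result


table = ToyTable(FAMILIES)
table.put("order-0001", "order", "customer", "ACME")
table.put("order-0001", "pick", "bin", "A-17")

# Working on a pick slip: only the "pick" store is visited.
pick_slip = table.get("order-0001", ["pick"])

# A use case that always needs both families visits two stores per read.
both = table.get("order-0001", ["order", "pick"])
```

In this sketch the pick-slip read visits one store and the combined read visits two, which is the cost to weigh when the main use case needs several families at once.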

The larger design question is this: should you use a CF to segment your data if you're constantly pulling data from both CFs in your main use case, or should they be part of the same table?

 -Mike

On Apr 7, 2013, at 5:45 PM, Stack <[EMAIL PROTECTED]> wrote:

> On Sun, Apr 7, 2013 at 3:27 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
>> From http://hbase.apache.org/book.html#number.of.cfs :
>>
>> HBase currently does not do well with anything above two or three column
>> families so keep the number of column families in your schema low.
>>
>
> We should add more to that section.  FB run w/ ~15 and purportedly it works
> with appropriate write and query pattern.
> St.Ack