HBase >> mail # user >> schema design: rows vs wide columns


Re: schema design: rows vs wide columns
StAck,

Just because FB does something doesn't mean it's necessarily a good idea for others to do the same.  FB designs specifically for their own needs, and their use cases may not match those of others.

To your point though, I agree that Ted's number of 3 is more of a rule of thumb than a hard-and-fast limit. I think that the wording in that section should be changed.  (I may take a stab at it later today...)

In our HBase course, I teach an example of an order entry system with 4 column families: Order, Pick, Ship, and Invoice. To your point, in those use cases each CF is usually accessed on its own. When I do a pick slip, I don't need to constantly reference the order, except when I initially create the Pick Slip(s).

The larger question in terms of design is this: should you use a CF to segment your data if you're constantly pulling data from both CFs in your main use case, or should they be part of the same table?
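
To make the trade-off concrete, here is a rough sketch in Python. It is a toy model, not the HBase client API: it only assumes that each column family is kept in its own store, and it counts how many stores a read has to touch. The class ToyTable, the row key "order-1001", and the qualifiers are all made up for illustration.

```python
from collections import defaultdict

# Column family names borrowed from the order entry example (lowercased here).
FAMILIES = ("order", "pick", "ship", "invoice")


class ToyTable:
    """Toy model only: one independent store (a dict) per column family."""

    def __init__(self, families):
        self.stores = {cf: defaultdict(dict) for cf in families}
        self.stores_read = 0  # running count of stores touched by reads

    def put(self, row, cf, qualifier, value):
        # A write lands only in the store of its own column family.
        self.stores[cf][row][qualifier] = value

    def get(self, row, families):
        # A read touches one store per requested column family.
        result = {}
        for cf in families:
            self.stores_read += 1
            result[cf] = dict(self.stores[cf].get(row, {}))
        return result


table = ToyTable(FAMILIES)
table.put("order-1001", "order", "customer", "acme")  # hypothetical data
table.put("order-1001", "pick", "bin", "A7")

# Pick-slip use case: only the pick family is read.
pick_only = table.get("order-1001", ["pick"])
stores_for_pick = table.stores_read

# A use case that always needs both families pays for both stores.
both = table.get("order-1001", ["order", "pick"])
stores_for_both = table.stores_read - stores_for_pick

print(stores_for_pick, stores_for_both)
```

In this model the pick-slip read touches one store, while a read that always needs both Order and Pick touches two, which is the situation the question above is about.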

 -Mike

On Apr 7, 2013, at 5:45 PM, Stack <[EMAIL PROTECTED]> wrote:

> On Sun, Apr 7, 2013 at 3:27 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
>> From http://hbase.apache.org/book.html#number.of.cfs :
>>
>> HBase currently does not do well with anything above two or three column
>> families so keep the number of column families in your schema low.
>>
>
> We should add more to that section.  FB run w/ ~15 and purportedly it works
> with appropriate write and query pattern.
> St.Ack