HBase >> mail # user >> schema design: rows vs wide columns


shawn du 2013-04-07, 08:03
Ted 2013-04-07, 18:58
Stack 2013-04-07, 22:04
Ted Yu 2013-04-07, 22:27
Andrew Purtell 2013-04-07, 22:52
Viral Bajaria 2013-04-07, 23:51
ramkrishna vasudevan 2013-04-08, 03:59
lars hofhansl 2013-04-08, 04:39
ramkrishna vasudevan 2013-04-08, 04:51
Doug Meil 2013-04-08, 14:21
Ted Yu 2013-04-16, 14:02
Jean-Marc Spaggiari 2013-04-16, 14:04
Ted Yu 2013-04-16, 14:08
Michael Segel 2013-04-16, 14:35
Adrien Mogenet 2013-04-28, 15:23
Stack 2013-04-07, 22:45
Re: schema design: rows vs wide columns
St.Ack,

Just because FB does something doesn't mean it's necessarily a good idea for others to do the same. FB designs specifically for their own needs, and their use cases may not match those of others.

To your point though, I agree that Ted's number of 3 is more of a rule of thumb and not a hard and fast number. I think that the wording in that section should be changed.  (I may take a stab at it later today...)

In our HBase course, I teach an example of an order entry system with four column families: Order, Pick, Ship, and Invoice. To your point, in those use cases the CFs are usually used in an atomic fashion, that is, each one is read or written on its own. When I do a pick slip, I don't need to constantly reference the order, except when I initially create the pick slip(s).

The larger design question is this: should you use a CF to segment your data if you're constantly pulling data from both CFs in your main use case, or should they be part of the same table?
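
To make the trade-off concrete, here is a rough sketch of the order entry example. It is a toy in-memory model, not the HBase client API: the class `ToyTable`, the lowercase family names, the row key, and the column qualifiers are all made up for illustration. The one thing it leans on is that HBase keeps each column family in its own store, so a read that names more families has more stores to consult.

```python
# Toy model only: this is NOT the HBase client API.
from dataclasses import dataclass, field

# The four column families from the order entry example (names are assumed).
FAMILIES = ("order", "pick", "ship", "invoice")


@dataclass
class ToyTable:
    # One dict per column family, standing in for HBase's per-family store.
    stores: dict = field(default_factory=lambda: {cf: {} for cf in FAMILIES})

    def put(self, row_key, family, qualifier, value):
        """Write one cell into the store that belongs to `family`."""
        self.stores[family].setdefault(row_key, {})[qualifier] = value

    def get(self, row_key, families):
        """Return ({family: cells}, number of stores consulted)."""
        result = {cf: dict(self.stores[cf].get(row_key, {})) for cf in families}
        return result, len(families)


table = ToyTable()
table.put("order-0001", "order", "customer", "acme")   # hypothetical cells
table.put("order-0001", "order", "sku", "widget-7")
table.put("order-0001", "pick", "bin", "A-12")

# Pick-slip use case: only the pick family is read.
pick_only, touched_pick = table.get("order-0001", ["pick"])

# A use case that always needs order and pick data together.
both, touched_both = table.get("order-0001", ["order", "pick"])

print(touched_pick, touched_both)  # 1 2
```

If the main use case looks like the second `get`, the split into separate CFs buys little, which is the question above. If it looks like the first, keeping the families separate means the other stores are never touched.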

 -Mike

On Apr 7, 2013, at 5:45 PM, Stack <[EMAIL PROTECTED]> wrote:

> On Sun, Apr 7, 2013 at 3:27 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
>> From http://hbase.apache.org/book.html#number.of.cfs :
>>
>> HBase currently does not do well with anything above two or three column
>> families so keep the number of column families in your schema low.
>>
>
> We should add more to that section.  FB run w/ ~15 and purportedly it works
> with appropriate write and query pattern.
> St.Ack