|
S Ahmed
2010-07-08, 05:26
Jeff Zhang
2010-07-08, 05:34
Ryan Rawson
2010-07-08, 05:46
Akash Deep Shakya
2010-07-08, 05:48
Amandeep Khurana
2010-07-08, 07:07
Otis Gospodnetic
2010-07-31, 04:03
Amandeep Khurana
2010-07-31, 18:50
Time Less
2010-08-18, 18:11
Ryan Rawson
2010-08-18, 18:17
Edward Capriolo
2010-08-18, 19:23
|
-
major differences with Cassandra
S Ahmed 2010-07-08, 05:26
Hello!
I was hoping someone has experience with both Cassandra and HBase.

What are the major differences between Cassandra and HBase?

Does HBase have the concept of ColumnFamilies and SuperColumnFamilies like Cassandra?

Where in the wiki does it go over designing a data model?

Thanks!
-
Re: major differences with Cassandra
Jeff Zhang 2010-07-08, 05:34
HBase does not have super column families.

Here are the major differences between HBase and Cassandra that I can list (additions welcome):

1. HBase has a master-slave architecture, while Cassandra has no master; you can think of it as a peer-to-peer structure with no single point of failure.
2. HBase is strongly consistent, while Cassandra is eventually consistent (although you can tune it to be strongly consistent).

--
Best Regards

Jeff Zhang
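The "tunable" part of point 2 can be illustrated with the usual quorum-overlap rule: with N replicas, a read that contacts R replicas is guaranteed to see the latest acknowledged write made to W replicas when R + W > N. The sketch below is only an illustration of that arithmetic, not Cassandra code; the function names `quorum` and `quorums_overlap` are hypothetical, and the labels ONE, QUORUM, and ALL are used here simply as names for replica counts of 1, a majority, and N.

```python
def quorum(n: int) -> int:
    """Smallest majority of n replicas (hypothetical helper for this sketch)."""
    return n // 2 + 1


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """Return True when every set of r read replicas must share at least one
    replica with every set of w written replicas, i.e. when r + w > n."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


if __name__ == "__main__":
    n = 3  # replication factor assumed for the example
    settings = [
        ("read ONE / write ONE", 1, 1),
        ("read QUORUM / write QUORUM", quorum(n), quorum(n)),
        ("read ONE / write ALL", 1, n),
    ]
    for label, r, w in settings:
        print(f"N={n} R={r} W={w} ({label}): overlap={quorums_overlap(n, r, w)}")
```

With N = 3, reading and writing at a single replica gives no overlap guarantee (the eventually consistent case), while majority reads plus majority writes, or single-replica reads plus writes to all replicas, do overlap.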
-
Re: major differences with Cassandra
Ryan Rawson 2010-07-08, 05:46
Hi,
Cassandra is based (loosely) on the Dynamo paper from Amazon. HBase is based on the Bigtable paper from Google. There are numerous architectural differences and many practical ones. I'd have a look at both of those papers to get a sense of how they differ.

As for data modeling questions, there have been many email threads, and there are a few relevant things in here:

http://wiki.apache.org/hadoop/HBase/HBasePresentations

-ryan
-
Re: major differences with Cassandra
Akash Deep Shakya 2010-07-08, 05:48
I studied Cassandra in detail and am currently working with HBase. There are lots of differences, and also similarities, between Cassandra and HBase. For the HBase data model and architecture, there are two articles from Lars George which might be really helpful to you:

http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html

Have a great time with HBase.

Regards
Akash Deep Shakya "OpenAK"
FOSS Nepal Community
akashakya at gmail dot com

~ Failure to prepare is preparing to fail ~
-
Re: major differences with Cassandra
Amandeep Khurana 2010-07-08, 07:07
Another link: http://bit.ly/aGJi1e
-
Re: major differences with Cassandra
Otis Gospodnetic 2010-07-31, 04:03
I don't have the URL handy, but just the other day I read some Cassandra/HBase blog post where Cassandra was described as having no SPOF, but somebody left some very "strong comments" calling out that and a few other claims as false. Ah, I remember, here is the URL:

http://blog.mozilla.com/data/2010/05/18/riak-and-cassandra-and-hbase-oh-my/

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/
-
Re: major differences with Cassandra
Amandeep Khurana 2010-07-31, 18:50
More info at http://blog.amandeepkhurana.com/2010/05/comparing-pnuts-hbase-and-cassandra.html
-Amandeep
-
Re: major differences with Cassandra
Time Less 2010-08-18, 18:11
HBase is run by persons who understand (or are willing to hear) the operational requirements of distributed databases in high-volume environments, whereas the Cassandra project isn't.

Talks about technical differences are really noise, because they're entirely theoretical. When viewed with this knowledge, a lot of the disagreements, flamewars, and shoutfests begin to make sense.

As of today, I'm unaware of any major feature Cassandra claims that it actually delivers outside of installations run by the developers themselves. Specifically: multi-DC, hinted handoff, compaction, dynamic cluster resizing are all fail. The developers will adamantly claim all such features work just fine. Good luck getting any of it to work in YOUR environment.

In stark contrast, I am intimately familiar with at least one large HBase installation run by non-developers (at Mozilla).

Disclaimers: I am very familiar with the Cassandra product internals, developers, history, and community. I am less familiar with HBase. I might therefore have a rosy view of the HBase community based on ignorance. Also, in a low-volume environment, pretty much anything works. Including Cassandra. Or anything else. Any NoSQL. Any SQL. Pick whatever you want and run with it.

--
timeless(ness)
-
Re: major differences with Cassandra
Ryan Rawson 2010-08-18, 18:17
Thanks for that bit of feedback.
Right now StumbleUpon operates a cluster that has handled 20,000 requests a second, 24/7, for about a year. Even though we have HBase developers, I don't think there is any special sauce, and anyone could replicate the successes we've had. Mozilla is one candidate. There are others who are quieter about it.
-
Re: major differences with Cassandra
Edward Capriolo 2010-08-18, 19:23
You said:

> As of today, I'm unaware of any major feature Cassandra claims that it actually delivers outside of installations run by the developers themselves. Specifically: multi-DC, hinted handoff, compaction, dynamic cluster resizing are all fail. The developers will adamantly claim all such features work just fine. Good luck getting any of it to work in YOUR environment.

Where to start with this statement:

Multi-DC support: You are saying Cassandra is bad at X... but HBase does not even do X.
https://issues.apache.org/jira/browse/HBASE-1295

Hinted handoff: If I take down a Cassandra node, hints are stored on other nodes. When the failed node comes back online, the hints are delivered to it.

Compaction: Compaction works. My tables compact at user-defined intervals.

Dynamic cluster resizing: Joining a new node is more intensive in Cassandra, as data has to physically move from one physical node to another. Yet I regularly add, replace, and move nodes.

You said:

> Talks about technical differences are really noise, because they're entirely theoretical.

This statement is contradictory. You are saying technical differences are theoretical. Small technical differences have profound implications.

You said:

> In stark contrast, I am intimately familiar with at least one large HBase installation run by non-developers (at Mozilla).

Then later:

> I am very familiar with the Cassandra product internals, developers, history, and community. I am less familiar with HBase.

I give up. Are you "intimately familiar" or "less familiar"?

Where can I check out the source for this "Any SQL" you mention? Sounds like it has way fewer problems than these damned NoSQL solutions.
|