Chris Tarnas
2011-08-30, 06:02
highpointe
2011-08-30, 06:14
highpointe
2011-08-30, 06:16
Andrew Purtell
2011-08-30, 09:47
Bernd Fondermann
2011-08-30, 11:29
Sam Seigal
2011-08-30, 11:35
Ryan Rawson
2011-08-30, 15:26
Andrew Purtell
2011-08-30, 16:17
Ryan Rawson
2011-08-30, 16:21
Chris Tarnas
2011-08-30, 17:19
Joe Pallas
2011-08-30, 17:42
Sam Seigal
2011-08-30, 19:22
Joseph Boyd
2011-08-30, 20:04
Ryan Rawson
2011-08-30, 20:09
Ryan Rawson
2011-08-30, 23:55
Andrew Purtell
2011-08-31, 00:37
Andrew Purtell
2011-08-31, 01:05
Andrew Purtell
2011-08-31, 03:48
Andrew Purtell
2011-08-31, 03:52
Time Less
2011-08-31, 05:34
Time Less
2011-08-31, 05:47
Andrew Purtell
2011-08-31, 06:44
Edward Capriolo
2011-08-31, 17:51
Gary Helmling
2011-08-31, 18:21
Andrew Purtell
2011-09-01, 04:35
Edward Capriolo
2011-09-01, 17:53
Ryan Rawson
2011-09-01, 19:12
Time Less
2011-09-01, 22:13
Arun C Murthy
2011-09-01, 22:37
Andrew Purtell
2011-09-02, 00:41
Michael Segel
2011-09-02, 00:47
Andrew Purtell
2011-09-02, 02:27
Joseph Pallas
2011-09-02, 17:27
Ryan Rawson
2011-09-02, 17:47
Jacques
2011-09-02, 18:18
Time Less
2011-09-02, 20:44
Jeremy Hanna
2011-09-05, 19:41
Michael Segel
2011-09-05, 22:06
-
HBase and Cassandra on StackOverflow
Chris Tarnas 2011-08-30, 06:02
Someone with better knowledge than I have might be interested in helping answer this question over at StackOverflow:
http://stackoverflow.com/questions/7237271/large-scale-data-processing-hbase-cassandra

-chris
-
Re: HBase and Cassandra on StackOverflow
highpointe 2011-08-30, 06:14
This is rather dated. I would love to see the side-by-side justification if anyone has made the transition lately.
-
Re: HBase and Cassandra on StackOverflow
highpointe 2011-08-30, 06:16
My bad. I was looking at the date of the link, not the post. Please ignore.
-
Re: HBase and Cassandra on StackOverflow
Andrew Purtell 2011-08-30, 09:47
Hi Chris,

Appreciate your answer on the post.

Personally speaking, however, the endless Cassandra vs. HBase discussion is tiresome, and rarely do blog posts or emails in this regard shed any light. Often, Cassandra proponents mis-state their case out of ignorance of HBase or due to commercial or personal agendas. It is difficult to find clear-eyed analysis among the partisans. I'm not sure it will make any difference posting a rebuttal to some random thing jbellis says. Better to focus on improving HBase than play whack-a-mole.

Regarding some of the specific points in that post:

HBase is proven in production deployments larger than the largest publicly reported Cassandra cluster, ~1K versus 400 or 700 or somesuch. But basically this is the same order of magnitude, with HBase having a slight edge. I don't see a meaningful difference here. Stating otherwise is false.

HBase supports replication between clusters (i.e. data centers). I believe, but admit I'm not super familiar with the Cassandra option here, that the main difference is that HBase provides a simple mechanism and the user must build a replication architecture useful for them, while Cassandra attempts to hide some of that complexity. I do not know if they succeed there, but large scale cross data center replication is rarely one size fits all, so I doubt it.

Cassandra does not have strong consistency in the sense that HBase provides. It can provide strong consistency, but at the cost of failing any read if there is insufficient quorum. HBase/HDFS does not have that limitation. On the other hand, HBase has its own and different scenarios where data may not be immediately available. The differences between the systems are nuanced, and which to use depends on the use case requirements.

Cassandra's RandomPartitioner / hash based partitioning means efficient MapReduce or table scanning is not possible, whereas HBase's distributed ordered tree is naturally efficient for such use cases, which I believe explains why Hadoop users often prefer it. This may or may not be a problem for any given use case. Using an ordered partitioner with Cassandra used to require frequent manual rebalancing to avoid blowing up nodes. I don't know if more recent versions still have this mis-feature.

Cassandra is no less complex than HBase. All of this complexity is "hidden" in the sense that with Hadoop/HBase the layering is obvious -- HDFS, HBase, etc. -- but the Cassandra internals are no less layered. An impartial analysis of implementation and algorithms will reveal that Cassandra's theory of operation in its full detail is substantially more complex. Compare the BigTable and Dynamo papers and this is clear. There are actually more opportunities for something to go wrong with Cassandra.

While we are looking at codebases, it should be noted that HBase has substantially more unit tests.

With Cassandra, all RPC is via Thrift with various wrappers, so actually all Cassandra clients are second class in the sense that jbellis means when he states "Non-Java clients are not second-class citizens".

The master-slave versus peer-to-peer argument is larger than Cassandra vs. HBase, and not nearly as one sided as claimed. The famous (infamous?) global failure of Amazon's S3 in 2008, a fully peer-to-peer system, due to a single flipped bit in a gossip message demonstrates how in peer-to-peer systems every node can be a single point of failure. There is no obvious winner; instead, there is a series of trade-offs. Claiming otherwise is intellectually dishonest. Master-slave architectures seem easier to operate and reason about in my experience. Of course, I'm partial there.

I have just scratched the surface.

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
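The partitioning point in the message above can be pictured with a small Python sketch. It is a toy model only, not Cassandra or HBase code: the key names, the four-node layout, the split points, and the use of MD5 as a stable hash are all assumptions chosen for illustration. It counts how many nodes a contiguous range scan has to contact under hash-based placement versus ordered (range) placement.

```python
import hashlib
from bisect import bisect_right


def hash_partition(key, num_nodes):
    """Place a key by hashing it (MD5 is just a convenient stable hash here)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_nodes


def range_partition(key, split_points):
    """Place a key by sort order: node i holds the keys below split_points[i]."""
    return bisect_right(split_points, key)


def nodes_for_scan(keys, place):
    """Return the set of nodes that must be contacted to read the given keys."""
    return {place(k) for k in keys}


# Hypothetical data set: 1000 row keys on 4 nodes, scanning rows 0100..0199.
all_keys = [f"row{i:04d}" for i in range(1000)]
scan_keys = [k for k in all_keys if "row0100" <= k < "row0200"]
splits = ["row0250", "row0500", "row0750"]  # assumed region boundaries

hashed = nodes_for_scan(scan_keys, lambda k: hash_partition(k, 4))
ordered = nodes_for_scan(scan_keys, lambda k: range_partition(k, splits))

print("hash partitioning touches nodes:", sorted(hashed))
print("range partitioning touches nodes:", sorted(ordered))
```

With ordered placement the scan stays on the one node that owns the key range, while hashing scatters adjacent keys across nodes. The sketch says nothing about full-table jobs, where every node is read anyway.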
-
Re: HBase and Cassandra on StackOverflow
Bernd Fondermann 2011-08-30, 11:29
On Tue, Aug 30, 2011 at 11:47, Andrew Purtell <[EMAIL PROTECTED]> wrote:
> I have just scratched the surface.

+1, insightful. Thanks for posting this.

Bernd
-
Re: HBase and Cassandra on StackOverflow
Sam Seigal 2011-08-30, 11:35
A question inline:

On Tue, Aug 30, 2011 at 2:47 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote:

> Cassandra does not have strong consistency in the sense that HBase provides. It can provide strong consistency, but at the cost of failing any read if there is insufficient quorum. HBase/HDFS does not have that limitation. On the other hand, HBase has its own and different scenarios where data may not be immediately available. The differences between the systems are nuanced and which to use depends on the use case requirements.

I have a question regarding this point. Is the replication strategy for HBase completely reliant on HDFS' block replication pipelining? Is this replication process asynchronous? If it is, then is there not a window in which, if a machine dies before the replication pipeline for a particular block has started, that block will be unavailable until the machine comes back up? Sorry if I am missing something important here.
-
Re: HBase and Cassandra on StackOverflow
Ryan Rawson 2011-08-30, 15:26
The HDFS write pipeline is synchronous, so there is no window.
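The "no window" claim can be pictured with a toy Python model. This is a sketch under loud assumptions, not the HDFS client API: the `Replica` class, the `pipelined_write` function, and the node names are hypothetical, and real pipelines also handle failures during the write, which is not modelled here. The only idea shown is that the writer gets its acknowledgement after every replica in the pipeline has stored the data.

```python
class Replica:
    """Hypothetical stand-in for a storage node holding block copies."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.blocks = {}

    def store(self, block_id, data):
        if not self.alive:
            raise IOError(f"{self.name} is down")
        self.blocks[block_id] = data


def pipelined_write(replicas, block_id, data):
    """Write to every replica in order; return only once all of them succeed."""
    for replica in replicas:
        replica.store(block_id, data)
    # The caller sees success only after every replica holds the data.
    return True


nodes = [Replica("dn1"), Replica("dn2"), Replica("dn3")]
acked = pipelined_write(nodes, "blk_1", b"row data")

# Once the write is acknowledged, losing the first node still leaves two copies.
nodes[0].alive = False
survivors = [n.name for n in nodes if n.alive and "blk_1" in n.blocks]
print(acked, survivors)
```

In an asynchronous design the acknowledgement would come back after the first `store` call, and a crash of that node before the copies were made would leave the data unreachable. That is the window the question asks about.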
-
Re: HBase and Cassandra on StackOverflow
Andrew Purtell 2011-08-30, 16:17
> Is the replication strategy for HBase completely reliant on HDFS' block replication pipelining ?

Yes.

> Is this replication process asynchronous ?

No.

Best regards,

   - Andy
-
Re: HBase and Cassandra on StackOverflow
Ryan Rawson 2011-08-30, 16:21
I really like the theory of operation stuff. People say that centralized operation is a flaw, but I say it's a strength. In a single datacenter you have extremely fast pings, 0.1 ms or less, so there is no need for a fully decentralized architecture, which can be really hard to debug.

-ryan
-
Re: HBase and Cassandra on StackOverflow
Chris Tarnas 2011-08-30, 17:19
Hi Andrew,
Would you mind if I paraphrase your responses on StackOverflow?

-chris
-
Re: HBase and Cassandra on StackOverflow
Joe Pallas 2011-08-30, 17:42
On Aug 30, 2011, at 2:47 AM, Andrew Purtell wrote:

> Better to focus on improving HBase than play whack a mole.

Absolutely. So let's talk about improving HBase. I'm speaking here as someone who has been learning about and experimenting with HBase for more than six months.

> HBase supports replication between clusters (i.e. data centers).

That’s … debatable. There's replication support in the code, but several times in the recent past when someone asked about it on this mailing list, the response was “I don't know of anyone actually using it.” My understanding of replication is that you can't replicate any existing data, so unless you activated it on day one, it isn't very useful. Do I misunderstand?

> Cassandra does not have strong consistency in the sense that HBase provides. It can provide strong consistency, but at the cost of failing any read if there is insufficient quorum. HBase/HDFS does not have that limitation. On the other hand, HBase has its own and different scenarios where data may not be immediately available. The differences between the systems are nuanced and which to use depends on the use case requirements.

That's fair enough, although I think your first two sentences nearly contradict each other :-). If you use N=3, W=3, R=1 in Cassandra, you should get similar behavior to HBase/HDFS with respect to consistency and availability ("strong" consistency and reads do not fail if any one copy is available).

A more important point, I think, is the one about storage. HBase uses two different kinds of files, data files and logs, but HDFS doesn't know about that and cannot, for example, optimize data files for write throughput (and random reads) and log files for low latency sequential writes. (For example, how could performance be improved by adding solid-state disk?)

> Cassandra's RandomPartitioner / hash based partitioning means efficient MapReduce or table scanning is not possible, whereas HBase's distributed ordered tree is naturally efficient for such use cases, I believe explaining why Hadoop users often prefer it. This may or may not be a problem for any given use case.

I don't think you can make a blanket statement that random partitioning makes efficient MapReduce impossible (scanning, yes). Many M/R tasks process entire tables. Random partitioning has definite advantages for some cases, and HBase might well benefit from recognizing that and adding some support.

> Cassandra is no less complex than HBase. All of this complexity is "hidden" in the sense that with Hadoop/HBase the layering is obvious -- HDFS, HBase, etc. -- but the Cassandra internals are no less layered.

Operationally, however, HBase is more complex. Admins have to configure and manage ZooKeeper, HDFS, and HBase. Could this be improved?

> With Cassandra, all RPC is via Thrift with various wrappers, so actually all Cassandra clients are second class in the sense that jbellis means when he states "Non-Java clients are not second-class citizens".

That's disingenuous. Thrift exposes all of the Cassandra API to all of the wrappers, while HBase clients who want to use all of the HBase API must use Java. That can be fixed, but it is the status quo.

joe
-
Re: HBase and Cassandra on StackOverflowSam Seigal 2011-08-30, 19:22
Will the write call to HBase block until the record written is fully
replicated ? If not (since it is happening at the block level), then isn't there a window where a region server goes down, the data might not be available anywhere else, until it comes back up ? On Tue, Aug 30, 2011 at 9:17 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote: > > Is the replication strategy for HBase completely reliant on HDFS' block > > replication pipelining ? > > Yes. > > > Is this replication process asynchronous ? > > > No. > Best regards, > > > - Andy > > Problems worthy of attack prove their worth by hitting back. - Piet Hein > (via Tom White) > > > >________________________________ > >From: Sam Seigal <[EMAIL PROTECTED]> > >To: [EMAIL PROTECTED]; Andrew Purtell <[EMAIL PROTECTED]> > >Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> > >Sent: Tuesday, August 30, 2011 7:35 PM > >Subject: Re: HBase and Cassandra on StackOverflow > > > >A question inline: > > > >On Tue, Aug 30, 2011 at 2:47 AM, Andrew Purtell <[EMAIL PROTECTED]> > wrote: > > > >> Hi Chris, > >> > >> Appreciate your answer on the post. > >> > >> Personally speaking however the endless Cassandra vs. HBase discussion > is > >> tiresome and rarely do blog posts or emails in this regard shed any > light. > >> Often, Cassandra proponents mis-state their case out of ignorance of > HBase > >> or due to commercial or personal agendas. It is difficult to find clear > eyed > >> analysis among the partisans. I'm not sure it will make any difference > >> posting a rebuttal to some random thing jbellis says. Better to focus on > >> improving HBase than play whack a mole. > >> > >> > >> Regarding some of the specific points in that post: > >> > >> HBase is proven in production deployments larger than the largest > publicly > >> reported Cassandra cluster, ~1K versus 400 or 700 or somesuch. But > basically > >> this is the same order of magnitude, with HBase having a slight edge. I > >> don't see a meaningful difference here. Stating otherwise is false. 
-
Re: HBase and Cassandra on StackOverflowJoseph Boyd 2011-08-30, 20:04
On Tue, Aug 30, 2011 at 12:22 PM, Sam Seigal <[EMAIL PROTECTED]> wrote:
> > Will the write call to HBase block until the record written is fully > replicated ? no. data isn't written to disk immediately > If not (since it is happening at the block level), then isn't > there a window where a region server goes down, the data might not be > available anywhere else, until it comes back up ? the data would be in the write ahead log. ...joe > On Tue, Aug 30, 2011 at 9:17 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote: > > > > Is the replication strategy for HBase completely reliant on HDFS' block > > > replication pipelining ? > > > > Yes. > > > > > Is this replication process asynchronous ? > > > > > > No. > > Best regards, > > > > > > - Andy > > > > Problems worthy of attack prove their worth by hitting back. - Piet Hein > > (via Tom White) > > > > > > >________________________________ > > >From: Sam Seigal <[EMAIL PROTECTED]> > > >To: [EMAIL PROTECTED]; Andrew Purtell <[EMAIL PROTECTED]> > > >Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> > > >Sent: Tuesday, August 30, 2011 7:35 PM > > >Subject: Re: HBase and Cassandra on StackOverflow > > > > > >A question inline: > > > > > >On Tue, Aug 30, 2011 at 2:47 AM, Andrew Purtell <[EMAIL PROTECTED]> > > wrote: > > > > > >> Hi Chris, > > >> > > >> Appreciate your answer on the post. > > >> > > >> Personally speaking however the endless Cassandra vs. HBase discussion > > is > > >> tiresome and rarely do blog posts or emails in this regard shed any > > light. > > >> Often, Cassandra proponents mis-state their case out of ignorance of > > HBase > > >> or due to commercial or personal agendas. It is difficult to find clear > > eyed > > >> analysis among the partisans. I'm not sure it will make any difference > > >> posting a rebuttal to some random thing jbellis says. Better to focus on > > >> improving HBase than play whack a mole. 
-
Re: HBase and Cassandra on StackOverflowRyan Rawson 2011-08-30, 20:09
While data is not fsynced to disk immediately, it is acked by 3
different nodes (Assuming r=3) before HBase acks the client. -ryan On Tue, Aug 30, 2011 at 1:04 PM, Joseph Boyd <[EMAIL PROTECTED]> wrote: > On Tue, Aug 30, 2011 at 12:22 PM, Sam Seigal <[EMAIL PROTECTED]> wrote: >> >> Will the write call to HBase block until the record written is fully >> replicated ? > > no. data isn't written to disk immediately > >> If not (since it is happening at the block level), then isn't >> there a window where a region server goes down, the data might not be >> available anywhere else, until it comes back up ? > > the data would be in the write ahead log. > > > ...joe > > >> On Tue, Aug 30, 2011 at 9:17 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote: >> >> > > Is the replication strategy for HBase completely reliant on HDFS' block >> > > replication pipelining ? >> > >> > Yes. >> > >> > > Is this replication process asynchronous ? >> > >> > >> > No. >> > Best regards, >> > >> > >> > - Andy >> > >> > Problems worthy of attack prove their worth by hitting back. - Piet Hein >> > (via Tom White) >> > >> > >> > >________________________________ >> > >From: Sam Seigal <[EMAIL PROTECTED]> >> > >To: [EMAIL PROTECTED]; Andrew Purtell <[EMAIL PROTECTED]> >> > >Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> >> > >Sent: Tuesday, August 30, 2011 7:35 PM >> > >Subject: Re: HBase and Cassandra on StackOverflow >> > > >> > >A question inline: >> > > >> > >On Tue, Aug 30, 2011 at 2:47 AM, Andrew Purtell <[EMAIL PROTECTED]> >> > wrote: >> > > >> > >> Hi Chris, >> > >> >> > >> Appreciate your answer on the post. >> > >> >> > >> Personally speaking however the endless Cassandra vs. HBase discussion >> > is >> > >> tiresome and rarely do blog posts or emails in this regard shed any >> > light. >> > >> Often, Cassandra proponents mis-state their case out of ignorance of >> > HBase >> > >> or due to commercial or personal agendas. It is difficult to find clear >> > eyed >> > >> analysis among the partisans. 
-
Re: HBase and Cassandra on StackOverflowRyan Rawson 2011-08-30, 23:55
On Tue, Aug 30, 2011 at 10:42 AM, Joe Pallas <[EMAIL PROTECTED]> wrote:
> > On Aug 30, 2011, at 2:47 AM, Andrew Purtell wrote: > >> Better to focus on improving HBase than play whack a mole. > > Absolutely. So let's talk about improving HBase. I'm speaking here as someone who has been learning about and experimenting with HBase for more than six months. > >> HBase supports replication between clusters (i.e. data centers). > > That’s … debatable. There's replication support in the code, but several times in the recent past when someone asked about it on this mailing list, the response was “I don't know of anyone actually using it.” My understanding of replication is that you can't replicate any existing data, so unless you activated it on day one, it isn't very useful. Do I misunderstand? > >> Cassandra does not have strong consistency in the sense that HBase provides. It can provide strong consistency, but at the cost of failing any read if there is insufficient quorum. HBase/HDFS does not have that limitation. On the other hand, HBase has its own and different scenarios where data may not be immediately available. The differences between the systems are nuanced and which to use depends on the use case requirements. > > That's fair enough, although I think your first two sentences nearly contradict each other :-). If you use N=3, W=3, R=1 in Cassandra, you should get similar behavior to HBase/HDFS with respect to consistency and availability ("strong" consistency and reads do not fail if any one copy is available). This is on the surface true, but there are a few hbase use cases that cass has a harder time supporting: - increment counter - CAS calls some people find these essential to building systems > > A more important point, I think, is the one about storage. HBase uses two different kinds of files, data files and logs, but HDFS doesn't know about that and cannot, for example, optimize data files for write throughput (and random reads) and log files for low latency sequential writes. 
(For example, how could performance be improved by adding solid-state disk?) I think "HDFS doesn't know about that and cannot... optimize" is a bit of an overstatement... While it is TRUE that currently HDFS does not do anything, there is no reason why it could not do something better. Adding SSD in an intelligent way would be nice. Probably not for logs though. Will HDFS ever focus on these things? Probably in the mid-term, I'm guessing we'll start to see attention on this towards the end of 2012, or possibly not at all (after all these things don't help MapReduce, so why bother?) If an alternate DFS was able to work on these issues, they could very quickly differentiate themselves over HDFS in terms of HBase support. > >> Cassandra's RandomPartitioner / hash based partitioning means efficient MapReduce or table scanning is not possible, whereas HBase's distributed ordered tree is naturally efficient for such use cases, I believe explaining why Hadoop users often prefer it. This may or may not be a problem for any given use case. > > I don't think you can make a blanket statement that random partitioning makes efficient MapReduce impossible (scanning, yes). Many M/R tasks process entire tables. Random partitioning has definite advantages for some cases, and HBase might well benefit from recognizing that and adding some support. > >> Cassandra is no less complex than HBase. All of this complexity is "hidden" in the sense that with Hadoop/HBase the layering is obvious -- HDFS, HBase, etc. -- but the Cassandra internals are no less layered. > > Operationally, however, HBase is more complex. Admins have to configure and manage ZooKeeper, HDFS, and HBase. Could this be improved? > >> With Cassandra, all RPC is via Thrift with various wrappers, so actually all Cassandra clients are second class in the sense that jbellis means when he states "Non-Java clients are not second-class citizens". > > That's disingenuous.
Thrift exposes all of the Cassandra API to all of the wrappers, while HBase clients who want to use all of the HBase API must use Java. That can be fixed, but it is the status quo.
-
Re: HBase and Cassandra on StackOverflowAndrew Purtell 2011-08-31, 00:37
Hi Chris,
> Would you mind if I paraphrase your responses on StackOverflow? Go right ahead. Best regards, - Andy Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White) >________________________________ >From: Chris Tarnas <[EMAIL PROTECTED]> >To: Andrew Purtell <[EMAIL PROTECTED]>; [EMAIL PROTECTED] >Sent: Wednesday, August 31, 2011 1:19 AM >Subject: Re: HBase and Cassandra on StackOverflow > >Hi Andrew, > >Would you mind if I paraphrase your responses on StackOverflow? > >-chris >
-
Re: HBase and Cassandra on StackOverflowAndrew Purtell 2011-08-31, 01:05
Hi Joe,
> > HBase supports replication between clusters (i.e. data centers). > > That’s … debatable. There's replication support in the code, but > several times in the recent past when someone asked about it on this > mailing list, the response was “I don't know of anyone actually > using it.” I believe SU uses it. Anyway I think this is really the point I was making here: > > the main difference is HBase provides simple mechanism and the user > > must build a replication architecture useful for them; while > > Cassandra attempts to hide some of that complexity So I don't think you nor I are debating this point really, except this: > My understanding of replication is that you can't replicate any > existing data, so unless you activated it on day one, it isn't very > useful. That was a design choice. Existing data should be transferred in advance or in background one-shot with a utility that chooses on an application-specific basis what is useful to replicate. There is also a generic utility provided as a MR job for this purpose. > If you use N=3, W=3, R=1 in Cassandra, you should get similar behavior > to HBase/HDFS with respect to consistency and availability My understanding is that R=1 does not guarantee that you won't see different versions of the data in different reads, in some scenarios. There was an excellent Quora answer in this regard, I don't remember it offhand, perhaps you can find the link to it or someone can provide it to you. > Random partitioning has definite advantages for some cases, and HBase > might well benefit from recognizing that and adding some support. Or just use salted keys? Random partitioning in a distributed ordered tree sounds like impedance mismatch to me. > HBase uses two different kinds of files, data files and logs, but > HDFS doesn't know about that and cannot, for example, optimize data > files for write throughput You are assuming that HDFS is a shrinkwrapped static thing here, no?
Anyway, your point is valid, in the past features that HBase requires of HDFS have not received the level of support in the HDFS developer community that we would have liked. However this is now rapidly changing for the better. > Operationally, however, HBase is more complex. > Admins have to configure and manage ZooKeeper, HDFS, and HBase. > Could this be improved? Sure, there is room for improvement for hiding some of the complexity for evaluators or single system developers or other users who want e.g. a three step quickstart. Personally I prefer having the ability to tune those layers independent of each other. And, while complexity may be more "hidden" operationally in the Cassandra case relative to HBase, when there is a problem on your cluster, I don't know if that buys you anything. I suppose it depends on the nature of the problem. I do not believe there is a guarantee that operationally Cassandra is really simpler than HBase when it's 2 am and there is a bug and nodes are going down. Best regards, - Andy Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White) >________________________________ >From: Joe Pallas <[EMAIL PROTECTED]> >To: [EMAIL PROTECTED] >Sent: Wednesday, August 31, 2011 1:42 AM >Subject: Re: HBase and Cassandra on StackOverflow > > >On Aug 30, 2011, at 2:47 AM, Andrew Purtell wrote: > >> Better to focus on improving HBase than play whack a mole. > >Absolutely. So let's talk about improving HBase. I'm speaking here as someone who has been learning about and experimenting with HBase for more than six months. > >> HBase supports replication between clusters (i.e. data centers). > >That’s … debatable. 
There's replication support in the code, but several times in the recent past when someone asked about it on this mailing list, the response was “I don't know of anyone actually using it.” My understanding of replication is that you can't replicate any existing data, so unless you activated it on day one, it isn't very useful. Do I misunderstand?
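The "salted keys" suggestion above can be illustrated with a minimal sketch. It assumes a fixed number of buckets and a hash-derived prefix; the names salted_key, scan_prefixes and NUM_BUCKETS are hypothetical and are not part of any HBase API. The idea is that the application prepends a small, deterministic prefix to each row key so that sequential keys spread across regions of the ordered table.

```python
import hashlib

NUM_BUCKETS = 16  # hypothetical bucket count, chosen by the application


def salted_key(row_key, buckets=NUM_BUCKETS):
    """Prefix row_key with a bucket derived from its hash.

    The prefix is deterministic, so a reader that knows the original key
    can recompute the salted key for a point lookup.
    """
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    bucket = digest[0] % buckets
    return "{:02d}-{}".format(bucket, row_key)


def scan_prefixes(buckets=NUM_BUCKETS):
    """Prefixes a full range scan must visit, one scan per bucket."""
    return ["{:02d}-".format(b) for b in range(buckets)]


if __name__ == "__main__":
    for key in ("event-0001", "event-0002", "event-0003"):
        print(key, "->", salted_key(key))
```

The trade-off in this sketch is that point reads stay cheap, while an ordered scan over the original key space turns into one scan per bucket whose results the client has to merge.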
-
Re: HBase and Cassandra on StackOverflowAndrew Purtell 2011-08-31, 03:48
> Will the write call to HBase block until the record written is fully replicated ?
At the HDFS layer, hflush on the write ahead log will block until the data is fully replicated. At the HBase layer, whether the writer (client) will be blocked until HDFS layer actions complete depends on your settings regarding how the write ahead log operates -- do you have deferred flushing enabled on the table or globally, or not, for example -- and on whether the particular op has writeToWAL set to false. > then isn't there a window where a region server goes down, the data > might not be available anywhere else, until it comes back up When a regionserver process fails or the node upon which it is running crashes or is partitioned or whatever, store files, flush files, and write-ahead log data are all fully available by way of HDFS to any regionserver in the cluster taking over regions from the failed regionserver. There is a window of time where the data in the regions of the failed regionserver will not be available, until those regions are redeployed to live regionservers. This is because in the BigTable model, access to a region is available exclusively through an assigned regionserver, and is what I was alluding to when I said "On the other hand, HBase has its own and different scenarios where data may not be immediately available." Best regards, - Andy Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White) >________________________________ >From: Sam Seigal <[EMAIL PROTECTED]> >To: [EMAIL PROTECTED]; Andrew Purtell <[EMAIL PROTECTED]> >Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> >Sent: Wednesday, August 31, 2011 3:22 AM >Subject: Re: HBase and Cassandra on StackOverflow > >Will the write call to HBase block until the record written is fully >replicated ? If not (since it is happening at the block level), then isn't >there a window where a region server goes down, the data might not be >available anywhere else, until it comes back up ?
> >On Tue, Aug 30, 2011 at 9:17 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote: > >> > Is the replication strategy for HBase completely reliant on HDFS' block >> > replication pipelining ? >> >> Yes. >> >> > Is this replication process asynchronous ? >> >> >> No. >> Best regards, >> >> >> - Andy >> >> Problems worthy of attack prove their worth by hitting back. - Piet Hein >> (via Tom White) >> >> >> >________________________________ >> >From: Sam Seigal <[EMAIL PROTECTED]> >> >To: [EMAIL PROTECTED]; Andrew Purtell <[EMAIL PROTECTED]> >> >Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> >> >Sent: Tuesday, August 30, 2011 7:35 PM >> >Subject: Re: HBase and Cassandra on StackOverflow >> > >> >A question inline: >> > >> >On Tue, Aug 30, 2011 at 2:47 AM, Andrew Purtell <[EMAIL PROTECTED]> >> wrote: >> > >> >> Hi Chris, >> >> >> >> Appreciate your answer on the post. >> >> >> >> Personally speaking however the endless Cassandra vs. HBase discussion >> is >> >> tiresome and rarely do blog posts or emails in this regard shed any >> light. >> >> Often, Cassandra proponents mis-state their case out of ignorance of >> HBase >> >> or due to commercial or personal agendas. It is difficult to find clear >> eyed >> >> analysis among the partisans. I'm not sure it will make any difference >> >> posting a rebuttal to some random thing jbellis says. Better to focus on >> >> improving HBase than play whack a mole. >> >> >> >> >> >> Regarding some of the specific points in that post: >> >> >> >> HBase is proven in production deployments larger than the largest >> publicly >> >> reported Cassandra cluster, ~1K versus 400 or 700 or somesuch. But >> basically >> >> this is the same order of magnitude, with HBase having a slight edge. I >> >> don't see a meaningful difference here. Stating otherwise is false. >> >> >> >> HBase supports replication between clusters (i.e. data centers). I >> believe, >> >> but admit I'm not super familiar with the Cassandra option here, that
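The write-path description in this message (hflush blocks until the edit is replicated, and deferred flushing or writeToWAL=false changes what the client has been promised when it is acked) can be modelled with a toy sketch. This is not HBase code: ToyWal and its fields are hypothetical names, and "replicated" here only means "the simulated hflush has completed".

```python
from dataclasses import dataclass, field


@dataclass
class ToyWal:
    """Toy model of a write-ahead log; all names are illustrative only."""
    deferred_flush: bool = False                    # stands in for deferred log flushing
    pending: list = field(default_factory=list)     # edits appended but not yet hflushed
    replicated: list = field(default_factory=list)  # edits acked by every replica

    def hflush(self):
        # In this model, hflush returns only once all replicas hold the edits.
        self.replicated.extend(self.pending)
        self.pending.clear()

    def write(self, edit, write_to_wal=True):
        """Return True if the edit is replicated by the time the client is acked."""
        if not write_to_wal:
            return False          # edit skips the log entirely
        self.pending.append(edit)
        if self.deferred_flush:
            return False          # a later background hflush will replicate it
        self.hflush()
        return True


if __name__ == "__main__":
    print(ToyWal().write("row1"))                      # synchronous path
    print(ToyWal(deferred_flush=True).write("row1"))   # deferred path
```

In the sketch, only the default path acks the client after replication; the other two paths trade that guarantee for lower write latency, which is the choice the message describes as depending on settings.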
-
Re: HBase and Cassandra on StackOverflowAndrew Purtell 2011-08-31, 03:52
> > Will the write call to HBase block until the record written is fully
> > replicated ? > no. data isn't written to disk immediately Not so black and white. Full replication in HDFS != writes to disk. Full replication is acknowledgement there are replicas at all DataNodes in the pipeline, and with rack-aware placement that is at least one non-rack-local replica. In practice this is good enough to give HDFS 5 or 6 nines of data availability; Hortonworks had a blog post about that recently. In our production we do have our DataNodes patched to call fsync() when a block write is completed. This will provide some marginal improvement over the default for the case where suddenly power is lost to the whole datacenter, but marginal is the key word here. Best regards, - Andy Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White) >________________________________ >From: Joseph Boyd <[EMAIL PROTECTED]> >To: [EMAIL PROTECTED] >Sent: Wednesday, August 31, 2011 4:04 AM >Subject: Re: HBase and Cassandra on StackOverflow > >On Tue, Aug 30, 2011 at 12:22 PM, Sam Seigal <[EMAIL PROTECTED]> wrote: >> >> Will the write call to HBase block until the record written is fully >> replicated ? > >no. data isn't written to disk immediately > >> If not (since it is happening at the block level), then isn't >> there a window where a region server goes down, the data might not be >> available anywhere else, until it comes back up ? > >the data would be in the write ahead log. > > >...joe > > >> On Tue, Aug 30, 2011 at 9:17 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote: >> >> > > Is the replication strategy for HBase completely reliant on HDFS' block >> > > replication pipelining ? >> > >> > Yes. >> > >> > > Is this replication process asynchronous ? >> > >> > >> > No. >> > Best regards, >> > >> > >> > - Andy >> > >> > Problems worthy of attack prove their worth by hitting back. 
- Piet Hein >> > (via Tom White) >> > >> > >> > >________________________________ >> > >From: Sam Seigal <[EMAIL PROTECTED]> >> > >To: [EMAIL PROTECTED]; Andrew Purtell <[EMAIL PROTECTED]> >> > >Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> >> > >Sent: Tuesday, August 30, 2011 7:35 PM >> > >Subject: Re: HBase and Cassandra on StackOverflow >> > > >> > >A question inline: >> > > >> > >On Tue, Aug 30, 2011 at 2:47 AM, Andrew Purtell <[EMAIL PROTECTED]> >> > wrote: >> > > >> > >> Hi Chris, >> > >> >> > >> Appreciate your answer on the post. >> > >> >> > >> Personally speaking however the endless Cassandra vs. HBase discussion >> > is >> > >> tiresome and rarely do blog posts or emails in this regard shed any >> > light. >> > >> Often, Cassandra proponents mis-state their case out of ignorance of >> > HBase >> > >> or due to commercial or personal agendas. It is difficult to find clear >> > eyed >> > >> analysis among the partisans. I'm not sure it will make any difference >> > >> posting a rebuttal to some random thing jbellis says. Better to focus on >> > >> improving HBase than play whack a mole. >> > >> >> > >> >> > >> Regarding some of the specific points in that post: >> > >> >> > >> HBase is proven in production deployments larger than the largest >> > publicly >> > >> reported Cassandra cluster, ~1K versus 400 or 700 or somesuch. But >> > basically >> > >> this is the same order of magnitude, with HBase having a slight edge. I >> > >> don't see a meaningful difference here. Stating otherwise is false. >> > >> >> > >> HBase supports replication between clusters (i.e. data centers). I >> > believe, >> > >> but admit I'm not super familiar with the Cassandra option here, that >> > the >> > >> main difference is HBase provides simple mechanism and the user must >> > build a >> > >> replication architecture useful for them; while Cassandra attempts to >> > hide >> > >> some of that complexity. 
I do not know if they succeed there, but large >> > >> scale cross data center replication is rarely one size fits all so I
-
Re: HBase and Cassandra on StackOverflowTime Less 2011-08-31, 05:34
Most of your points are dead-on.
> Cassandra is no less complex than HBase. All of this complexity is > "hidden" in the sense that with Hadoop/HBase the layering is obvious -- > HDFS, HBase, etc. -- but the Cassandra internals are no less layered. > > Operationally, however, HBase is more complex. Admins have to configure > and manage ZooKeeper, HDFS, and HBase. Could this be improved? > I strongly disagree with the premise[1]. Having personally been involved in the Digg Cassandra rollout, and spent up until a couple months ago being in part-time weekly contact with the Digg Cassandra administrator, and having very close ties to the SimpleGeo Cassandra admin, I know it is a fickle beast. Having also spent a good amount of time at StumbleUpon and Mozilla (and now Riot Games) I also see first-hand that HBase is far more stable and -- dare I say it? -- operationally simpler. So okay, HBase is "harder to set up" if following a step-by-step guide on a wiki is "hard,"[2] but it's FAR easier to administer. Cassandra is rife with cascading cluster failure scenarios. I would not recommend running Cassandra in a highly-available high-volume data scenario, but don't hesitate to do so for HBase. I do not know if this is a guaranteed (provable due to architecture) result, or just the result of the Cassandra community being... how shall I say... hostile to administrators. But then, to me it doesn't matter. Results do. -- Tim Ellis Data Architect, Riot Games [1] That said, the other part of your statement is spot-on, too. It's surely possible to improve the HBase architecture or simplify it. [2] I went from having never set up HBase nor ever used Chef to having functional Chef recipes that installed a functional HBase/HDFS cluster in about 2 weeks. From my POV, the biggest stumbling point was that HDFS by default stores critical data in the underlying filesystem's /tmp directory, which is, for lack of a better word, insane.
If I had to suggest how to simplify "HBase installation," I'd ask for sane HDFS config files that are extremely common and difficult-to-ignore.
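To make the /tmp complaint in footnote [2] concrete, here is a minimal sketch in Python of a check one might run against a Hadoop configuration file before starting a cluster. It assumes the Hadoop 0.20-era property names `hadoop.tmp.dir`, `dfs.name.dir`, and `dfs.data.dir`; the helper `find_tmp_paths` and the sample XML are made up for illustration. It only inspects values that are explicitly present in the file, so a property that is simply missing (and therefore falls back to a default) would need a separate check.

```python
import xml.etree.ElementTree as ET

# Hadoop 0.20-era properties that decide where HDFS keeps its data (assumed names).
STORAGE_PROPS = ("hadoop.tmp.dir", "dfs.name.dir", "dfs.data.dir")


def find_tmp_paths(config_xml: str) -> dict:
    """Return {property: value} for storage properties that point into /tmp.

    Hypothetical helper for illustration; it only looks at values that are
    explicitly present in the given XML text.
    """
    root = ET.fromstring(config_xml)
    risky = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name not in STORAGE_PROPS:
            continue
        # dfs.*.dir values may be comma-separated lists of directories.
        dirs = [d.strip() for d in value.split(",") if d.strip()]
        if any(d == "/tmp" or d.startswith("/tmp/") for d in dirs):
            risky[name] = value
    return risky


# Made-up sample configuration: one risky setting, one sane one.
SAMPLE = """
<configuration>
  <property><name>dfs.name.dir</name><value>/tmp/hadoop/dfs/name</value></property>
  <property><name>dfs.data.dir</name><value>/data/1/dfs,/data/2/dfs</value></property>
</configuration>
"""

if __name__ == "__main__":
    for name, value in find_tmp_paths(SAMPLE).items():
        print(f"WARNING: {name} points into /tmp: {value}")
```

A check like this could run from a Chef or Puppet recipe as a guard, which is one way to make a bad default "difficult to ignore".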
-
Re: HBase and Cassandra on StackOverflowTime Less 2011-08-31, 05:47
> > If you use N=3, W=3, R=1 in Cassandra, you should get similar behavior
> > to HBase/HDFS with respect to consistency and availability
>
> My understanding is that R=1 does not guarantee that you won't see
> different versions of the data in different reads, in some scenarios. There
> was an excellent Quora answer in this regard, I don't remember it offhand,
> perhaps you can find the link to it or someone can provide it to you.

Since this is fairly off-topic at this point, I'll keep it short. The simple rule for Dynamo goes like this: if (R+W>N && W>=Quorum), then you're guaranteed a consistent result always. You get eventual consistency if W>=Quorum. If W<Quorum, then you can get inconsistent data that must be detected/fixed by readers (often using timestamps or similar techniques).

Joe is right, enforcing (W=3, R=1, N=3) on a Dynamo system gives the same (provably identical?) behaviour as HBase, with respect to consistency.

--
Tim Ellis
Data Architect, Riot Games
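The rule of thumb in this message can be written down as a small function. The following Python sketch encodes only the rule as stated here; it assumes "Quorum" means a simple majority of the N replicas, and the function names are invented for the example. It says nothing about how any particular Dynamo-style system behaves in corner cases.

```python
def quorum(n: int) -> int:
    """Majority of n replicas (assumed meaning of "Quorum" in the rule)."""
    return n // 2 + 1


def consistency_level(n: int, r: int, w: int) -> str:
    """Classify (N, R, W) according to the rule of thumb quoted above."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    q = quorum(n)
    if r + w > n and w >= q:
        return "consistent"        # every read overlaps the latest write
    if w >= q:
        return "eventual"          # writes reach a majority, reads may lag
    return "reader-repaired"       # readers must detect and fix conflicts


if __name__ == "__main__":
    for n, r, w in [(3, 1, 3), (3, 2, 2), (3, 1, 2), (3, 1, 1)]:
        print(f"N={n} R={r} W={w}: {consistency_level(n, r, w)}")
```

Under these assumptions, both (N=3, R=1, W=3) and (N=3, R=2, W=2) land in the "consistent" bucket, while (N=3, R=1, W=2) is only "eventual".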
-
Re: HBase and Cassandra on StackOverflowAndrew Purtell 2011-08-31, 06:44
> > > If you use N=3, W=3, R=1 in Cassandra, you should get similar behavior
> > > to HBase/HDFS with respect to consistency and availability
> >
> > My understanding is that R=1 does not guarantee that you won't see
> > different versions of the data in different reads, in some scenarios.
>
> Since this is fairly off-topic at this point, I'll keep it short.
> The simple rule for Dynamo goes like this: if (R+W>N && W>=Quorum),
> then you're guaranteed a consistent result always.

Ok, I'll concede this point rather than go really off-topic with conjecture about corner cases, especially given I'm not a Cassandra expert by any means and could simply be mistaken.

However this is still not quite HBase-equivalent consistency. HBase can provide CAS operations and atomic counters because only one regionserver at a time can mediate operations on a given row.

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
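The point about CAS and atomic counters can be illustrated with a toy model. The Python sketch below is not the HBase API; `RowOwner`, `check_and_put`, and `increment` are hypothetical names. It only shows why routing every operation on a row through a single mediator makes compare-and-set and counters straightforward: one lock in one place is enough.

```python
import threading


class RowOwner:
    """Toy stand-in for the single server that mediates a set of rows.

    Not the HBase API: every operation on a row passes through this one
    object, so a plain lock is enough to make read-modify-write atomic.
    """

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def check_and_put(self, row, expected, new_value) -> bool:
        """Write new_value only if the row currently holds expected."""
        with self._lock:
            if self._rows.get(row) != expected:
                return False
            self._rows[row] = new_value
            return True

    def increment(self, row, amount=1) -> int:
        """Atomically add amount to a counter row and return the new value."""
        with self._lock:
            value = self._rows.get(row, 0) + amount
            self._rows[row] = value
            return value


def run_demo(threads: int = 8, per_thread: int = 1000) -> int:
    """Hammer one counter from several threads and return its final value."""
    owner = RowOwner()

    def worker():
        for _ in range(per_thread):
            owner.increment("hits")

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return owner.increment("hits", 0)  # adding 0 just reads the counter


if __name__ == "__main__":
    print("final counter:", run_demo())
```

With several replicas accepting writes for the same row independently, there is no single place to hold such a lock, which is the distinction being drawn in the message above.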
-
Re: HBase and Cassandra on StackOverflowEdward Capriolo 2011-08-31, 17:51
On Tue, Aug 30, 2011 at 1:42 PM, Joe Pallas <[EMAIL PROTECTED]>wrote:
> > On Aug 30, 2011, at 2:47 AM, Andrew Purtell wrote: > > > Better to focus on improving HBase than play whack a mole. > > Absolutely. So let's talk about improving HBase. I'm speaking here as > someone who has been learning about and experimenting with HBase for more > than six months. > > > HBase supports replication between clusters (i.e. data centers). > > That’s … debatable. There's replication support in the code, but several > times in the recent past when someone asked about it on this mailing list, > the response was “I don't know of anyone actually using it.” My > understanding of replication is that you can't replicate any existing data, > so unless you activated it on day one, it isn't very useful. Do I > misunderstand? > > > Cassandra does not have strong consistency in the sense that HBase > provides. It can provide strong consistency, but at the cost of failing any > read if there is insufficient quorum. HBase/HDFS does not have that > limitation. On the other hand, HBase has its own and different scenarios > where data may not be immediately available. The differences between the > systems are nuanced and which to use depends on the use case requirements. > > That's fair enough, although I think your first two sentences nearly > contradict each other :-). If you use N=3, W=3, R=1 in Cassandra, you > should get similar behavior to HBase/HDFS with respect to consistency and > availability ("strong" consistency and reads do not fail if any one copy is > available). > > A more important point, I think, is the one about storage. HBase uses two > different kinds of files, data files and logs, but HDFS doesn't know about > that and cannot, for example, optimize data files for write throughput (and > random reads) and log files for low latency sequential writes. (For > example, how could performance be improved by adding solid-state disk?) 
> > > Cassandra's RandomPartitioner / hash based partitioning means efficient > MapReduce or table scanning is not possible, whereas HBase's distributed > ordered tree is naturally efficient for such use cases, I believe explaining > why Hadoop users often prefer it. This may or may not be a problem for any > given use case. > > I don't think you can make a blanket statement that random partitioning > makes efficient MapReduce impossible (scanning, yes). Many M/R tasks > process entire tables. Random partitioning has definite advantages for some > cases, and HBase might well benefit from recognizing that and adding some > support. > > > Cassandra is no less complex than HBase. All of this complexity is > "hidden" in the sense that with Hadoop/HBase the layering is obvious -- > HDFS, HBase, etc. -- but the Cassandra internals are no less layered. > > Operationally, however, HBase is more complex. Admins have to configure > and manage ZooKeeper, HDFS, and HBase. Could this be improved? > > > With Cassandra, all RPC is via Thrift with various wrappers, so actually > all Cassandra clients are second class in the sense that jbellis means when > he states "Non-Java clients are not second-class citizens". > > That's disingenuous. Thrift exposes all of the Cassandra API to all of the > wrappers, while HBase clients who want to use all of the HBase API must use > Java. That can be fixed, but it is the status quo. > > joe > > Hooked into another Cassandra hbase thread... Cassandra's RandomPartitioner / hash based partitioning means efficient MapReduce or table scanning is not possible, whereas HBase's distributed ordered tree is naturally efficient for such use cases, I believe explaining why Hadoop users often prefer it. This may or may not be a problem for any given use case. Many people can and do benefit with this property of HBase. 
Efficient map/reduce still strikes me as an oxymoron :) Yes, you can 'push down' something like 'WHERE key > x and key < y'. It is pretty nifty. That does not really bring you all the way to complex queries. Cassandra now has support for built-in secondary indexes, and I think soon users will be able to 'push down' where clauses for 'efficient' map reduce. Also, you can currently range scan on columns (in both directions) in C*, which is efficient. So if you can turn a key-ranging design into a column-ranging design you can get the same effect. With both systems, HBase and Cassandra, you likely end up needing to design data around your queries.

> Cassandra is no less complex than HBase. All of this complexity is
> "hidden" in the sense that with Hadoop/HBase the layering is obvious --
> HDFS, HBase, etc. -- but the Cassandra internals are no less layered.

*This is an opinion*. I will disagree on this one. For example, the Cassandra gossip protocol exchanges two facts (IMHO): 'the state of the ring UP/DOWN' and the 'token ownership' of nodes. This information only changes when nodes join or leave the cluster. On the HBase side of things, many small regions are splitting and moving often; this involves communication between several components, let's say master, ZK, and region servers.

One-time setup complexity is one factor; monitoring and troubleshooting is another. You also have to consider:

1) making your NameNode actually redundant, for which you need to depend on Linux-HA or multiple NFS servers
2) some way of protecting your masters/ZK nodes from processor/disk starvation (i.e. they need their own machine)
3) Java's semi-piggish memory usage profile: it rarely gives memory back to the OS, so sharing a system among multiple Java processes (DataNode, RegionServer, TaskTracker on the same box) is not ideal, because each process tends to bubble up to higher than Xmx!

The one-JVM-per-node Cassandra stack is less complex architecturally. I would argue administratively too, but I do not know of anyone with ROI numbers on ten-node Cassandra vs HBase clusters :)
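The scanning argument in this exchange (hash-based versus ordered partitioning) can be seen in a few lines. This Python sketch is a simplification under stated assumptions: placement is "hash mod node count" rather than a token ring, MD5 merely stands in for whatever hash a hash-based partitioner uses, and the key names and node count are invented.

```python
import hashlib

NODES = 4
KEYS = [f"row{i:04d}" for i in range(1000)]  # made-up, already sorted row keys


def hashed_node(key: str) -> int:
    """Hash-based placement (simplified to hash mod node count)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % NODES


def ordered_node(key: str) -> int:
    """Ordered placement: each node owns one contiguous slice of sorted keys."""
    return KEYS.index(key) * NODES // len(KEYS)


def nodes_touched(placement, start: str, stop: str) -> int:
    """How many nodes hold at least one key in [start, stop)."""
    return len({placement(k) for k in KEYS if start <= k < stop})


if __name__ == "__main__":
    print("ordered:", nodes_touched(ordered_node, "row0100", "row0200"))
    print("hashed: ", nodes_touched(hashed_node, "row0100", "row0200"))
```

A narrow key range stays on one node under ordered placement and is scattered across nodes under hashing. A job that reads the whole table touches every node either way, which is the counterpoint raised earlier in the thread.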
-
Re: HBase and Cassandra on StackOverflowGary Helmling 2011-08-31, 18:21
> Since this is fairly off-topic at this point, I'll keep it short. The simple
> rule for Dynamo goes like this: if (R+W>N && W>=Quorum), then you're
> guaranteed a consistent result always. You get eventual consistency if
> W>=Quorum. If W<Quorum, then you can get inconsistent data that must be
> detected/fixed by readers (often using timestamps or similar techniques).
> Joe is right, enforcing (W=3, R=1, N=3) on a Dynamo system gives the same
> (provably identical?) behaviour as HBase, with respect to consistency.

For those interested in a comparison of the consistency behavior, there's an older, but really excellent thread on Quora with detailed analysis:

http://www.quora.com/How-does-HBase-write-performance-differ-from-write-performance-in-Cassandra-with-consistency-level-ALL

Don't miss the last answer in the thread. It's unfortunately collapsed due to some Quora policy, but it contains some of the best details.
-
Re: HBase and Cassandra on StackOverflowAndrew Purtell 2011-09-01, 04:35
> http://www.quora.com/How-does-HBase-write-performance-differ-from-write-performance-in-Cassandra-with-consistency-level-ALL

Thanks, that was what I was referring to earlier in this thread. Now bookmarked.

Comments there from those more knowledgeable about Cassandra than I seem to indicate that N=3,W=3,R=1 is not practical (one commenter I know to be an expert characterizes it as "suicidal"), and the comments in the collapsed answer indicate there are corner cases known to Cassandra experts where HBase-equivalent strong consistency cannot be maintained even with that setting. So it seems that claims that Cassandra can provide consistency equivalent to HBase are erroneous.

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
-
Re: HBase and Cassandra on StackOverflowEdward Capriolo 2011-09-01, 17:53
On Wed, Aug 31, 2011 at 1:34 AM, Time Less <[EMAIL PROTECTED]> wrote:
> Most of your points are dead-on.
>
> > Cassandra is no less complex than HBase. All of this complexity is
> > "hidden" in the sense that with Hadoop/HBase the layering is obvious --
> > HDFS, HBase, etc. -- but the Cassandra internals are no less layered.
> >
> > Operationally, however, HBase is more complex. Admins have to configure
> > and manage ZooKeeper, HDFS, and HBase. Could this be improved?
>
> I strongly disagree with the premise[1]. Having personally been involved in
> the Digg Cassandra rollout, and spent up until a couple months ago being in
> part-time weekly contact with the Digg Cassandra administrator, and having
> very close ties to the SimpleGeo Cassandra admin, I know it is a fickle
> beast. Having also spent a good amount of time at StumbleUpon and Mozilla
> (and now Riot Games) I also see first-hand that HBase is far more stable and
> -- dare I say it? -- operationally more simple.
>
> So okay, HBase is "harder to set up" if following a step-by-step guide on a
> wiki is "hard,"[2] but it's FAR easier to administer. Cassandra is rife with
> cascading cluster failure scenarios. I would not recommend running Cassandra
> in a highly-available high-volume data scenario, but don't hesitate to do so
> for HBase.
>
> I do not know if this is a guaranteed (provable due to architecture) result,
> or just the result of the Cassandra community being... how shall I say...
> hostile to administrators. But then, to me it doesn't matter. Results do.
>
> --
> Tim Ellis
> Data Architect, Riot Games
>
> [1] That said, the other part of your statement is spot-on, too. It's surely
> possible to improve the HBase architecture or simplify it.
> [2] I went from having never set up HBase nor ever used Chef to having
> functional Chef recipes that installed a functional HBase/HDFS cluster in
> about 2 weeks. From my POV, the biggest stumbling point was that HDFS by
> default stores critical data in the underlying filesystem's /tmp directory
> by default, which is, for lack of a better word, insane. If I had to suggest
> how to simplify "HBase installation," I'd ask for sane HDFS config files
> that are extremely common and difficult-to-ignore.

Why are you quoting "harder"? What was said was "more complex". Setting up N things is more complex than setting up a single thing.

First, you have to learn:

1) Linux HA
2) DRBD

right out of the gate just to have a redundant name node. This is not easy, fast, or simple. In fact this is quite a pain.

http://docs.google.com/viewer?a=v&q=cache:9rnx-eRzi1AJ:files.meetup.com/1228907/Hadoop%2520Namenode%2520High%2520Availability.pptx+linux+ha+namenode&hl=en&gl=us&pid=bl&srcid=ADGEESig5aJNVAXbLgBwyc311sPSd88jUJbKHx4z2PQtDKHnmM1FuCJpg2IUyqi5JrmUL3RbCb8QRYsjHnP74YuKQfOQXoUZxnhrCy6N1kVpiG1jNi4zhqoKlUTmoDaqS1NegCFb6-WM&sig=AHIEtbQbjN1Olwxui5JmywdWzhqv4Hq3tw&pli=1

Doing it properly involves setting up physical wires between servers or link aggregation groups. You can't script having someone physically run crossover cables. You need your switching engineer to set up LAGs. Also, you may notice that everyone who describes this setup is describing it using Linux-HA V1, which has been deprecated for over 2 years. That also demonstrates that this process is so complicated that people tend to touch it once and never touch it again because of how fragile it is.

You are also implying that following the wiki is easy. Personally, I find that the wiki has fine detail, but it is confusing. Here is why.

"1.3.1.2. hadoop

This version of HBase will only run on Hadoop 0.20.x. It will not run on hadoop 0.21.x (nor 0.22.x). HBase will lose data unless it is running on an HDFS that has a durable sync. Currently only the branch-0.20-append branch has this attribute[1]. No official releases have been made from this branch up to now so you will have to build your own Hadoop from the tip of this branch. Michael Noll has written a detailed blog, Building an Hadoop 0.20.x version for HBase 0.90.2, on how to build an Hadoop from branch-0.20-append. Recommended.

Or rather than build your own, you could use Cloudera's CDH3. CDH has the 0.20-append patches needed to add a durable sync (CDH3 betas will suffice; b2, b3, or b4)."

So the setup starts by recommending rolling your own hadoop (pain in the ass). OR using a beta ( :( ).

Then it gets onto HBase, where it branches into "Standalone HBase" and Section 1.3.2.2, "Distributed". Then it branches into "pseudo-distributed" and "fully distributed", and then the ZooKeeper section offers you two options: "1.3.2.2.2.2. ZooKeeper" and "1.3.2.2.2.2.1. Using existing ZooKeeper ensemble".

Not to say this is hard or impossible, but it is a lot of information to digest, and all the branching decisions are hard for a first-time user to understand.

Uppercasing the word FAR does not prove to me that HBase is easier to administer, nor do your employment history or second-hand stories from unnamed people you know.

I can tell you why I think Cassandra is easier to manage:

1) There is only one log file, /var/log/cassandra/system.log
2) There is only one configuration folder, /usr/local/cassandra/conf (cassandra.yaml, cassandra-env.sh)
3) I do not need to keep a chart or Post-it notes showing where all these one-off components are: ZK server list, HBase master server list, namenode
4) No need to configure auxiliary stuff such as DRBD or Linux-HA

*Fud ALARM* "Cassandra is rife with cascading cluster failure scenarios." ....and hbase never has issues apparently. (remember I am on both lists)

Also...

> [2] I went from having never set up HBase nor ever used Chef to having
> functional Chef recipes that installed a functional HBase/HDFS cluster in
> about 2 weeks.

It took me about one hour to accomplish the same result with puppet + cassandra.

http://www.jointhegrid.com/highperfcassandra/?p=62
-
Re: HBase and Cassandra on StackOverflowRyan Rawson 2011-09-01, 19:12
On Thu, Sep 1, 2011 at 10:53 AM, Edward Capriolo <[EMAIL PROTECTED]> wrote:
> On Wed, Aug 31, 2011 at 1:34 AM, Time Less <[EMAIL PROTECTED]> wrote: > >> Most of your points are dead-on. >> >> > Cassandra is no less complex than HBase. All of this complexity is >> > "hidden" in the sense that with Hadoop/HBase the layering is obvious -- >> > HDFS, HBase, etc. -- but the Cassandra internals are no less layered. >> > >> > Operationally, however, HBase is more complex. Admins have to configure >> > and manage ZooKeeper, HDFS, and HBase. Could this be improved? >> > >> >> I strongly disagree with the premise[1]. Having personally been involved in >> the Digg Cassandra rollout, and spent up until a couple months ago being in >> part-time weekly contact with the Digg Cassandra administrator, and having >> very close ties to the SimpleGeo Cassandra admin, I know it is a fickle >> beast. Having also spent a good amount of time at StumbleUpon and Mozilla >> (and now Riot Games) I also see first-hand that HBase is far more stable >> and >> -- dare I say it? -- operationally more simple. >> >> So okay, HBase is "harder to set up" if following a step-by-step guide on a >> wiki is "hard,"[2] but it's FAR easier to administer. Cassandra is rife >> with >> cascading cluster failure scenarios. I would not recommend running >> Cassandra >> in a highly-available high-volume data scenario, but don't hesitate to do >> so >> for HBase. >> >> I do not know if this is a guaranteed (provable due to architecture) >> result, >> or just the result of the Cassandra community being... how shall I say... >> hostile to administrators. But then, to me it doesn't matter. Results do. >> >> -- >> Tim Ellis >> Data Architect, Riot Games >> [1] That said, the other part of your statement is spot-on, too. It's >> surely >> possible to improve the HBase architecture or simplify it. >> [2] I went from having never set up HBase nor ever used Chef to having >> functional Chef recipes that installed a functional HBase/HDFS cluster in >> about 2 weeks. 
From my POV, the biggest stumbling point was that HDFS by >> default stores critical data in the underlying filesystem's /tmp directory >> by default, which is, for lack of a better word, insane. If I had to >> suggest >> how to simplify "HBase installation," I'd ask for sane HDFS config files >> that are extremely common and difficult-to-ignore. >> > > Why are you quoting "harder" what was said was "more complex". Setting up N > things is more complex then setting up a single thing. > > First, you have to learn: > 1) Linux HA > 2) DRDB > > Right out of the gate just to have a redundant name node. Eh, no one would do that. If you want a redundant name node your only choice is to use Mapr, which I would def recommend since you get a better nn "fail-over" w/o service interruption and significantly higher performance than hdfs. > > This is not easy, fast, or simple. In fact this is quite a pain. > http://docs.google.com/viewer?a=v&q=cache:9rnx-eRzi1AJ:files.meetup.com/1228907/Hadoop%2520Namenode%2520High%2520Availability.pptx+linux+ha+namenode&hl=en&gl=us&pid=bl&srcid=ADGEESig5aJNVAXbLgBwyc311sPSd88jUJbKHx4z2PQtDKHnmM1FuCJpg2IUyqi5JrmUL3RbCb8QRYsjHnP74YuKQfOQXoUZxnhrCy6N1kVpiG1jNi4zhqoKlUTmoDaqS1NegCFb6-WM&sig=AHIEtbQbjN1Olwxui5JmywdWzhqv4Hq3tw&pli=1 > > Doing it properly involves setting up physical wires between servers or link > aggregation groups. You can't script having someone physically run crossover > cables. You need your switching engineer to set up LAG's. > Also you may notice that everyone that describes this setup is also > describing it using linux-ha V1 which was deprecated for over 2 years. Which > also demonstrates how this process is so complicated people tend to touch it > and never touch it again because of how fragile it is. > > You are also implying that following the wiki is easy. Personally, I find > that the wiki has fine detail, but it is confusing. > Here is why. > > "1.3.1.2. 
hadoop >

Moving forward, my plan is to only deploy HBase on top of MapR for real-time situations where at all possible. HDFS isn't there yet; 2.5 years ago I was optimistic, and they still have more years to go. In the meantime, with MapR you get yourself HA, better performance, and hopefully better error recovery.

Just as an aside, no one does #4. As for #3, what you are really saying is "I don't want to have good sysadmin/automation practices" - sure, a lot of people don't, but if you do, #3 is a non-issue. Chef can help.

This is not FUD, it's a legitimate concern. The issue isn't whether one system has failures or not, because all fail, but HOW they fail. And that also leads to HOW you determine what the root cause is, and HOW you recover. This sounds like a difference of opinion, but there are practicalities of how you admin and deal with 3am failure modes. I think this is the place where HBase shines very well, but this is a story you can't tell without people crying "FUD" since it's complex and thus doesn't translate well.

I also would posit that the HBase master is a _good_ thing. It provides a management point, it doesn't participate in the query path, and it is not a major scaling issue. It lets you give definitive answers to things like "how busy is my cluster", "what is online/offline", "what tables are there", etc. It handles failures in a highly explicit manner, which is good.
-
Re: HBase and Cassandra on StackOverflowTime Less 2011-09-01, 22:13
> Why are you quoting "harder" what was said was "more complex". Setting up N
> things is more complex then setting up a single thing.

Okay. Sorry for misinterpreting your meaning. You're right, it's more complex to set up.

> You are also implying that following the wiki is easy. Personally, I find
> that the wiki has fine detail, but it is confusing.

True. Running a world-class distributed database isn't trivial. And yeah, sorry for implying following the wiki is easy. It was for me, but it may not be for others.

> Uppercasing the word FAR does not prove to me that hbase is easier to
> administer nor does the your employment history or second hand stories
> unnamed from people you know.

A lot of people think credentials are important, especially in this particular debate of Cassandra vs. HBase, where obviously technical details are ignored. My point is, I've worked extremely closely with the flagship deploys of both (Apache) Cassandra and HBase and continue to work closely with the people who still have to run this stuff at volume today. I'm sorry you don't find these details important.

> > [2] I went from having never set up HBase nor ever used Chef to having
> > functional Chef recipes that installed a functional HBase/HDFS cluster in
> > about 2 weeks.
>
> It took me about one hour to accomplish the same result with puppet +
> cassandra.
> http://www.jointhegrid.com/highperfcassandra/?p=62

Something being easy to set up is entirely different from it working at scale. Note I don't mention how long it took me to set up SQLite or write Chef recipes for it. The whole point of Puppet and Chef is to manage complexity, which you'll need when running a world-class distributed database.

--
Tim Ellis
Data Architect, Riot Games
-
Re: HBase and Cassandra on StackOverflowArun C Murthy 2011-09-01, 22:37
On Sep 1, 2011, at 10:53 AM, Edward Capriolo wrote: > "1.3.1.2. hadoop > > This version of HBase will only run on Hadoop 0.20.x. It will not run on > hadoop 0.21.x (nor 0.22.x). HBase will lose data unless it is running on an > HDFS that has a durable sync. Currently only the branch-0.20-append branch > has this attribute[1]. No official releases have been made from this branch > up to now so you will have to build your own Hadoop from the tip of this > branch. Michael Noll has written a detailed blog, Building an Hadoop 0.20.x > version for HBase 0.90.2, on how to build an Hadoop from branch-0.20-append. > Recommended. > > Or rather than build your own, you could use Cloudera's CDH3. CDH has the > 0.20-append patches needed to add a durable sync (CDH3 betas will suffice; > b2, b3, or b4)." > > So the setup starts by recommending rolling your own hadoop (pain in the > ass). OR using a beta ( :( ). It's early days yet, but we seem to be converging towards having a grand unification of security and append patchsets for hadoop-0.20.205 release from Apache. Arun
-
Re: HBase and Cassandra on StackOverflowAndrew Purtell 2011-09-02, 00:41
> So the setup starts by recommending rolling your own hadoop (pain in the
> ass). OR using a beta ( :( ).

CDH3 is not in beta. The latest version is a release, CDH3U1. I think most people at this point will just use CDH, so all of that about rolling your own compile of Hadoop sources -- that is hard? ("ant") -- is a non-issue.

> First, you have to learn:
> 1) Linux HA
> 2) DRDB
>
> Right out of the gate just to have a redundant name node.

Likewise HA namenode. Most don't do that, I suspect. However, we did. Having a modicum of Linux system administration experience, we were already familiar with DRBD and the RHEL Cluster Suite, so this was not anything we had not seen before.

Maybe you are arguing Cassandra is easier for noobs to set up? I guess that's great. But I would not want such a person running my production, and I can't see how any serious person would.

> *Fud ALARM* "Cassandra is rife with cascading cluster failure
> scenarios."
> ....and hbase never has issues apparently. (remember I am on both lists)

What Ryan said regarding this, I agree completely. I've had occasion over the years to wrangle both master-slave and peer-to-peer systems in various failure modes. In many cases a master gives you a single point of control to regain control of an errant system. There is no such thing in a P2P system; you have to shut down everything and reinitialize.

However, refer to my response to the mail that started this thread. Whether master-slave or P2P architecture is appropriate for a given use case involves a series of trade-offs. There is no simple answer. Neither is superior to the other.

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
-
RE: HBase and Cassandra on StackOverflowMichael Segel 2011-09-02, 00:47
> Date: Thu, 1 Sep 2011 15:13:13 -0700
> Subject: Re: HBase and Cassandra on StackOverflow
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
>
[BIG SNIP]

While you guys are going back and forth... a simple reminder.

Not everyone has the same base level of experience, so their ability to 'cookbook' an install will vary. Not every company is trying to solve the same problem, so in using either beast, their experiences will vary.

Can't we just all get along? :-)

Both HBase and Cassandra are tools. While similar, they have a different fit within the portfolio of options. It's like being on a golf course and trying to decide if you should use a 7 iron or an 8 iron ...

-Mike
-
Re: HBase and Cassandra on StackOverflowAndrew Purtell 2011-09-02, 02:27
> From: Michael Segel <[EMAIL PROTECTED]>
> Can't we just all get along? :-) My personal introduction to Cassandra came maybe in the 2009 timeframe. We evaluated it and HBase at the time and chose HBase. No point to discuss why, the world has changed many times over. From there, my involvement in the HBase project grew and I didn't think of or hear about Cassandra for a long time. Then began an aggressive marketing campaign by Cassandra proponents that spoke negatively about HBase at every opportunity. It was everywhere whether one cared about such things or not. There was also an untrue (but easy to fudge with "marketing" given the technology differences are complex and nuanced) and quite insulting assertion that Cassandra is a superset of HBase. I believe this persists even today. So, no, frankly we cannot get along. If this were a passionless technical discussion that would not be the case. However, from my perspective, which I believe is shared by others on the HBase side, the Cassandra project is run by asshats and some of their boosters share that unfortunate trait. At this point I'd like to go back to ignoring Cassandra, and hopefully will not have occasion to deal with "Cassandra vs. HBase" again for many months. Best regards, - Andy
-
Re: HBase and Cassandra on StackOverflow Joseph Pallas 2011-09-02, 17:27
Drifting off topic a bit …
On Sep 1, 2011, at 12:12 PM, Ryan Rawson wrote:

>> First, you have to learn:
>> 1) Linux HA
>> 2) DRBD
>>
>> Right out of the gate just to have a redundant name node.
>
> Eh, no one would do that. If you want a redundant name node your only choice is to use MapR, which I would def recommend since you get a better nn "fail-over" w/o service interruption and significantly higher performance than hdfs.

Really? People running offline analytics may be fine with an hour of downtime [<http://hadoopblog.blogspot.com/2010/02/hadoop-namenode-high-availability.html> <http://www.hortonworks.com/data-integrity-and-availability-in-apache-hadoop-hdfs/>] for their M/R jobs, but people running interactive services do not find that acceptable.

Is my only option to avoid significant downtime in the event of a name node failure a closed-source offering that has already demonstrated at least one serious data-loss issue <http://answers.mapr.com/questions/415/hbase-table-disappear-after-failover-attempt-and-fall-back>?

I don’t really mean to criticize MapR: they were victims of a hidden dependency, but that’s what happens when you replace part of an integrated stack. And that is why I find your suggestion that I should not expect to use the integrated stack a little unnerving, because I'm looking at HBase for an online application.

joe
-
Re: HBase and Cassandra on StackOverflow Ryan Rawson 2011-09-02, 17:47
On Fri, Sep 2, 2011 at 10:27 AM, Joseph Pallas <[EMAIL PROTECTED]> wrote:
> Drifting off topic a bit …
>
> On Sep 1, 2011, at 12:12 PM, Ryan Rawson wrote:
>
>>> First, you have to learn:
>>> 1) Linux HA
>>> 2) DRBD
>>>
>>> Right out of the gate just to have a redundant name node.
>>
>> Eh, no one would do that. If you want a redundant name node your only choice is to use MapR, which I would def recommend since you get a better nn "fail-over" w/o service interruption and significantly higher performance than hdfs.
>
> Really? People running offline analytics may be fine with an hour of downtime [<http://hadoopblog.blogspot.com/2010/02/hadoop-namenode-high-availability.html> <http://www.hortonworks.com/data-integrity-and-availability-in-apache-hadoop-hdfs/>] for their M/R jobs, but people running interactive services do not find that acceptable.
>
> Is my only option to avoid significant downtime in the event of a name node failure a closed-source offering that has already demonstrated at least one serious data-loss issue <http://answers.mapr.com/questions/415/hbase-table-disappear-after-failover-attempt-and-fall-back>?

Well, actually... yes.

HA/DRBD flip will take at the very least 10-30 seconds, and possibly 10 minutes or longer if your cluster is really big. Avatar node presumes a $250k netapp, and still has a 10-30 second flip time once you trigger it. The NN-HA work is still WIP.

You could always use ceph, right?

> I don’t really mean to criticize MapR: they were victims of a hidden dependency, but that’s what happens when you replace part of an integrated stack. And that is why I find your suggestion that I should not expect to use the integrated stack a little unnerving, because I'm looking at HBase for an online application.
-
Re: HBase and Cassandra on StackOverflow Jacques 2011-09-02, 18:18
Don't forget that Gluster just released a beta open source Hadoop connector.
Their "we'll just dip a toe in the Hadoop community" approach doesn't inspire confidence. On the other hand, they have a decent track record regarding larger HA file system setups and offer many things that MapR offers (e.g. NFS, built as a distributed file system since day 1, etc.) along with open source.

I agree with Joe that there aren't great options with regard to HA.

1. If you're okay with closed source and a big price tag (list is like 4k/node), MapR is probably your best option.
2. If you're Facebook or Yahoo you can make a solution work because you have the manpower.
3. If you're not either 1 or 2, you're kinda stuck on the Hadoop side of things: you use the best hardware you can for the namenode and use either DRBD or a redundant SAN (which can be had for much less than 250k).

I strongly believe that things have the potential to change substantially within the next 12 months. (More optimistic than Ryan, maybe because he has seen the Hadoop community thrashing for longer.)

And yes, ceph is getting closer all the time.
-
Re: HBase and Cassandra on StackOverflow Time Less 2011-09-02, 20:44
> > Can't we just all get along? :-)
> ...
> So, no, frankly we cannot get along. If this were a passionless technical discussion that would not be the case. However, from my perspective, which I believe is shared by others on the HBase side, the Cassandra project is run by asshats and some of their boosters share that unfortunate trait.

I have an even more unfortunate position than Andy in this. I started out as perhaps the one administrator who was most qualified to be an Apache Cassandra proponent. I was the data director at the flagship premier big-data shop that was to be using this newly open-sourced Apache Cassandra. When the FUD started flying, I had the unfortunate circumstance to decide: should I continue to be part of this community that Andy so accurately portrays as led by a bunch of asshats, or should I move on to a community of professionals that care about technology and big data, rather than self image?

I guess in hindsight the decision wasn't so hard.

> At this point I'd like to go back to ignoring Cassandra, and hopefully will not have occasion to deal with "Cassandra vs. HBase" again for many months.

Hear hear!

--
Tim Ellis
Data Architect, Riot Games
-
Re: HBase and Cassandra on StackOverflow Jeremy Hanna 2011-09-05, 19:41
> From: Michael Segel <[EMAIL PROTECTED]>
> Can't we just all get along? :-)
>
> My personal introduction to Cassandra came maybe in the 2009 timeframe. We evaluated it and HBase at the time and chose HBase. No point to discuss why, the world has changed many times over.
>
> From there, my involvement in the HBase project grew and I didn't think of or hear about Cassandra for a long time.
>
> Then began an aggressive marketing campaign by Cassandra proponents that spoke negatively about HBase at every opportunity. It was everywhere whether one cared about such things or not. There was also an untrue (but easy to fudge with "marketing" given the technology differences are complex and nuanced) and quite insulting assertion that Cassandra is a superset of HBase. I believe this persists even today.
>
> So, no, frankly we cannot get along. If this were a passionless technical discussion that would not be the case. However, from my perspective, which I believe is shared by others on the HBase side, the Cassandra project is run by asshats and some of their boosters share that unfortunate trait.

Both projects are the result of a lot of work. There are misunderstandings. Hopefully with more open discussion, fewer accusations, and more background understanding, the level of vitriol can settle down and both projects can learn from each other. I really appreciate people helping me understand, and I've been learning more about HBase through the Apache online book recently. I hope that those with negative experiences with past versions of Cassandra can accept that the project may have progressed, as HBase has also progressed. I don't think one project will succeed at the complete expense of the other.

I know abrasive people in both projects - I don't think either one has a monopoly there :-). I'm impressed by how far both have come in the last year or two though.

Cheers,

Jeremy
-
RE: HBase and Cassandra on StackOverflow Michael Segel 2011-09-05, 22:06
Ok... You can look at it this way...

You have Cassandra that hasn't gotten a lot of traction, nor has it created enough of a critical mass to be considered long-term viable... unlike Hadoop/HBase, which has enough critical mass and is viable long term.

So do you get into a tit-for-tat battle, or do you say 'whatever' and move on?

The point is that Hadoop is on everyone's radar while Cassandra isn't. (Cassandra is just another NoSQL database, not a framework.) So to make Cassandra relevant, it picks a fight with the biggest kid on the block. (Ok, not the biggest kid, but you get the idea.)

I think it would be a good idea to just ignore Cassandra. The Hadoop community has nothing to gain in getting into it with someone making unsubstantiated comments.

JMHO...

-Mike

> Date: Thu, 1 Sep 2011 19:27:36 -0700
> From: [EMAIL PROTECTED]
> Subject: Re: HBase and Cassandra on StackOverflow
> To: [EMAIL PROTECTED]
>
> > From: Michael Segel <[EMAIL PROTECTED]>
>
> > Can't we just all get along? :-)
>
> My personal introduction to Cassandra came maybe in the 2009 timeframe. We evaluated it and HBase at the time and chose HBase. No point to discuss why, the world has changed many times over.
>
> From there, my involvement in the HBase project grew and I didn't think of or hear about Cassandra for a long time.
>
> Then began an aggressive marketing campaign by Cassandra proponents that spoke negatively about HBase at every opportunity. It was everywhere whether one cared about such things or not. There was also an untrue (but easy to fudge with "marketing" given the technology differences are complex and nuanced) and quite insulting assertion that Cassandra is a superset of HBase. I believe this persists even today.
>
> So, no, frankly we cannot get along. If this were a passionless technical discussion that would not be the case. However, from my perspective, which I believe is shared by others on the HBase side, the Cassandra project is run by asshats and some of their boosters share that unfortunate trait.
>
> At this point I'd like to go back to ignoring Cassandra, and hopefully will not have occasion to deal with "Cassandra vs. HBase" again for many months.
>
> Best regards,
>
> - Andy