Re: Question about HBase for OLTP

1) Eventual Consistency isn't a problem here.  HBase is a strict
consistency system.  Maybe you have us confused with other Dynamo-based
Open Source projects?
2) MySQL and other traditional RDBMS systems are definitely a lot more
solid, well-tested, and subtly tuned than HBase.  The vast majority (if
not all) of database systems developed in the past decade have this
problem.  HBase has 2 main advantages over a traditional RDBMS for OLTP
workloads:
  A. Large-scale workloads : Facebook Messages has a constantly growing set
of data that is 1PB+.  And we're growing at 250MB/month.  This is hard to
manage with a traditional RDBMS.  Logical database sharding is
extremely useful.
  B. Write-dominated workloads : Examples like time-series databases, user
analytics, etc. are very write-heavy.  An LSMT (log-structured merge-tree)
approach is architecturally better than a B-tree approach for these.  Having
done system testing internally, we already see an IOPS advantage with HBase
over MySQL in writes.
3) A big question is what you need out of a database system.  Most web
companies are worried about the 'large-scale workloads' problem if their
site becomes popular, so a working familiarity with a distributed database
system for less mission-critical applications is worthwhile even if the
performance and reliability aren't there yet.
4) If you have any mission-critical data, you really should think about a
disaster recovery plan outside of HBase, which is not as critical with a
traditional RDBMS.  Facebook Messages ends up using Scribe as a backup
mechanism.  We are currently working on HBase Snapshots to allow disaster
recovery with HBase alone, but you shouldn't bet on it being
completed within your timeframe.
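
To illustrate point 2B, here is a minimal Python sketch of the log-structured merge idea and why it favors writes. It is a toy under stated assumptions, not HBase's implementation: the class name TinyLSM, the memtable_limit parameter, and the in-memory "segments" standing in for on-disk files are all made up for illustration.

```python
import bisect


class TinyLSM:
    """Toy log-structured merge store (illustrative only, not HBase code)."""

    def __init__(self, memtable_limit=3):
        self.memtable = {}      # in-memory buffer holding the most recent writes
        self.segments = []      # immutable sorted runs, oldest first (stand-in for files)
        self.memtable_limit = memtable_limit
        self.flushes = 0

    def put(self, key, value):
        # A write only touches memory; nothing already flushed is updated in place.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # One sequential write of a sorted run, instead of many random page updates.
        self.segments.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.flushes += 1

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Newer runs shadow older ones, so search from the newest run backwards.
        for segment in reversed(self.segments):
            i = bisect.bisect_left(segment, (key,))
            if i < len(segment) and segment[i][0] == key:
                return segment[i][1]
        return None

    def compact(self):
        # Merge all runs into one; later (newer) values overwrite earlier ones.
        merged = {}
        for segment in self.segments:
            merged.update(segment)
        self.segments = [sorted(merged.items())] if merged else []


store = TinyLSM(memtable_limit=2)
store.put("a", 1)
store.put("b", 2)   # second write fills the memtable and triggers a flush
store.put("a", 3)   # overwrite is just another in-memory write
store.put("c", 4)   # triggers a second flush
```

The trade-off is on the read side: a get may have to check the memtable and several runs until a compaction merges them, which is part of why the advantage shows up in write-dominated workloads specifically.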
On 1/9/12 2:31 PM, "Michael Segel" <[EMAIL PROTECTED]> wrote:

>Just my $0.02 worth of 'expertise'...
>1) Just because you can do something doesn't mean you should.
>2) One should always try to use the right tool for the job regardless of
>your 'fashion sense'.
>3) Just because someone says "Facebook or Yahoo! does X", doesn't mean
>it's a good idea, or it's the right choice for you and your team.
>Having said that...
>Yes, you can use HBase to handle OLTP queries. However, you do not have
>transactional capabilities built in, so you will have to manage
>them within your application.
>Not really an easy task when you think about it.  It really depends on
>what you want to do with your OLTP system. Hotel reservation systems are
>not really a good idea....
>There are some inherent problems with HBase in an OLTP environment.
>1) Eventual consistency. You can google the CAP theorem and you'll see
>why this is an issue.
>2) Lack of transaction support. Note: Row Level Locking that is in HBase
>has nothing to do with Row Level Locking with respect to transactional
>3) HBase size and scale vs RDBMS. For OLTP, RDBMS is the best tool for
>the job. So why do you want to use HBase over what one could call the
>'de facto' standard?
>The point here on #3 is that the normal tool of choice is an RDBMS. So
>you really, really need to justify why you're not going with this. I mean
>there could be a valid reason, but in most cases no.
>Where dhruba indicates that HBase is a pure transaction system, and does
>support OLTP workloads... absolutely Not!
>So what I suggest is that if you want to do OLTP in HBase, the first
>thing you have to do is to prove that you can't solve the problem in an
>Having said all that... I'm going to shut up now... ;-)
>> Date: Mon, 9 Jan 2012 10:55:45 -0800
>> Subject: Re: Question about HBase for OLTP
>> > I know HBase is designed for OLAP, query intensive type of
>> That is not entirely true. HBase is a pure transaction system and does
>> workloads for us. We probably more than 2 millions ops/sec for one of
>> application, details here:
>> https://www.facebook.com/note.php?note_id=454991608919