Re: HBase Stack

To add to what Joey said, consider that there are very significant trade-offs you make when building something on HBase (or any of the new generation of non-relational databases). For starters, you don't get:

 - A declarative query language like SQL that can build optimal physical access plans from arbitrarily complex logical queries
 - Built-in secondary indexing (so if you want to look things up by something other than the row key, you can't do it without a full table scan, unless you build and maintain your own index tables)
 - Multi-row or multi-object transactions (so you can't just "roll back" a set of changes like you can in a relational database, nor can you keep operations isolated from other concurrent readers until they commit)
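
To make the secondary-indexing point concrete, here is a minimal sketch in Python. It does not use the HBase client API; `ToyTable` is a hypothetical in-memory stand-in for a table sorted by row key, and the `users` / `email_index` tables and the `email` column are names made up for illustration. It contrasts a lookup by a non-key column (a full scan) with a lookup through an index table that the application has to maintain itself.

```python
from bisect import bisect_left


class ToyTable:
    """Hypothetical stand-in for a table sorted by row key (not the HBase API)."""

    def __init__(self):
        self._keys = []   # row keys, kept in sorted order
        self._rows = {}   # row key -> {column: value}

    def put(self, key, row):
        if key not in self._rows:
            self._keys.insert(bisect_left(self._keys, key), key)
        self._rows[key] = dict(row)

    def get(self, key):
        # Lookup by row key: the only cheap point lookup available.
        return self._rows.get(key)

    def scan(self):
        # Visit every row in row-key order.
        for key in self._keys:
            yield key, self._rows[key]


def find_by_email_full_scan(users, email):
    # No index on "email", so every row has to be inspected.
    return [key for key, row in users.scan() if row.get("email") == email]


def put_user(users, email_index, user_id, row):
    # The application writes the data row AND the index row itself.
    # These are two separate writes; without multi-row transactions,
    # a failure between them leaves the index out of sync.
    users.put(user_id, row)
    email_index.put(row["email"], {"user_id": user_id})


def find_by_email_indexed(email_index, email):
    # One row-key lookup in the hand-maintained index table.
    entry = email_index.get(email)
    return None if entry is None else entry["user_id"]
```

Note that the index only helps for the one column you decided to index in advance, which is the same "know your access patterns up front" constraint in miniature.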

Scalable data storage systems like HBase may eventually make up for these deficiencies, but that hasn't happened yet. Today, using HBase is only appropriate if you have a really large amount of data and you can predict and design for pretty much all of your access patterns up front.
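
As a rough illustration of what "design for your access patterns up front" tends to mean in practice, the sketch below assumes a single known query ("all events for one user, in time order") and bakes it into a composite row key. The `user_id#timestamp` key layout, `make_key`, and `prefix_scan` are hypothetical names for this example, and the sorted Python list only stands in for a table ordered by row key.

```python
from bisect import bisect_left


def make_key(user_id, timestamp):
    # Zero-pad the timestamp so that lexicographic key order
    # matches numeric time order within one user.
    return f"{user_id}#{timestamp:010d}"


def prefix_scan(sorted_keys, prefix):
    # Jump to the first key >= prefix, then read forward until the
    # prefix no longer matches. Only the matching range is touched.
    start = bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        matches.append(key)
    return matches


keys = sorted(
    make_key(user, ts)
    for user, ts in [("alice", 30), ("bob", 10), ("alice", 5)]
)
alice_events = prefix_scan(keys, "alice#")
```

This key serves the query it was designed for with a short range scan. A different question, such as "all events in a given hour across all users", gets no help from it and falls back to scanning everything.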


On Nov 14, 2011, at 8:14 AM, Joey Echeverria wrote:

> I don't think I would try to use a single-node HBase cluster to
> replace a MySQL database. HBase has a sweet spot, both in terms of
> scale and data access patterns. In general, it should not be viewed as
> a drop-in replacement for MySQL. My questions to you would be:
> 1) How much data do you need to store?
> 2) What are your access patterns? Lots of joins, individual row
> lookups, range scans, etc.
> -Joey
> On Mon, Nov 14, 2011 at 1:54 AM, Em <[EMAIL PROTECTED]> wrote:
>> Hello list,
>> I was asked whether it is a good idea to replace the M in LAMP with
>> HBase as well as the P with Java servlets (i.e. Tomcat) so that you run
>> your web server, your HBase instance, Hadoop etc. on the same machine.
>> Are the performance differences compared to a LAMP stack large?
>> It is clear that a lot of benefits like redundancy etc. are not
>> available in this setup. However, if the idea and user base grow, you can
>> quickly add these features to the environment by just setting up new
>> machines and connecting them with each other.
>> When I was asked about this I had no answer.
>> Hopefully you can bring light into this!
>> Kind regards,
>> Em
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434