Re: HBase & BigTable + History: Can it run decently on a 512MB machine? What's the difference between the two?
HBase is an open source project, so you can read the source code and make that determination for yourself. It was originally built on the ideas in the Bigtable paper (published by Google), but it shares only the design goals and philosophy, not the actual implementation.
BigTable, conversely, is a proprietary system designed and run by Google. They don't share the source code, nor do they license it outside of Google in any way. So if you want an actual comparison, you'll have to go work at Google. :)
I don't think there's anyone claiming that HBase = Bigtable; simply that it's based on the same ideas, and is intended as an open source implementation of the same concept.
On Mar 5, 2012, at 6:28 PM, D S wrote:
Simple, I want to see what is meant by the claim that HBase = Big Table.
How far does this claim go?
How identical are the two products? Does it stop at
the frontend specifications? Does it go into the internals? I just want to
know how identical these two products are and how different are the two.
If I took the current build of HBase and had a time machine and installed
it in all those circa 2003 Google servers (and not one server more), would
I end up with something similar to what Google had back then?
Is there anyone on this mailing list who has any experience w/ BigTable?
On Mon, Mar 5, 2012 at 5:12 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
This is a hypothetical question. Why do you care?
Can you run current Windows on '03 machines? Or Linux (with KDE/Gnome)?
HBase is designed for modern machines.
From: D S <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Monday, March 5, 2012 11:39 AM
Subject: Re: HBase & BigTable + History: Can it run decently on a 512MB
machine? What's the difference between the two?
On 3/5/12, Michael Drzal <[EMAIL PROTECTED]> wrote:
You really need to consider the entire historical context here. A lot of
the memory used in HBase goes to buffering writes to disk and to the block
cache. These days, it isn't unreasonable to get 12 2-3TB disks in a
commodity server. Back in 2003, you would not get as many disks, and they
would be much smaller. One way to think about it is the ratio of memory to
disk space or, more operationally, what your cache hit ratio is and how busy
your disk drives are.
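
As a rough illustration of the ratio mentioned above (read here as RAM relative to raw disk space per server), the arithmetic can be sketched as follows. This is only a back-of-the-envelope sketch: the function name is made up for this example, the 12 x 2TB disks and the 1GB of RAM come from this thread, and the 24GB of RAM and the two 80GB disks are assumed placeholder figures, not measurements of any real cluster.

```python
def ram_gb_per_disk_tb(ram_gb: float, disks: int, disk_tb: float) -> float:
    """Return GB of RAM available per TB of raw disk on one server."""
    return ram_gb / (disks * disk_tb)


# 2012-style node: 12 disks of 2 TB each (from the thread);
# 24 GB of RAM is an ASSUMED figure for illustration.
modern = ram_gb_per_disk_tb(ram_gb=24, disks=12, disk_tb=2.0)

# 2003-style node: 1 GB of RAM (from the thread);
# 2 disks of 0.08 TB (80 GB) each is an ASSUMED figure for illustration.
old = ram_gb_per_disk_tb(ram_gb=1, disks=2, disk_tb=0.08)

print(f"2012-style node: {modern:.2f} GB RAM per TB of disk")
print(f"2003-style node: {old:.2f} GB RAM per TB of disk")
```

With these placeholder numbers the older machine actually has more RAM per TB of disk, which is why the absolute amount of RAM alone says little. Swap in your own hardware figures to compare.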
On Mon, Mar 5, 2012 at 3:25 AM, D S <[EMAIL PROTECTED]> wrote:
I'm learning more about HBase and I'm curious how much of HBase is
actually based on Google's original DB. In Google's origin stories,
they are well known for using low-cost commodity hardware at scale in
order to store their web database.
Almost every blog I read about HBase tells me it's a clone of
BigTable. Almost every blog I've read about HBase also tells me to
use a lot of RAM - gigabytes worth. Some even tell me not to even
consider HBase with less than 4GB of RAM.
If I remember my history correctly, a commodity machine in the year
2003 had around 512MB to 1GB of RAM in it. The fancier ones had 2GB.
From everything I've read, running HBase on such machines is a very
bad idea, yet these were the machines readily available in the year 2003
when Google started its growth.
I'm confused at the moment. Can someone give me a bit of background
about how HBase performance is handled at the "low" end, which was
considered the "high" end back then? Should I assume that HBase is just a
clone of BigTable? What is HBase's history? Are the blogs wrong?
Thanks for any clarification anyone can give.
Are HBase's configuration options robust enough that it could go back
and run well on those 2003 specs with a bit of tweaking if that what was