If you want to serve some application off hbase, you might be better
off with a separate cluster so you don't mix workloads with the MR
What kind of graph db are you looking to build? There is work being
done on that front and we would like to know about your use case...
On 4/25/10, Aaron McCurry <[EMAIL PROTECTED]> wrote:
> I have been a fan of hbase for a while, but until now I haven't had any extra
> hardware to set up and run an instance. Now I'm trying to decide what the
> ideal setup would be.
> I have a 64 node hadoop/hive setup; each node has dual quad-core processors
> with 32 GB of RAM and 4 TB of storage. My options are to run a 64-way
> hbase setup on those nodes, or possibly to run hbase on a separate set of
> up to 16 machines of the same type that would be used only for
> hbase. I'm leaning toward running hbase on the 64-way cluster with hadoop,
> because I'm going to be using hbase in some map reduce jobs and for the
> What I'm planning on doing with the cluster:
> - Migrate some large Berkeley DB databases to hbase (15-20 billion records)
> - Mix some live data from hbase with some batch processing in hive (small
> amount of data)
> - Build a large graph db on top of hbase (size unknown, billions at
> - Probably a lot more things as time goes along
> Thoughts and opinions welcome. Thanks!
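For a rough sense of scale, here is a back-of-envelope sketch of the two layouts described in the quoted message. It is only arithmetic on the stated hardware figures. The replication factor of 3 is an assumption (it is the HDFS default, but the thread does not say what this cluster uses), and the function name cluster_summary is made up for illustration.

```python
# Back-of-envelope comparison of the two layouts from the quoted message.
# Hardware figures come from the message: 4 TB disk and 32 GB RAM per node.
TB_PER_NODE = 4
RAM_GB_PER_NODE = 32
# ASSUMPTION: HDFS replication factor of 3 (the HDFS default); not stated in the thread.
REPLICATION = 3


def cluster_summary(nodes, tb_per_node=TB_PER_NODE,
                    ram_gb_per_node=RAM_GB_PER_NODE,
                    replication=REPLICATION):
    """Hypothetical helper: aggregate disk and RAM for a cluster of `nodes` machines."""
    raw_tb = nodes * tb_per_node
    return {
        "nodes": nodes,
        "raw_tb": raw_tb,
        # Each block is stored `replication` times, so unique data capacity shrinks.
        "usable_tb": raw_tb / replication,
        "total_ram_gb": nodes * ram_gb_per_node,
    }


shared = cluster_summary(64)     # hbase co-located with hadoop/hive
dedicated = cluster_summary(16)  # hbase on its own machines

for s in (shared, dedicated):
    print(f"{s['nodes']} nodes: {s['raw_tb']} TB raw, "
          f"{s['usable_tb']:.1f} TB after replication, "
          f"{s['total_ram_gb']} GB RAM")
```

Keep in mind that the 64-node figure is shared with the existing hadoop/hive data, so the space actually available to hbase there would be lower than the number printed.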
Computer Science Graduate Student
University of California, Santa Cruz