

Re: hbase deployment using VMs for data nodes and SAN for data storage
Lars,

I think we need to clarify what we mean by a SAN.
It's possible to have a SAN where the disks appear as attached storage, while the traditional view is that the disks are detached.

There are some design considerations, like cluster density, where one would want to use a SAN like NetApp to effectively create a storage half of the cluster and a compute half that requires a fraction of the space and energy of a cluster built from commodity hardware.

When we start to see clusters at PB scale, we have to consider the cost of operating them in terms of both energy efficiency and physical footprint in a data center.

HBase can run in such configurations with the right tuning.

I for one would love to have a data center where I can drop in different configurations and tune and validate cluster designs, but alas that's something only a MapR, Cloudera, or Hortonworks can do, since they have the deep pockets and the need to actually work through this for their customers.

On Oct 15, 2012, at 11:43 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> If you have a SAN, why would you want to use HBase?
>
> -- Lars
>
> ________________________________
> From: "Pamecha, Abhishek" <[EMAIL PROTECTED]>
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Sent: Monday, October 15, 2012 3:00 PM
> Subject: hbase deployment using VMs for data nodes and SAN for data storage
>
> Hi
>
> We are deciding between using local disks on bare-metal hosts vs. VMs using a SAN for data storage. I was wondering if anyone has compared performance, availability, and scalability between these two options?
>
> IMO, this is kinda similar to a typical AWS or other cloud deployment.
>
> Thanks,
> Abhishek
>