We also run CDH on EC2 and have around 80 Hadoop/HBase nodes deployed across a few different clusters. We use Puppet for package management and Fabric scripts for pushing configs and managing services.
Our base AMI is a pretty bare CentOS 6 install, and Puppet handles most of the rest once an instance spins up. Puppet also worked fine for managing configs until we had many clusters with different setups. At that point we moved config management to Fabric.
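A minimal sketch of the per-cluster config idea, assuming each cluster differs only in a handful of HBase properties. The cluster names, hostnames, and the render_site_xml helper are hypothetical and not taken from the setup described here; only the standard library is used, so the step that pushes the file to the nodes is left out.

```python
import xml.etree.ElementTree as ET

# Settings shared by every cluster (illustrative value).
COMMON = {"hbase.cluster.distributed": "true"}

# Hypothetical per-cluster overrides; names and hosts are made up.
CLUSTERS = {
    "analytics": {
        "hbase.rootdir": "hdfs://analytics-nn:8020/hbase",
        "hbase.zookeeper.quorum": "analytics-zk1,analytics-zk2,analytics-zk3",
    },
    "serving": {
        "hbase.rootdir": "hdfs://serving-nn:8020/hbase",
        "hbase.zookeeper.quorum": "serving-zk1,serving-zk2,serving-zk3",
    },
}


def render_site_xml(cluster):
    """Return hbase-site.xml content for one cluster as a string."""
    # Start from the shared settings, then apply the cluster's overrides.
    props = dict(COMMON)
    props.update(CLUSTERS[cluster])

    # Hadoop-style config: <configuration><property><name/><value/>...
    root = ET.Element("configuration")
    for name in sorted(props):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = props[name]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_site_xml("analytics"))
```

The rendered string could then be written to a file and uploaded to each node in the cluster by whatever deployment tool is in use, followed by a service restart.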
There is certainly an investment required for setting this stuff up initially, but it pays off as you continually need to spin up replacements or new nodes. We can do that with only a couple minutes of work at this point.
On Apr 26, 2012, at 1:12 AM, Something Something <[EMAIL PROTECTED]> wrote:
> We have a Hadoop cluster running on EC2 with Cloudera's
> hadoop-0.20.2-cdh3u2 distribution. We are now ready to install HBase on
> it. Trying to figure out what's the best way to accomplish this.
> We have quite a few machines in the cluster, so installing HBase on each
> machine would be time consuming. But if that's the only way, we can do it
> by creating our own RPMs. Is this document the best resource:
> Are there ec2 scripts that work with Cloudera's distribution to make this
> process easier?
> Please help. Thanks.