Varun Sharma 2012-10-30, 17:41
Re: Hbase cluster for serving real time site traffic
My 2¢.

1) You need an odd number of ZooKeeper nodes, so 3 is the minimum
recommended for production.
2) Yes, you have Master and SecondaryMaster, and it's also recommended
to have one of each. And the master is critical. If you lose
it, you lose your cluster.
3) NameNode is Hadoop, not HBase. You should follow the Hadoop
recommendations. Just as you have a secondarymaster, you have a
secondarynamenode. So I think you should have as many
secondarynamenodes as you have secondarymasters (on the same machine?).
4) I'm not sure I understand this question. Keys are binary: arrays
of bytes. So 32 0-1 values is a 4-byte array. It's not a lot.
This will only give you 2^32 different rows. You will have to
pre-split them, or you will end up with almost all of them on the same
regionserver?
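
To make the pre-split advice in point 4 concrete, here is a minimal sketch of how evenly spaced split keys could be computed for a fixed-width binary key space. It assumes the row keys are spread roughly uniformly over that space (for example, hashed keys). The function name `split_points` and its parameters are made up for this illustration and are not part of HBase.

```python
from typing import List


def split_points(num_regions: int, key_len: int = 4) -> List[bytes]:
    """Return num_regions - 1 evenly spaced split keys.

    Hypothetical helper: assumes keys are key_len bytes long and
    uniformly distributed over the whole key space.
    """
    if num_regions < 1:
        raise ValueError("num_regions must be at least 1")
    space = 1 << (8 * key_len)  # number of distinct keys, e.g. 2^32 for 4 bytes
    return [
        (space * i // num_regions).to_bytes(key_len, "big")
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    # One region per region server in a 10-node cluster gives 9 boundaries.
    for key in split_points(10):
        print(key.hex())
```

The resulting byte arrays would then be supplied as split keys when the table is created, so each region server starts with its own slice of the key space. If the real keys are not uniformly distributed, the boundaries would have to be chosen from the actual key distribution instead.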

JM
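
On point 1, the reason for an odd ZooKeeper count comes from majority quorums: the ensemble stays available only while more than half of its nodes are up. The sketch below is just that arithmetic, with helper names invented for this example.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes can fail while a majority is still alive."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, "nodes: quorum", quorum_size(n),
              "tolerates", tolerated_failures(n), "failure(s)")
```

Running it shows that 4 nodes tolerate no more failures than 3, and 6 no more than 5, so an even-sized ensemble adds a machine without adding fault tolerance.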

2012/10/30, Varun Sharma <[EMAIL PROTECTED]>:
> Hi,
>
> We are planning to experiment with a cluster for serving production traffic
> using hbase for pinterest. We are starting off with a 10 region server + 1
> master cluster on Amazon EMR version 0.92. I had some very naive questions
> (primarily around points of failure):
>
> 1) It seems hbase starts only one zookeeper on the master node - which is
> critical for operation - how many zookeepers should I use and can I run
> those on the region servers ?
> 2) How many masters to use - does hbase support multiple masters (primary
> and secondary) within the same cluster ? From my understanding, master
> availability is not critical for operation.
> 3) NameNode - We are running hadoop 0.8 - I have read that NameNode is a
> single point of failure and we should really be running two name node(s) so
> we can failover. Is it fine to run these on the region servers ?
> 4) Our current application involves long row/column - 24-32 bytes with 0-1
> bytes of values. Should we be using a different key encoding than the
> default encoding ? What advantages could it buy us ?
>
> We are currently using amazon EMR for testing purposes which runs hbase
> 0.92. If it works well, we would like to configure our own cluster with
> probably the latest version of hbase which appears to be 0.94 at the
> moment.
>
> Thanks
> Varun
>
Varun Sharma 2012-10-30, 18:03
Marcos Ortiz 2012-10-30, 20:20
Varun Sharma 2012-11-01, 08:01
Jeremy Carroll 2012-11-01, 16:31
Marcos Ortiz Valmaseda 2012-11-01, 11:17
Leonid Fedotov 2012-11-01, 17:09
Patrick Angeles 2012-11-01, 19:11
Patrick Angeles 2012-11-01, 19:20
Stack 2012-11-01, 18:59
Kevin Odell 2012-10-30, 19:15
Kevin Odell 2012-10-30, 19:16