Re: Hbase cluster for serving real time site traffic
Sorry, I also forgot: do not run your NN (NameNode) and its failover node
alongside other services.

On Tue, Oct 30, 2012 at 2:15 PM, Kevin O'dell <[EMAIL PROTECTED]> wrote:

> Varun,
>
>   I will take a shot at answering this:
>
> 1) It seems hbase starts only one zookeeper on the master node - which is
> critical for operation - how many zookeepers should I use and can I run
> those on the region servers ? <-- 3, and they should be on dedicated
> servers for a real production environment.
>
> 2) How many masters to use - does hbase support multiple masters (primary
> and secondary) within the same cluster ? From my understanding, master
> availability is not critical for operation. <-- 2. If you lose the Master,
> you lose HBase.  The Master is VERY critical.
>
> 3) NameNode - We are running hadoop 0.8 - I have read that NameNode is a
> single point of failure and we should really be running two name node(s) so
> we can failover. Is it fine to run these on the region servers ? <-- 2, and
> you will want to use HA for a real production workload.  The SNN (Secondary
> NameNode) is a very misleading name.
>
> So, yes, secondary NameNode is probably more critical than the secondary
> master - since the master is only responsible for metadata changes/region
> splits/table creation etc and not for writes/reads. <-- This is not
> correct.  The Secondary NameNode is not a failover node.  You will want to
> use a release that has HA to guarantee availability at the NN level.  The
> Master is in charge of metadata operations, but without the Master the RS
> (region servers) will not simply keep working.  It is very important to
> have two masters.
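
To illustrate the ZooKeeper count in point 1: a ZooKeeper ensemble keeps working only while a majority of its servers are up, which is why 3 is the usual minimum and why an even count adds no extra failure tolerance. A minimal sketch of that arithmetic (the function names are mine, not a ZooKeeper API):

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare ensemble sizes 1 through 5.
    for n in range(1, 6):
        print(f"servers={n} quorum={quorum_size(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

Going from 3 to 4 servers still tolerates only one failure, while 5 tolerates two, so the odd sizes are the ones worth paying for.
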
>
>  I will defer to Jean-Marc on the schema design.
>
>
>
> On Tue, Oct 30, 2012 at 1:03 PM, Varun Sharma <[EMAIL PROTECTED]> wrote:
>
>> Thanks for the tips.
>>
>> So, yes, secondary NameNode is probably more critical than the secondary
>> master - since the master is only responsible for metadata changes/region
>> splits/table creation etc and not for writes/reads.
>>
>> Regarding the keys question - I meant that the (row + column) length is
>> 24-32 bytes and the value length is 0-1 bytes. Currently, we have a
>> cluster running with all the data loaded into HBase, but it all runs with
>> default settings.
>>
>> Thanks
>> Varun
>>
>> On Tue, Oct 30, 2012 at 10:53 AM, Jean-Marc Spaggiari <
>> [EMAIL PROTECTED]> wrote:
>>
>> > My 2¢.
>> >
>> > 1) You need an odd number of ZooKeeper nodes. So 3 is the minimum
>> > recommended for production.
>> > 2) Yes, you have Master and SecondaryMaster. And it's also recommended
>> > to have one of each. And the master is critical. If you lose it, you
>> > lose your cluster.
>> > 3) NameNode is Hadoop, not HBase. You should follow the Hadoop
>> > recommendations. Just as you have a secondarymaster, you have a
>> > secondarynamenode. So I think you should have as many
>> > secondarynamenodes as you have secondarymasters (on the same machine?).
>> > 4) I'm not sure I understand this question. Keys are binary: arrays
>> > of bytes. So 32 0-1 bytes is a 3 bytes long array. It's not a lot.
>> > This will only give you 2^32 different rows. You will have to
>> > pre-split them, or you will end up with almost all of them on the same
>> > regionserver?
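
On pre-splitting: a rough sketch of how evenly spaced split points could be computed, assuming row keys are spread uniformly over their leading bytes (for example, because they are hashed). That is an assumption, not something stated in this thread, and `even_split_points` is a hypothetical helper, not an HBase API:

```python
from typing import List


def even_split_points(num_regions: int, prefix_bytes: int = 4) -> List[bytes]:
    """Return num_regions - 1 evenly spaced split keys.

    Assumes (hypothetically) that row keys are uniformly distributed
    over their first `prefix_bytes` bytes.
    """
    if num_regions < 2:
        return []  # a single region needs no split points
    key_space = 256 ** prefix_bytes
    return [
        (i * key_space // num_regions).to_bytes(prefix_bytes, "big")
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    # One region per region server for a 10-node cluster.
    for key in even_split_points(10):
        print(key.hex())
```

For 10 regions this yields 9 split keys. If the real keys are not uniform (sequential IDs, timestamps), evenly spaced points will not balance the load, and the split points would need to come from the actual key distribution instead.
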
>> >
>> > JM
>> >
>> > 2012/10/30, Varun Sharma <[EMAIL PROTECTED]>:
>> > > Hi,
>> > >
>> > > We are planning to experiment with a cluster for serving production
>> > > traffic using hbase for pinterest. We are starting off with a 10
>> > > region server + 1 master cluster on Amazon EMR version 0.92. I had
>> > > some very naive questions (primarily around points of failure):
>> > >
>> > > 1) It seems hbase starts only one zookeeper on the master node -
>> > > which is critical for operation - how many zookeepers should I use
>> > > and can I run those on the region servers ?
>> > > 2) How many masters to use - does hbase support multiple masters
>> > > (primary and secondary) within the same cluster ? From my
>> > > understanding, master

Kevin O'Dell
Customer Operations Engineer, Cloudera