Re: HA / failover
Sorry Jun that it took me so long to reply.
There's still one thing I don't get:

>> There is one offset per topic/partition. If a partition is not available
because a broker is down, its offset in the consumer won't grow anymore.

So, because I want HA, I set up 2 brokers to serve the same topic/partition, right?

Using the zk-producer, will msgs be sent to only one of those 2 brokers? Or will it balance randomly?

If one of those 2 brokers is down, will the producer start sending messages to the one that is still alive?
 
Example:
Start zk x 3, kafka x 2 (first run), 1 zk-producer, 1 zk-consumer

Produce msgs 1 & 2
Consume msg 1
kafka A fails -> consumer now reads kafka B
Produce msgs 3 & 4
Consume msgs 3,4
Kafka A is started
Consumer sees it, but won't ask for msg 2

Does that make sense?

PS: I'm trying to understand how LinkedIn manages HA with Sensei + Kafka... sorry!
----- Original Message -----
From: Jun Rao [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, August 30, 2011 03:46 PM
To: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
Subject: Re: HA / failover

See my inlined reply below.

Thanks,

Jun
On Tue, Aug 30, 2011 at 8:36 AM, Roman Garcia <[EMAIL PROTECTED]> wrote:

> >> Roman,
> Without replication, Kafka can lose messages permanently if the
> underlying storage system is damaged. Setting that aside, there are 2
> ways that you can achieve HA now. In either case, you need to set up a
> Kafka cluster with at least 2 brokers.
>
> Thanks for the clarification Jun. But even then, with replication, you
> could still lose messages, right?
>
>
If you do synchronous replication with replication factor >1 and there is
only 1 failure, you won't lose any messages.
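
As a rough illustration of that claim, here is a toy Python model of synchronous replication. It is only a sketch under stated assumptions, not Kafka's replication implementation: the `Replica` class and the `replicate_sync` and `surviving_messages` helpers are hypothetical names made up for this example. A write is acknowledged only after every live replica has appended it, so with replication factor 2 a single failure still leaves a full copy.

```python
class Replica:
    """Toy in-memory copy of one partition's log (hypothetical)."""
    def __init__(self):
        self.log = []
        self.alive = True


def replicate_sync(replicas, message):
    """Append to every live replica before acknowledging the producer."""
    live = [r for r in replicas if r.alive]
    if not live:
        raise RuntimeError("no live replica to accept the write")
    for r in live:
        r.log.append(message)
    return True  # the ack is returned only after all live replicas have the message


def surviving_messages(replicas):
    """Messages still readable from at least one live replica, in log order."""
    seen = []
    for r in replicas:
        if r.alive:
            for m in r.log:
                if m not in seen:
                    seen.append(m)
    return seen


replicas = [Replica(), Replica()]        # replication factor 2
for m in ["m1", "m2"]:
    replicate_sync(replicas, m)

replicas[0].alive = False                # one failure
after_one_failure = surviving_messages(replicas)

replicas[1].alive = False                # a second failure
after_two_failures = surviving_messages(replicas)
```

In this model everything acknowledged before the first failure is still readable afterwards; only a second, overlapping failure makes the messages unavailable.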
> >> [...] Unconsumed messages on that broker will not be available for
> consumption until the broker comes up again.
>
> How does a Consumer fetch those "old" messages, given that it has
> already fetched "new" messages at a higher offset? What am I missing?
>

There is one offset per topic/partition. If a partition is not available
because a broker is down, its offset in the consumer won't grow anymore.
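
The per-partition offset behaviour can be sketched with a toy Python model. This is not Kafka's consumer API: the `Partition` and `Consumer` classes, the `poll` method and the `"A-0"`/`"B-0"` keys are hypothetical, and the model assumes each broker hosts its own partition of the topic. The consumer keeps one offset per partition; while a broker is down that offset stays frozen, and reading resumes from it when the broker returns.

```python
class Partition:
    """Toy append-only log for one topic/partition on one broker (hypothetical)."""
    def __init__(self):
        self.log = []
        self.up = True


class Consumer:
    """Keeps one offset per partition."""
    def __init__(self, partitions):
        self.partitions = partitions
        self.offsets = {name: 0 for name in partitions}

    def poll(self, max_per_partition=None):
        """Read unread messages from every partition whose broker is up."""
        out = []
        for name, p in self.partitions.items():
            if not p.up:
                continue  # broker down: this partition's offset does not move
            end = len(p.log)
            if max_per_partition is not None:
                end = min(end, self.offsets[name] + max_per_partition)
            out.extend(p.log[self.offsets[name]:end])
            self.offsets[name] = end
        return out


a, b = Partition(), Partition()
consumer = Consumer({"A-0": a, "B-0": b})

a.log += ["msg1", "msg2"]                    # produce msgs 1 & 2 on broker A
first = consumer.poll(max_per_partition=1)   # consume msg 1 only

a.up = False                                 # broker A fails
b.log += ["msg3", "msg4"]                    # producer now writes to broker B
second = consumer.poll()                     # consume msgs 3 & 4

a.up = True                                  # broker A comes back
third = consumer.poll()                      # offset for A-0 is still 1
```

Under these assumptions the third poll returns msg 2: the higher offset reached on broker B's partition has no effect on the offset tracked for broker A's partition.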
>
> >> The second approach is to use the built-in ZK-based software load
> balancer in Kafka (by setting zk.connect in the producer config). In
> this case, we rely on ZK to detect broker failures.
>
> This is the approach I've tried. I did use zk.connect.
> I started all locally:
> - 2 Kafka brokers (broker id=0 & 1, single partition)
> - 3 zookeeper nodes (all of these on a single box) with different
> election ports and different fs paths/ids.
> - 5 producer threads sending <1k msgs
>
> Then I killed one of the Kafka brokers, and all my producer threads
> died.
>
>
That could be a bug. Are you using trunk? Any errors/exceptions in the log?
> What am I doing wrong?
>
>
> Thanks!
> Roman
>
>
> -----Original Message-----
> From: Jun Rao [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, August 30, 2011 11:44 AM
> To: [EMAIL PROTECTED]
> Subject: Re: HA / failover
>
> Roman,
>
> Without replication, Kafka can lose messages permanently if the
> underlying storage system is damaged. Setting that aside, there are 2
> ways that you can achieve HA now. In either case, you need to set up a
> Kafka cluster with at least 2 brokers.
>
> The first approach is to put the hosts of all Kafka brokers in a VIP and
> rely on a hardware load balancer to do health checks and routing. In this
> case, all producers send data through the VIP. If one of the brokers is
> down temporarily, the load balancer will direct the produce requests to
> the rest of the brokers. Unconsumed messages on that broker will not be
> available for consumption until the broker comes up again.
>
>  The second approach is to use the built-in ZK-based software load
> balancer in Kafka (by setting zk.connect in the producer config). In
> this case, we rely on ZK to detect broker failures.
>
> Thanks,
>
> Jun
>
> On Tue, Aug 30, 2011 at 7:18 AM, Roman Garcia <[EMAIL PROTECTED]>
> wrote:
>
> > Hi, I'm trying to figure out what my prod environment should look like,