Kafka >> mail # user >> Kafka mirroring and zookeeper


Re: Kafka mirroring and zookeeper
Just curious, but if I remember correctly from the time I read KAFKA-50 and
the related JIRA issues, you guys plan to implement sync AND async
replication, right?

--
Felix

On Tue, Apr 24, 2012 at 4:42 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:

> Right now we do sloppy failover. That is, when a broker goes down,
> traffic is redirected to the remaining machines, but any unconsumed
> messages are stuck on that server until it comes back. If it is
> permanently gone, the messages are lost. This is acceptable for us in
> the near term since our pipeline is pretty real-time, so the window
> between production and consumption is pretty small. The complete
> solution is the intra-cluster replication in KAFKA-50, which is coming
> along fairly nicely now that we are working on it.
>
> -Jay
>
> On Tue, Apr 24, 2012 at 12:21 PM, Oliver Krohne
> <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > indeed I thought it could be used as a failover approach.
> >
> > We use RAID for local redundancy, but it does not protect us in case of a
> > machine failure, so I am looking for a way to achieve a master/slave setup
> > until KAFKA-50 has been implemented.
> >
> > I think we can solve it for now by having multiple brokers so that the
> > application can continue sending messages if one broker goes down. My main
> > concern is to not introduce a new single point of failure which can stop
> > the application. However, as some consumers are not developed by us and it
> > is not clear how they store the offset in zookeeper, we need to find out how
> > we can manage the consumers in case a broker never returns after a failure.
> > It will be acceptable to lose a couple of messages if a broker dies and the
> > consumers have not consumed all messages at the point of failure.
> >
> > Thanks,
> > Oliver
> >
> >
> >
> >
> > On Apr 23, 2012, at 19:58, Jay Kreps wrote:
> >
> >> I think the confusion comes from the fact that we are using mirroring to
> >> handle geographic distribution, not failover. If I understand correctly,
> >> what Oliver is asking for is something to give fault tolerance, not
> >> something for distribution. I don't think that is really what the mirroring
> >> does out of the box, though technically I suppose you could just reset the
> >> offsets and point the consumer at the new cluster and have it start from
> >> "now".
> >>
> >> I think it would be helpful to document our use case in the mirroring docs
> >> since this is not the first time someone has asked about this.
> >>
> >> -Jay
> >>
> >> On Mon, Apr 23, 2012 at 10:38 AM, Joel Koshy <[EMAIL PROTECTED]> wrote:
> >>
> >>> Hi Oliver,
> >>>
> >>>> I was reading the mirroring guide and I wonder if it is required that
> >>>> the mirror runs its own zookeeper?
> >>>>
> >>>> We have a zookeeper cluster running which is used by different
> >>>> applications, so can we use that zookeeper cluster for the kafka source
> >>>> and kafka mirror?
> >>>>
> >>>
> >>> You could have a single zookeeper cluster and use different namespaces for
> >>> the source/target mirror. However, I don't think it is recommended to use a
> >>> remote zookeeper (if you have a cross-DC setup) since that would
> >>> potentially mean very high ZK latencies on one of your clusters.
> >>>
> >>>
> >>>> What is the procedure if the kafka source server fails to switch the
> >>>> applications to use the mirrored instance?
> >>>>
> >>>
> >>> I don't quite follow this question - can you clarify? The mirror cluster is
> >>> pretty much a separate instance. There is no built-in automatic fail-over
> >>> if your source cluster goes down.
> >>>
> >>>
> >>>> Are there any backup best practices if we would not use mirroring?
> >>>>
> >>>
> >>> You can use RAID arrays for (local) data redundancy. You may also be
> >>> interested in the (intra-DC) replication feature (KAFKA-50) that is
> >>> currently being developed. I believe some folks on this list have also used
> >>> plain rsync as an alternative to mirroring.
> >>>
>
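
Joel's suggestion of one zookeeper cluster with different namespaces for the source and the mirror could look roughly like the sketch below. It assumes the namespaces are expressed as ZooKeeper chroot suffixes on the connection string, which is a guess at what was meant rather than something stated in the thread. The host names, the namespace names, and the `chrooted_connect_string` helper are all hypothetical, and the `zk.connect` property name in the output is likewise an assumption about the broker configuration.

```python
def chrooted_connect_string(hosts, namespace):
    """Join ZooKeeper host:port pairs and append a chroot path.

    Hypothetical helper for illustration only; it is not part of Kafka
    or ZooKeeper.
    """
    if not hosts:
        raise ValueError("at least one ZooKeeper host is required")
    path = "/" + namespace.strip("/")
    if path == "/":
        raise ValueError("namespace must not be empty")
    # The chroot path is appended once, after the last host:port pair.
    return ",".join(hosts) + path


# Hypothetical shared ZooKeeper ensemble used by both Kafka clusters.
ZK_HOSTS = [
    "zk1.example.com:2181",
    "zk2.example.com:2181",
    "zk3.example.com:2181",
]

# Assumed namespace names; each cluster keeps its state under its own path.
source_zk = chrooted_connect_string(ZK_HOSTS, "kafka-source")
mirror_zk = chrooted_connect_string(ZK_HOSTS, "kafka-mirror")

print("source zk.connect =", source_zk)
print("mirror zk.connect =", mirror_zk)
```

With separate paths the two clusters should not see each other's broker or consumer entries even though they share the ensemble. Joel's caveat still applies: in a cross-DC setup one of the clusters would be talking to a remote zookeeper, with the latency that implies.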