Re: Kafka/ZK Cluster Example
As I understand it, you cannot use a mirrored Kafka cluster as a hot
fail-over.

You could probably use it as a manual fail-over, but I don't know the
complexity involved in doing that.

Also, if your source cluster fails while producers are putting data into
it, there will be an "unconsumed window" of data that is lost. This is the
data that the embedded consumer in the mirrored cluster had not yet
consumed from the source cluster when it failed.

All in all, the mirrored cluster is akin to asynchronous replication,
without any hot fail-over capability. Thus, it provides data redundancy
(outside of the unconsumed window described above) but no extra
availability (unless you count manual interventions).
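
To make the "unconsumed window" concrete, here is a toy sketch in Python.
It is not Kafka code: SourceCluster, MirrorCluster, poll and
unconsumed_window are hypothetical names invented for illustration, and the
mirror is modelled, as an assumption, as a consumer that copies messages
from the source in batches and lags behind the producers.

```python
class SourceCluster:
    """Hypothetical stand-in for the source cluster: an append-only log."""

    def __init__(self):
        self.log = []

    def produce(self, message):
        self.log.append(message)


class MirrorCluster:
    """Hypothetical mirror that copies from the source asynchronously."""

    def __init__(self, source):
        self.source = source
        self.log = []
        self.offset = 0  # next position in the source log to copy

    def poll(self, max_messages):
        # Copy at most max_messages that have not been mirrored yet.
        batch = self.source.log[self.offset:self.offset + max_messages]
        self.log.extend(batch)
        self.offset += len(batch)
        return len(batch)


def unconsumed_window(source, mirror):
    """Messages written to the source but never copied to the mirror."""
    return source.log[mirror.offset:]


def demo():
    source = SourceCluster()
    mirror = MirrorCluster(source)
    for i in range(10):
        source.produce(f"msg-{i}")
        # The mirror only polls now and then, and in small batches,
        # so it falls behind the producers.
        if i % 4 == 3:
            mirror.poll(max_messages=3)
    # Assume the source cluster fails here, before the mirror catches up.
    return mirror.log, unconsumed_window(source, mirror)


if __name__ == "__main__":
    mirrored, lost = demo()
    print("mirrored:", mirrored)
    print("lost if the source fails now:", lost)
```

In this run the mirror holds msg-0 through msg-5, while msg-6 through msg-9
exist only on the failed source. That gap is the window described above; the
slower or less frequent the mirror's consumption, the larger it gets.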

KAFKA-50 <https://issues.apache.org/jira/browse/KAFKA-50>, on the other
hand, will provide both asynchronous AND synchronous replication (although
the latter will incur a latency penalty) and will be able to use the
replicas (data redundancy) as hot fail-overs.

Depending on your personal definition of "highly reliable" (whether it
includes data redundancy and/or availability), I think that should probably
answer your question...?

To all the Kafka experts: please correct me if the above explanations are
incorrect :) !

--
Felix

On Wed, Jan 11, 2012 at 5:53 PM, Jun Rao <[EMAIL PROTECTED]> wrote:

> It's just that the mirroring logic depends on ZK to be available most of
> the time.
>
> Jun
>
> On Wed, Jan 11, 2012 at 2:35 PM, Christian Carollo <[EMAIL PROTECTED]> wrote:
>
> > I see.  But if I used that configuration and then did the mirroring you
> > suggested would that be enough, in your opinion, to be considered highly
> > reliable?
> >
> > Christian
> >
> >
> > On Jan 11, 2012, at 2:32 PM, Jun Rao wrote:
> >
> > >> For example, can I have one ZK instance and one broker on one machine and
> > >> that is enough to define a ZK cluster and a Kafka Cluster?
> > >
> > > Yes, although with a single instance you don't get the reliability of ZK.
> > >
> > > Jun
> > >
> > >
> > > On Wed, Jan 11, 2012 at 2:06 PM, Christian Carollo <[EMAIL PROTECTED]> wrote:
> > >
> > >> Jun,
> > >>
> > >> I don't think I asked my question the right way.
> > >>
> > >> What I am trying to understand is what are the minimum constituent parts
> > >> of a kafka cluster?
> > >>
> > >> Based on your last email, I am now wondering what are the minimum
> > >> constituent parts of a ZK cluster as well as a Kafka cluster?
> > >>
> > >> For example, can I have one ZK instance and one broker on one machine and
> > >> that is enough to define a ZK cluster and a Kafka Cluster?
> > >>
> > >> Thanks,
> > >> Christian
> > >>
> > >>
> > >> On Jan 11, 2012, at 1:50 PM, Jun Rao <[EMAIL PROTECTED]> wrote:
> > >>
> > >>> Christian,
> > >>>
> > >>> A Kafka cluster contains a ZK cluster and a list of brokers. When a
> > >>> consumer subscribes to a topic in a kafka cluster, it consumes data stored
> > >>> in all brokers in that cluster.
> > >>>
> > >>> Thanks,
> > >>>
> > >>> Jun
> > >>>
> > >>> On Tue, Jan 10, 2012 at 11:28 PM, Christian Carollo <[EMAIL PROTECTED]> wrote:
> > >>>
> > >>>> Thank you Jun that is quite helpful.  I have a question about Kafka
> > >>>> Clusters.  What are the minimum number and types of services that must be
> > >>>> running to make up a Kafka Cluster?
> > >>>>
> > >>>> I ask this because the diagrams (in the Kafka Mirroring document) allude
> > >>>> to a multiple broker environment, however, since each broker does not
> > >>>> appear to provide redundancy (as of today) to any of the other brokers in a
> > >>>> given zookeeper service, it seems like a Kafka Cluster is nothing more than
> > >>>> a grouping of a single zookeeper instance with a single Kafka broker, is
> > >>>> this the correct understanding?
> > >>>>
> > >>>> Thanks,
> > >>>> Christian
> > >>>>
> > >>>> On Jan 10, 2012, at 8:47 AM, Jun Rao wrote:
> > >>>>
> > >>>>> With 0.7, you can set up inter-cluster replication (
> > >>>>> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+mirroring