Kafka >> mail # user >> Replication questions


Re: Replication questions
Thanks Jun :)

--
Felix

On Thu, Apr 26, 2012 at 3:26 PM, Jun Rao <[EMAIL PROTECTED]> wrote:

> Some comments inlined below.
>
> Thanks,
>
> Jun
>
> On Thu, Apr 26, 2012 at 10:27 AM, Felix GV <[EMAIL PROTECTED]> wrote:
>
> > Cool :) Thanks for those insights :) !
> >
> > I changed the subject of the thread, in order not to derail the original
> > thread's subject...! I just want to recap to make sure I (and others)
> > understand all of this correctly :)
> >
> > So, if I understand correctly, with acks == [0,1] Kafka should provide a
> > latency that is similar to what we have now, but with the possibility of
> > losing a small window of unreplicated events in the case of an
> > unrecoverable hardware failure, and with acks > 1 (or acks == -1) there
> > will probably be a latency penalty but we will be completely protected
> > from (non-correlated) hardware failures, right?
> >
> This is mostly true. The difference is that in 0.7, the producer doesn't wait
> for a TCP response from the broker. In 0.8, the producer always waits for a
> response from the broker. How quickly the broker sends the response depends on
> acks. With a weaker acks setting, you may get the response faster, but you run
> some risk of losing the data if a broker fails.
>
>
> > Also, I guess the above assumptions are correct for a batch size of 1, and
> > that bigger batch sizes could also lead to small windows of unwritten data
> > in cases of failures, just like now...? Although, now that I think of it, I
> > guess the vulnerability of bigger batch sizes would, again, only come into
> > play in scenarios of unrecoverable correlated failures, since even if a
> > machine fails with some partially committed batch, there would be other
> > machines who received the same data (through replication) and would have
> > enough time to commit those batches...
> >
> I want to add that if the producer itself dies, it could lose a batch of
> events.
>
>
> > Finally, I guess that replication (whatever the ack parameter is) will
> > affect the overall throughput capacity of the Kafka cluster, since every
> > node will now be writing its own data as well as the replicated data from
> > +/- 2 other nodes, right?
> >
> > --
> > Felix
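
The throughput question above comes down to simple arithmetic. The sketch below assumes that partitions and their replicas are spread evenly across brokers and that every replica writes every message once. The function name and parameters are hypothetical and not part of Kafka.

```python
def per_broker_write_rate(produce_rate: float, brokers: int,
                          replication_factor: int) -> float:
    """Average write rate each broker sustains, in the same unit as produce_rate.

    Assumes (hypothetically) an even spread of leaders and followers, so the
    cluster-wide write volume is produce_rate * replication_factor.
    """
    if brokers < 1:
        raise ValueError("need at least one broker")
    if not 1 <= replication_factor <= brokers:
        raise ValueError("replication_factor must be between 1 and brokers")
    return produce_rate * replication_factor / brokers


if __name__ == "__main__":
    # Same cluster and produce rate, without and with two extra replicas.
    print(per_broker_write_rate(300.0, brokers=6, replication_factor=1))
    print(per_broker_write_rate(300.0, brokers=6, replication_factor=3))
```

Under these assumptions, going from one copy to three triples the write volume each broker handles for the same incoming produce rate, independent of the acks setting.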
> >
> >
> >
> > On Wed, Apr 25, 2012 at 6:32 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
> >
> > > Short answer is yes, both async (acks=0 or 1) and sync replication
> > > (acks > 1) will both be supported.
> > >
> > > -Jay
> > >
> > > On Wed, Apr 25, 2012 at 11:22 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
> > > > Felix,
> > > >
> > > > Initially, we thought we could keep the option of not sending acks from
> > > > the broker to the producer. However, this seems hard since in the new wire
> > > > protocol, we need to send at least the error code to the producer (e.g.,
> > > > a request is sent to the wrong broker or wrong partition).
> > > >
> > > > So, what we allow in the current design is the following. The producer can
> > > > specify the # of acks in the request. By default (acks = -1), the broker
> > > > will wait for the message to be written to all replicas that are still
> > > > synced up with the leader before acking the producer. Otherwise (acks >= 0),
> > > > the broker will ack the producer after the message is written to acks
> > > > replicas. Currently, acks=0 is treated the same as acks=1.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
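
The acks rule described in the message above can be sketched as a small decision function. This is only an illustration of the rule as stated in the thread, not Kafka's broker code. The names `acks_needed` and `broker_can_respond` are hypothetical, and the thread does not say what happens when acks exceeds the number of in-sync replicas, so the sketch leaves that case uncapped.

```python
ALL_IN_SYNC = -1  # the default described in the thread


def acks_needed(acks: int, in_sync_replicas: int) -> int:
    """Replicas that must hold the message before the broker responds.

    in_sync_replicas counts the leader itself (an assumption of this sketch).
    """
    if in_sync_replicas < 1:
        raise ValueError("the leader must count as an in-sync replica")
    if acks < ALL_IN_SYNC:
        raise ValueError("acks must be -1 or a non-negative integer")
    if acks == ALL_IN_SYNC:
        # Wait for every replica still synced up with the leader.
        return in_sync_replicas
    # Per the thread, acks=0 is currently treated the same as acks=1.
    # Not capped at in_sync_replicas: the thread leaves that case open.
    return max(acks, 1)


def broker_can_respond(acks: int, replicas_written: int,
                       in_sync_replicas: int) -> bool:
    """True once enough replicas have written the message."""
    return replicas_written >= acks_needed(acks, in_sync_replicas)


if __name__ == "__main__":
    for acks in (-1, 0, 1, 2):
        print(acks, acks_needed(acks, in_sync_replicas=3))
```

With three in-sync replicas, acks=-1 makes the broker wait for all three, while acks=0 and acks=1 both let it respond as soon as the leader has the message.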
> > > >
> > > > On Wed, Apr 25, 2012 at 10:39 AM, Felix GV <[EMAIL PROTECTED]> wrote:
> > > >
> > > >> Just curious, but if I remember correctly from the time I read KAFKA-50 and
> > > >> the related JIRA issues, you guys plan to implement sync AND async
> > > >> replication, right?
> > > >>
> > > >> --
> > > >> Felix
> > > >>
> > > >>
> > > >>
> > > >> On Tue, Apr 24, 2012 at 4:42 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
> > > >>
> > > >> > Right now we do sloppy failover. That is, when a broker goes down
> > > >> > traffic is redirected to the remaining machines, but any