Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Kafka >> mail # user >> Analysis of producer performance


+
Piotr Kozikowski 2013-04-08, 23:43
+
Jun Rao 2013-04-09, 04:49
+
Guy Doulberg 2013-04-09, 06:34
+
Piotr Kozikowski 2013-04-09, 17:23
+
Otis Gospodnetic 2013-04-10, 19:05
+
Piotr Kozikowski 2013-04-10, 20:11
+
Yiu Wing TSANG 2013-04-11, 02:47
+
Jun Rao 2013-04-11, 05:18
+
Piotr Kozikowski 2013-04-12, 00:46
+
Jun Rao 2013-04-12, 14:54
+
Piotr Kozikowski 2013-04-12, 23:09
+
Jun Rao 2013-04-15, 01:06
+
Philip OToole 2013-04-12, 15:22
Copy link to this message
-
Re: Analysis of producer performance -- and Producer-Kafka reliability
Interesting topic.

How would buffering in RAM help in reality though (just trying to work
through the scenerio in my head):

producer tries to connect to a broker, it fails, so it appends the message
to a in-memory store.  If the broker is down for say 20 minutes and then
comes back online, won't this create problems now when the producer creates
a new message, and it has 20 minutes of backlog, and the broker now is
handling more load (assuming you are sending those in-memory messages using
a different thread)?
On Fri, Apr 12, 2013 at 11:21 AM, Philip O'Toole <[EMAIL PROTECTED]> wrote:

> This is just my opinion of course (who else's could it be? :-)) but I think
> from an engineering point of view, one must spend one's time making the
> Producer-Kafka connection solid, if it is mission-critical.
>
> Kafka is all about getting messages to disk, and assuming your disks are
> solid (and 0.8 has replication) those messages are safe. To then try to
> build a system to cope with the Kafka brokers being unavailable seems like
> you're setting yourself for infinite regress. And to write code in the
> Producer to spool to disk seems even more pointless. If you're that
> worried, why not run a dedicated Kafka broker on the same node as the
> Producer, and connect over localhost? To turn around and write code to
> spool to disk, because the primary system that *spools to disk* is down
> seems to be missing the point.
>
> That said, even by going over local-host, I guess the network connection
> could go down. In that case, Producers should buffer in RAM, and start
> sending some major alerts to the Operations team. But this should almost
> *never happen*. If it is happening regularly *something is fundamentally
> wrong with your system design*. Those Producers should also refuse any more
> incoming traffic and await intervention. Even bringing up "netcat -l" and
> letting it suck in the data and write it to disk would work then.
> Alternatives include having Producers connect to a load-balancer with
> multiple Kafka brokers behind it, which helps you deal with any one Kafka
> broker failing. Or just have your Producers connect directly to multiple
> Kafka brokers, and switch over as needed if any one broker goes down.
>
> I don't know if the standard Kafka producer that ships with Kafka supports
> buffering in RAM in an emergency. We wrote our own that does, with a focus
> on speed and simplicity, but I expect it will very rarely, if ever, buffer
> in RAM.
>
> Building and using semi-reliable system after semi-reliable system, and
> chaining them all together, hoping to be more tolerant of failure is not
> necessarily a good approach. Instead, identifying that one system that is
> critical, and ensuring that it remains up (redundant installations,
> redundant disks, redundant network connections etc) is a better approach
> IMHO.
>
> Philip
>
>
> On Fri, Apr 12, 2013 at 7:54 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
>
> > Another way to handle this is to provision enough client and broker
> servers
> > so that the peak load can be handled without spooling.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Thu, Apr 11, 2013 at 5:45 PM, Piotr Kozikowski <[EMAIL PROTECTED]
> > >wrote:
> >
> > > Jun,
> > >
> > > When talking about "catastrophic consequences" I was actually only
> > > referring to the producer side. in our use case (logging requests from
> > > webapp servers), a spike in traffic would force us to either tolerate a
> > > dramatic increase in the response time, or drop messages, both of which
> > are
> > > really undesirable. Hence the need to absorb spikes with some system on
> > top
> > > of Kafka, unless the spooling feature mentioned by Wing (
> > > https://issues.apache.org/jira/browse/KAFKA-156) is implemented. This
> is
> > > assuming there are a lot more producer machines than broker nodes, so
> > each
> > > producer would absorb a small part of the extra load from the spike.
> > >
> > > Piotr
> > >
> > > On Wed, Apr 10, 2013 at 10:17 PM, Jun Rao <[EMAIL PROTECTED]> wrote:

 
+
Philip OToole 2013-04-12, 15:59
+
Philip OToole 2013-04-12, 17:04
+
Piotr Kozikowski 2013-04-15, 18:19
+
David Arthur 2013-04-23, 12:22