|
Vaibhav Puranik
2012-06-26, 01:01
Jun Rao
2012-06-26, 14:55
Niek Sanders
2012-06-26, 15:52
Vaibhav Puranik
2012-06-26, 16:27
Vaibhav Puranik
2012-06-26, 23:46
Jun Rao
2012-06-26, 23:52
Vaibhav Puranik
2012-06-27, 21:43
Chris Burroughs
2012-06-27, 21:48
Neha Narkhede
2012-06-27, 21:48
Jun Rao
2012-06-27, 21:56
Vaibhav Puranik
2012-06-27, 22:03
Neha Narkhede
2012-06-27, 22:13
Vaibhav Puranik
2012-06-27, 22:18
Niek Sanders
2012-06-27, 22:24
Vaibhav Puranik
2012-06-27, 22:31
Joel Koshy
2012-06-27, 22:42
Vaibhav Puranik
2012-06-28, 17:44
Jun Rao
2012-06-28, 18:15
Vaibhav Puranik
2012-06-28, 20:36
Joel Koshy
2012-06-28, 21:37
Vaibhav Puranik
2012-06-29, 00:40
Jun Rao
2012-06-29, 01:40
|
-
Getting timeouts with elastic load balancer in AWS
Vaibhav Puranik 2012-06-26, 01:01
Hi all,

We are sending our ad impressions to Kafka 0.7.0. I am using async producers in our web app. I am pooling Kafka producers with Commons Pool. Pool size is 10; batch.size is 100.

We have 3 c1.xlarge instances with Kafka brokers installed behind an elastic load balancer in AWS. Every minute we lose some events because of the following exception:

- Disconnecting from dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
- Error in handling batch of 64 events
java.io.IOException: Connection timed out
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
    at kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:51)
    at kafka.network.Send$class.writeCompletely(Transmission.scala:76)
    at kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
    at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:88)
    at kafka.producer.SyncProducer.send(SyncProducer.scala:87)
    at kafka.producer.SyncProducer.multiSend(SyncProducer.scala:128)
    at kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:52)
    at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:46)
    at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:119)
    at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:98)
    at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:74)
    at scala.collection.immutable.Stream.foreach(Stream.scala:254)
    at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:73)
    at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:43)
- Connected to dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092 for producing

Has anybody faced this kind of timeout before? Do they indicate any resource misconfiguration? The CPU usage on the brokers is pretty low. Also, in spite of setting the batch size to 100, the failing batch usually has only 50 to 60 events. Is there any other limit I am hitting?

Any help is appreciated.

Regards,
Vaibhav
GumGum
-
Re: Getting timeouts with elastic load balancer in AWS
Jun Rao 2012-06-26, 14:55
Vaibhav,

Does the elastic load balancer have any timeouts or quotas that kill existing socket connections? Does the client resend succeed (you can configure resend in DefaultEventHandler)?

Thanks,

Jun
-
Re: Getting timeouts with elastic load balancer in AWS
Niek Sanders 2012-06-26, 15:52
ELBs will close connections that have no data going across them over a 60 sec period. A reference to this behavior can be found at the bottom of this page:

http://aws.amazon.com/articles/1636185810492479

There is currently no way for customers to increase this timeout. If this timeout is in fact the problem, then the alternative is to use HAProxy for load balancing instead.

- Niek
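
If the 60-second idle timeout is indeed the cause, one client-side workaround is to track how long each pooled connection has been idle and reopen it before the load balancer drops it. The following is only a sketch of that idea, not part of Kafka or Commons Pool: the class name `IdleGuard`, the 50-second threshold, and the injected clock are all assumptions made for illustration (Java 8 or later).

```java
import java.util.function.LongSupplier;

/** Hypothetical helper: remembers when a connection was last used. */
final class IdleGuard {
    private final long maxIdleMillis;
    private final LongSupplier clock; // injected so the logic can be tested without sleeping
    private long lastUsedMillis;

    IdleGuard(long maxIdleMillis, LongSupplier clock) {
        this.maxIdleMillis = maxIdleMillis;
        this.clock = clock;
        this.lastUsedMillis = clock.getAsLong();
    }

    /** True when the connection has been idle for at least maxIdleMillis and should be reopened. */
    boolean isStale() {
        return clock.getAsLong() - lastUsedMillis >= maxIdleMillis;
    }

    /** Call after every successful send. */
    void markUsed() {
        lastUsedMillis = clock.getAsLong();
    }

    public static void main(String[] args) {
        long[] now = {0L};
        // 50 s is an assumed safety margin below the 60 s idle timeout.
        IdleGuard guard = new IdleGuard(50_000L, () -> now[0]);

        now[0] = 10_000L;
        System.out.println(guard.isStale()); // false
        now[0] = 55_000L;
        System.out.println(guard.isStale()); // true
        guard.markUsed();
        System.out.println(guard.isStale()); // false
    }
}
```

A producer pool could check `isStale()` when a producer is borrowed and replace it with a fresh one if needed; production code would pass `System::currentTimeMillis` as the clock.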
-
Re: Getting timeouts with elastic load balancer in AWS
Vaibhav Puranik 2012-06-26, 16:27
These are great pointers.
I found some more discussion here:
https://forums.aws.amazon.com/thread.jspa?threadID=33427

I can do the following to keep using the elastic load balancer:

1) Reduce the producer pool size to 1 or 2, because it looks like connections are sitting idle. My volume does not need that big a pool.
2) Reduce the batch size so that the webapp sends data to the brokers more frequently. It's better for us anyway.

I will try both of these options and report back.

Thank you very much, Jun and Niek.

Regards,
Vaibhav
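
The two changes proposed above could be expressed in the producer configuration roughly as follows. This is a sketch under assumptions: `producer.type`, `batch.size` and `queue.time` are the async producer settings as documented for Kafka 0.7, the values 20 and 1000 are purely illustrative, and broker and serializer settings are omitted. The helper class `AsyncProducerSettings` is hypothetical.

```java
import java.util.Properties;

/** Hypothetical helper that builds async producer settings with a small batch. */
final class AsyncProducerSettings {
    static Properties build(int batchSize, int queueTimeMillis) {
        Properties props = new Properties();
        props.setProperty("producer.type", "async");
        // Number of messages collected before a batch is sent.
        props.setProperty("batch.size", Integer.toString(batchSize));
        // Maximum time in milliseconds that messages are buffered before being sent anyway.
        props.setProperty("queue.time", Integer.toString(queueTimeMillis));
        return props;
    }

    public static void main(String[] args) {
        Properties props = build(20, 1000);
        System.out.println(props.getProperty("batch.size") + " " + props.getProperty("queue.time"));
    }
}
```

Because buffered messages are sent when either limit is reached, a time limit that expires first would produce batches smaller than `batch.size`. That may be why the failing batches held only 50 to 60 events, though the thread does not confirm it.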
-
Re: Getting timeouts with elastic load balancer in AWS
Vaibhav Puranik 2012-06-26, 23:46
I reduced the batch size and the number of pooled connections. The number of errors has gone down significantly, but they are not eliminated yet.

We definitely don't want to lose any events.

Jun, how do I configure the client resend you mentioned earlier? I couldn't find any configuration for it.

Regards,
Vaibhav
-
Re: Getting timeouts with elastic load balancer in AWS
Jun Rao 2012-06-26, 23:52
Set num.retries in the producer config property file. It defaults to 0.

Thanks,

Jun
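
To illustrate what such a property file entry looks like, here is a small sketch that parses property-file text and reads `num.retries`, falling back to the default of 0 mentioned above. The class `RetryConfig` and the sample file contents are assumptions for illustration; the Kafka producer does its own parsing, and whether a given release honors the setting should be checked against that release.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical helper: reads num.retries from property-file text. */
final class RetryConfig {
    static int numRetries(String propertyFileText) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(propertyFileText));
        // Fall back to 0 when the key is absent.
        return Integer.parseInt(props.getProperty("num.retries", "0").trim());
    }

    public static void main(String[] args) throws IOException {
        System.out.println(numRetries("producer.type=async\nnum.retries=3\n")); // 3
        System.out.println(numRetries("producer.type=async\n"));                // 0
    }
}
```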
-
Re: Getting timeouts with elastic load balancer in AWS
Vaibhav Puranik 2012-06-27, 21:43
Jun,

I wrote a test producer to check whether num.retries is working, and found that it is not. No matter how many retries I set, whenever a message send fails, the message never gets to the broker.
I am using Kafka 0.7.0.

Is this a known problem? Do I need to file a JIRA issue?

Because we are using the async producer, we have no way to catch the exception ourselves and act on it. Is that right? Any ideas on how we can ensure that every single message is sent, with retries?

Regards,
Vaibhav
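
One possible answer to the last question is to retry at the application level around a synchronous send, where the caller does see the exception. The sketch below shows only the retry loop; `RetryingSender` and its `Send` interface are hypothetical stand-ins, not Kafka classes, and a real client would also reconnect between attempts.

```java
import java.io.IOException;

/** Hypothetical application-level retry wrapper around a synchronous send. */
final class RetryingSender {
    /** Stand-in for a synchronous send that can fail with an I/O error. */
    interface Send {
        void run() throws IOException;
    }

    /**
     * Runs the send up to 1 + maxRetries times and returns the number of attempts used.
     * Rethrows the last IOException if every attempt fails.
     */
    static int sendWithRetries(Send send, int maxRetries) throws IOException {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxRetries + 1; attempt++) {
            try {
                send.run();
                return attempt;
            } catch (IOException e) {
                last = e; // a real client would reconnect here before the next attempt
            }
        }
        throw last;
    }

    public static void main(String[] args) throws IOException {
        int[] calls = {0};
        // Simulated send that times out twice and then succeeds.
        int attempts = sendWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new IOException("Connection timed out");
            }
        }, 3);
        System.out.println("delivered after " + attempts + " attempts");
    }
}
```

Retrying a send whose outcome is unknown can deliver the same batch twice, so consumers would need to tolerate duplicates.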
-
Re: Getting timeouts with elastic load balancer in AWS
Chris Burroughs 2012-06-27, 21:48
On 06/27/2012 05:43 PM, Vaibhav Puranik wrote:
> Is this a known problem? Do I need to file a JIRA issue?

Thanks! Please do.
-
Re: Getting timeouts with elastic load balancer in AWS
Neha Narkhede 2012-06-27, 21:48
Vaibhav,

>> No matter how many retries I set, whenever a message send fails, it always never gets to the broker.

Could you please send the error message that you see on the producer side?

Thanks,
Neha
-
Re: Getting timeouts with elastic load balancer in AWS
Jun Rao 2012-06-27, 21:56
num.retries was added in 0.7.1, which is just out.

Thanks,

Jun
-
Re: Getting timeouts with elastic load balancer in AWS
Vaibhav Puranik 2012-06-27, 22:03
Thanks Jun. How do I download 0.7.1?
I checked the SVN tags, but the last tag seems to be kafka-0.7.1-incubating-candidate-3/ <http://svn.apache.org/repos/asf/incubator/kafka/tags/kafka-0.7.1-incubating-candidate-3/>

Regards,
Vaibhav
-
Re: Getting timeouts with elastic load balancer in AWS
Neha Narkhede 2012-06-27, 22:13
You can download it from here:
https://www.apache.org/dyn/closer.cgi/incubator/kafka/kafka-0.7.1-incubating/

Thanks,
Neha
-
Re: Getting timeouts with elastic load balancer in AWS
Vaibhav Puranik 2012-06-27, 22:18
Thanks Neha.
I will try num.retries again with this version and post my feedback here.

Regards,
Vaibhav
-
Re: Getting timeouts with elastic load balancer in AWS
Niek Sanders 2012-06-27, 22:24
Do producers currently leave the sockets to the brokers open indefinitely?

It might make sense to add a second producer config param similar to "reconnect.interval" which limits on time instead of message count (and then reconnect based on whichever criterion is hit first). For folks going through ELBs on AWS, they'd set the reconnect.interval.sec to something like 50 sec as a workaround for low-volume producers.

- Niek
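The "whichever limit is hit first" idea can be sketched as a small policy object. This is only an illustration under assumptions: `ReconnectPolicy`, its methods, and the way time is passed in are hypothetical names chosen here, not part of the Kafka producer.

```java
/** Hypothetical sketch: decides when a producer should drop and reopen its socket. */
final class ReconnectPolicy {
    private final int maxMessages;     // reconnect after this many sends...
    private final long maxAgeMillis;   // ...or once the connection is this old
    private int sentSinceConnect = 0;
    private long connectedAtMillis;

    ReconnectPolicy(int maxMessages, long maxAgeMillis, long nowMillis) {
        this.maxMessages = maxMessages;
        this.maxAgeMillis = maxAgeMillis;
        this.connectedAtMillis = nowMillis;
    }

    /** Call after each successful send. */
    void recordSend() {
        sentSinceConnect++;
    }

    /** True when either limit has been reached, whichever comes first. */
    boolean shouldReconnect(long nowMillis) {
        return sentSinceConnect >= maxMessages
            || nowMillis - connectedAtMillis >= maxAgeMillis;
    }

    /** Call after the socket has been reopened. */
    void recordReconnect(long nowMillis) {
        sentSinceConnect = 0;
        connectedAtMillis = nowMillis;
    }
}
```

A caller would check `shouldReconnect(System.currentTimeMillis())` before every send. With a 50-second age limit, a connection is never reused after sitting idle past the 60-second ELB timeout mentioned earlier in the thread.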
-
Re: Getting timeouts with elastic load balancer in AWS
Vaibhav Puranik 2012-06-27, 22:31
That will be awesome. It will definitely address the AWS ELB problem.
+1 for "reconnect.interval".

Regards,
Vaibhav
GumGum
-
Re: Getting timeouts with elastic load balancer in AWS
Joel Koshy 2012-06-27, 22:42
0.7.1 has this: reconnect.time.interval.ms

Thanks,
Joel
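As a rough sketch of how that setting might be combined with num.retries for a producer behind an ELB: the property names below are the ones that appear in this thread, while the class `ElbProducerSettings`, the `build` helper, and the guard against the 60-second idle timeout are assumptions made for illustration.

```java
import java.util.Properties;

/** Hypothetical helper that assembles producer properties for use behind an ELB. */
final class ElbProducerSettings {
    /** ELB idle timeout as reported earlier in this thread (60 seconds). */
    static final long ELB_IDLE_TIMEOUT_MS = 60_000L;

    /** Rejects a reconnect interval that would not beat the idle timeout. */
    static Properties build(String brokerList, long reconnectIntervalMs, int numRetries) {
        if (reconnectIntervalMs <= 0 || reconnectIntervalMs >= ELB_IDLE_TIMEOUT_MS) {
            throw new IllegalArgumentException(
                "reconnect interval must be between 0 and " + ELB_IDLE_TIMEOUT_MS + " ms");
        }
        Properties props = new Properties();
        props.setProperty("broker.list", brokerList);
        props.setProperty("producer.type", "async");
        props.setProperty("serializer.class", "kafka.serializer.StringEncoder");
        props.setProperty("num.retries", Integer.toString(numRetries));
        props.setProperty("reconnect.time.interval.ms", Long.toString(reconnectIntervalMs));
        return props;
    }
}
```

For example, `ElbProducerSettings.build("0:localhost:9092", 50_000L, 3)` yields a property set that reconnects every 50 seconds. The result would then be handed to the producer's config class, which this sketch deliberately leaves out.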
-
Re: Getting timeouts with elastic load balancer in AWS
Vaibhav Puranik 2012-06-28, 17:44
Hi all,
I don't think num.retries (0.7.1) is working. Here is how I tested it.

I wrote a simple producer that sends messages with the following strings: "_____1_____", "_____2_____", and so on. As you can see, all the messages are sequential. I tailed the topic log on the broker. After sending every message, I have added a Thread.sleep for 15 seconds.

Every time I send a message, it immediately appears in the broker log. But if I restart the broker to simulate a producer connection drop (in the 15-second producer sleep period), it prints the following messages in the logs:

[2012-06-28 10:31:17,127] INFO Disconnecting from localhost:9092 (kafka.producer.SyncProducer)
[2012-06-28 10:31:17,132] WARN Error sending messages, 2 attempts remaining (kafka.producer.async.DefaultEventHandler)
[2012-06-28 10:31:17,132] INFO Connected to localhost:9092 for producing (kafka.producer.SyncProducer)

But the message that was sent right after the broker restart never reaches the broker. The message after that (the 2nd message after the restart) gets to the broker fine and the sequence continues. Thus, if I restart the broker in the sleep period between messages 4 and 5, I don't get message 5. I get messages 1, 2, 3, 4, 6, 7, ...

I tried setting num.retries to 1 and 2, thinking that the first retry might reconnect and the second retry is where it resends the message. But that doesn't work. The number of retries doesn't improve the situation.

Can you see any flaw in my testing? What can I do to better test this scenario? How can I ensure that no messages are dropped? I don't think I am losing the message because it's in broker memory. Please correct me if I am wrong.

Regards,
Vaibhav
GumGum <http://gumgum.com>
-
Re: Getting timeouts with elastic load balancer in AWS
Jun Rao 2012-06-28, 18:15
Could you enable trace logging in DefaultEventHandler to see if the following message shows up after the warning?

trace("kafka producer sent messages for topics %s to broker %s:%d (on attempt %d)"

Thanks,
Jun
-
Re: Getting timeouts with elastic load balancer in AWS
Vaibhav Puranik 2012-06-28, 20:36
Jun,
Here is the log with SyncProducer and DefaultEventHandler trace enabled.

http://pastebin.com/dTm5RSJ9

Here are my producer settings:

properties.put("serializer.class", "kafka.serializer.StringEncoder");
properties.put("broker.list", "0:localhost:9092");
properties.put("producer.type", "async");
properties.put("num.retries", "3");
properties.put("batch.size", "5");

(This batch size doesn't take effect, I think because the flush time is small, 5 seconds, so it sends every message as it comes.) I am sleeping for 15 seconds between messages.

Here is my broker output:

_____0_____#17;#1;{�D�_____1_____#17;#1;�&6c_____2_____#17;#1;6z��_____3_____#17;#1;+�~_____4_____#17;#1;f�tu_____6_____#17;#1;����_____7_____#17;#1;\�#21;_____8_____#17;#1;��Ơ_____9_____

Notice that message 5 is missing; I restarted the broker between 4 and 5. On the producer, for some reason, the error appears between 6 and 7. I don't know why.

Regards,
Vaibhav
-
Re: Getting timeouts with elastic load balancer in AWS
Joel Koshy 2012-06-28, 21:37
Just to clarify: num.retries > 0 does not guarantee that all messages will be received at the broker. It guarantees retry on exceptions, so it cannot handle the corner case when the broker goes down after the message is written to the socket buffer but before the buffer is flushed (in which case no exceptions are thrown). This is addressed in 0.8 with producer acks.

That said, you have a fairly large interval between messages, so it's rather surprising. It might help to correlate this with broker-side logs to see if the "Message sent" for message 5 was actually received on the broker.

Thanks,
Joel
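When correlating producer and broker logs, the gap check can be scripted rather than eyeballed. The sketch below assumes the test's message format from this thread (a number wrapped in runs of underscores); the class `SequenceGaps` and its `missing` method are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds missing sequence numbers in a dumped topic log. */
final class SequenceGaps {
    // A number preceded by at least four underscores and followed by at least four.
    // The lookahead leaves the trailing underscores available for the next match.
    private static final Pattern MARKER = Pattern.compile("_{4,}(\\d+)(?=_{4,})");

    static List<Integer> missing(String brokerLog) {
        List<Integer> missing = new ArrayList<>();
        Matcher m = MARKER.matcher(brokerLog);
        int expected = -1;                       // -1 means nothing seen yet
        while (m.find()) {
            int seen = Integer.parseInt(m.group(1));
            if (expected >= 0) {
                for (int i = expected; i < seen; i++) {
                    missing.add(i);              // every skipped number is a gap
                }
            }
            expected = seen + 1;
        }
        return missing;
    }
}
```

Bytes between the markers are ignored, so the raw log content can be passed in as a string. A resent duplicate does not register as a gap, because only numbers skipped going forward are recorded.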
-
Re: Getting timeouts with elastic load balancer in AWS
Vaibhav Puranik 2012-06-29, 00:40
Just to remove all the variables regarding me restarting the broker, I did a test with Amazon ELB (0.7.1 producer and 0.7.0 broker), so there were no broker restarts. The connection was getting broken because Amazon ELB was closing all the connections.

I found the exact same result. In spite of specifying num.retries and reconnect.time.interval.ms = 50000, we lose one batch.

I understand that num.retries does not guarantee that all the messages will be sent. But I feel it should in this case. Please let me know if my expectation is unreasonable.

Regards,
Vaibhav
-
Re: Getting timeouts with elastic load balancer in AWS
Jun Rao 2012-06-29, 01:40
From the log, it seems that sending message 6 hit an exception (potentially because the broker was down), which caused a resend, and that message 6 then reached the broker. Sending message 5 didn't hit any exception, so there was no resend. Message 5 may not have reached the broker because the broker was shut down before the producer's socket buffer was flushed.

Thanks,

Jun

On Thu, Jun 28, 2012 at 1:36 PM, Vaibhav Puranik <[EMAIL PROTECTED]> wrote:

> Jun,
>
> Here is the log with SyncProducer and DefaultEventHandler trace enabled.
>
> http://pastebin.com/dTm5RSJ9
>
> Here are my producer settings:
>
> properties.put("serializer.class", "kafka.serializer.StringEncoder");
> properties.put("broker.list", "0:localhost:9092");
> properties.put("producer.type", "async");
> properties.put("num.retries", "3");
> properties.put("batch.size", "5");
>
> (This batch size doesn't work because I think the flush time is small - 5 seconds - it sends every message as it comes.) I am sleeping for 15 seconds between messages.
>
> Here is my broker output:
> _____0_____ {�D�_____1_____ �&6c_____2_____ 6z��_____3_____ + �~_____4_____ f�tu_____6_____ ����_____7_____ \� _____8_____ ��Ơ_____9_____
>
> Notice number 5 is missing. I restarted the broker between 4 and 5. You can see that message 5 is missing. On the producer, for some reason, the error appears between 6 and 7. Don't know why.
>
> Regards,
> Vaibhav
>
> On Thu, Jun 28, 2012 at 11:15 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
>
> > Could you enable trace logging in DefaultEventHandler to see if the following message shows up after the warning?
> > trace("kafka producer sent messages for topics %s to broker %s:%d (on attempt %d)"
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, Jun 28, 2012 at 10:44 AM, Vaibhav Puranik <[EMAIL PROTECTED]> wrote:
> >
> > > Hi all,
> > >
> > > I don't think the num.retries (0.7.1) is working. Here is how I tested it.
> > >
> > > I wrote a simple producer that sends messages with the following strings - "____1_____", "_____2_____"..... . As you can see all the messages are sequential.
> > > I tailed the topic log on broker. After sending every message, I have added Thread.sleep for 15 seconds.
> > >
> > > Every time I send the message, it immediately appears in the broker log. But if I restart the broker to simulate producer connection drop (in the 15 seconds producer sleep period), it prints the following message in the logs:
> > >
> > > [2012-06-28 10:31:17,127] INFO Disconnecting from localhost:9092 (kafka.producer.SyncProducer)
> > > [2012-06-28 10:31:17,132] WARN Error sending messages, 2 attempts remaining (kafka.producer.async.DefaultEventHandler)
> > > [2012-06-28 10:31:17,132] INFO Connected to localhost:9092 for producing (kafka.producer.SyncProducer)
> > >
> > > But the message that was sent right after the broker restart never reaches the broker. The message after that (2nd message after restart) gets to broker fine and the sequence continues. Thus if I restart the broker in the sleep period between message 4 and 5, I don't get message 5. I get messages 1,2,3,4,6,7,.....
> > >
> > > I tried setting num.retries to 1 and 2 thinking that in the first retry it might reconnect and the second retry is where it's resending the message. But that doesn't work. Number of retries doesn't improve the situation.
> > >
> > > Can you see any flaw in my testing? What can I do to better test this scenario? How can I ensure that no messages are dropped? I don't think I am losing the message because it's in broker memory. Please correct me if I am wrong.
> > >
> > > Regards,
> > > Vaibhav
> > > GumGum <http://gumgum.com>
> > >
> > > On Wed, Jun 27, 2012 at 3:42 PM, Joel Koshy <[EMAIL PROTECTED]> wrote:
> > >
> > > > 0.7.1 has this: reconnect.time.interval.ms |
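
For readers following the thread, the corner case described above (retry only on exceptions, with the first write after a connection drop failing silently) can be illustrated with a small simulation. This is only a sketch under stated assumptions: FakeConnection, sendWithRetries and run are hypothetical names, no Kafka classes are used, and the rule that exactly one write is swallowed before an exception surfaces is a simplification of real socket behaviour, not something confirmed by the logs in this thread.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class RetryOnExceptionSketch {

    /** Hypothetical stand-in for a producer-to-broker connection; not a Kafka class. */
    static class FakeConnection {
        final List<String> receivedByBroker = new ArrayList<>();
        private boolean peerAlive = true;
        private boolean dropNoticed = false;

        /** Simulates a broker restart or a load balancer closing the connection. */
        void dropPeer() { peerAlive = false; }

        void reconnect() { peerAlive = true; dropNoticed = false; }

        void write(String msg) throws IOException {
            if (peerAlive) {
                receivedByBroker.add(msg);
                return;
            }
            if (!dropNoticed) {
                // Assumption: the first write after the drop only fills the local
                // socket buffer, so no exception is raised and the message is lost.
                dropNoticed = true;
                return;
            }
            throw new IOException("connection reset (simulated)");
        }
    }

    /** Retries only when write() throws; returns false if all attempts failed. */
    static boolean sendWithRetries(FakeConnection conn, String msg, int numRetries) {
        for (int attempt = 0; attempt <= numRetries; attempt++) {
            try {
                conn.write(msg);
                return true; // no exception seen, so the sender assumes success
            } catch (IOException e) {
                conn.reconnect();
            }
        }
        return false;
    }

    /** Sends messages 0..9 and drops the connection just before message 5. */
    static List<String> run() {
        FakeConnection conn = new FakeConnection();
        for (int i = 0; i < 10; i++) {
            if (i == 5) {
                conn.dropPeer();
            }
            sendWithRetries(conn, "_____" + i + "_____", 3);
        }
        return conn.receivedByBroker;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this model message 5 is lost without any exception, while message 6 hits the exception, triggers a reconnect and is resent, which matches the 1,2,3,4,6,7 sequence reported above. Raising the retry count does not help, because the retry path is never entered for message 5; as noted in the thread, closing this gap needs an acknowledgement from the broker, which is what producer acks in 0.8 are meant to provide.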