Kafka >> mail # user >> Getting timeouts with elastic load balancer in AWS


Re: Getting timeouts with elastic load balancer in AWS
Just to remove all the variables regarding me restarting the broker, I did
a test with Amazon ELB. (0.7.1 producer and 0.7.0 broker)
Thus, no broker restarts. The connection was getting broken because Amazon
ELB was closing all the connections.

I found the exact same result. Despite specifying num.retries and
reconnect.time.interval.ms = 50000, we lose one batch. I understand that
num.retries does not guarantee that all the messages will be sent,
but I feel it should in this case. Please let me know if
my expectation is unreasonable.

Regards,
Vaibhav
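
The gap discussed in this thread, between retrying on exceptions and actually guaranteeing delivery, can be illustrated with a small simulation. This is only a sketch under stated assumptions: `RetryDemo` and `FakeConnection` are hypothetical names, not Kafka classes, and the fake connection models one common TCP behavior (the first write after the peer drops the connection is accepted locally without an exception, and only a later write fails). It does not reproduce the 0.7 producer code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class RetryDemo {
    /** Hypothetical stand-in for a broker connection; not a Kafka class. */
    static class FakeConnection {
        final List<Integer> received;  // what the "broker" actually got
        boolean peerClosed = false;    // peer (broker or load balancer) dropped the connection
        boolean resetSeen = false;     // local side has learned about the drop

        FakeConnection(List<Integer> received) {
            this.received = received;
        }

        void write(int msg) throws IOException {
            if (resetSeen) {
                throw new IOException("broken pipe");
            }
            if (peerClosed) {
                // Assumption: the first write after the drop lands in the local
                // socket buffer, raises no exception, and is silently lost.
                resetSeen = true;
                return;
            }
            received.add(msg);
        }
    }

    /** Sends messages 0..total-1, dropping the connection just before message dropBefore. */
    static List<Integer> run(int total, int dropBefore, int retries) {
        List<Integer> received = new ArrayList<>();
        FakeConnection conn = new FakeConnection(received);
        for (int msg = 0; msg < total; msg++) {
            if (msg == dropBefore) {
                conn.peerClosed = true;
            }
            int attemptsLeft = retries;
            while (true) {
                try {
                    conn.write(msg);
                    break;                                // no exception, so no retry
                } catch (IOException e) {
                    if (--attemptsLeft <= 0) {
                        break;                            // out of attempts, give up
                    }
                    conn = new FakeConnection(received);  // reconnect, then resend
                }
            }
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run(10, 5, 3));  // prints [0, 1, 2, 3, 4, 6, 7, 8, 9]
    }
}
```

In this model the retry fires on message 6, the first send that raises an exception, while message 5 is lost without any error, so a retry count alone cannot recover it.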
On Thu, Jun 28, 2012 at 2:37 PM, Joel Koshy <[EMAIL PROTECTED]> wrote:

> Just to clarify: num.retries > 0 does not guarantee that all messages will
> be received at the broker. It guarantees retry on exceptions - so it cannot
> handle the corner case when the broker goes down after the message is
> written to the socket buffer but before the buffer is flushed (in which
> case no exceptions are thrown). This is addressed in 0.8 with producer
> acks.
>
> That said, you have a fairly large interval between messages so it's rather
> surprising. It might help to correlate this with broker-side logs to see if
> the "Message sent" for message 5 was actually received on the broker.
>
> Thanks,
>
> Joel
>
> On Thu, Jun 28, 2012 at 1:36 PM, Vaibhav Puranik <[EMAIL PROTECTED]>
> wrote:
>
> > Jun,
> >
> > Here is the log with SyncProducer and DefaultEventHandler trace enabled.
> >
> > http://pastebin.com/dTm5RSJ9
> >
> > Here are my producer settings:
> >
> > properties.put("serializer.class", "kafka.serializer.StringEncoder");
> > properties.put("broker.list", "0:localhost:9092");
> > properties.put("producer.type", "async");
> > properties.put("num.retries", "3");
> > properties.put("batch.size", "5");
> >
> > (This batch size doesn't take effect, I think because some flush time is
> > small, 5 seconds, so it sends every message as it comes.) I am sleeping
> > for 15 seconds between messages.
> >
> > Here is my broker output:
> > _____0_____  {�D�_____1_____  �&6c_____2_____  6z��_____3_____  +
> > �~_____4_____  f�tu_____6_____  ����_____7_____  \� _____8_____
> >  ��Ơ_____9_____
> >
> >
> > Notice that message 5 is missing. I restarted the broker between 4 and 5.
> > On the producer, for some reason, the error appears between 6 and 7. I
> > don't know why.
> >
> > Regards,
> > Vaibhav
> >
> >
> > On Thu, Jun 28, 2012 at 11:15 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
> >
> > > Could you enable trace logging in DefaultEventHandler to see if the
> > > following message shows up after the warning?
> > >          trace("kafka producer sent messages for topics %s to broker %s:%d (on attempt %d)"
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Thu, Jun 28, 2012 at 10:44 AM, Vaibhav Puranik <[EMAIL PROTECTED]>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I don't think num.retries (0.7.1) is working. Here is how I tested it.
> > > >
> > > > I wrote a simple producer that sends messages with the following
> > > > strings: "____1_____", "_____2_____", and so on. As you can see, all the
> > > > messages are sequential. I tailed the topic log on the broker. After
> > > > sending each message, I added a Thread.sleep of 15 seconds.
> > > >
> > > > Every time I send a message, it immediately appears in the broker log.
> > > > But if I restart the broker to simulate a producer connection drop
> > > > (during the 15-second producer sleep period), the producer prints the
> > > > following messages in its logs:
> > > >
> > > > [2012-06-28 10:31:17,127] INFO Disconnecting from localhost:9092 (kafka.producer.SyncProducer)
> > > > [2012-06-28 10:31:17,132] WARN Error sending messages, 2 attempts remaining (kafka.producer.async.DefaultEventHandler)
> > > > [2012-06-28 10:31:17,132] INFO Connected to localhost:9092 for producing (kafka.producer.SyncProducer)
> > > >
> > > > But the message that was sent right after the broker restart never