Kafka, mail # user - Getting timeouts with elastic load balancer in AWS


Re: Getting timeouts with elastic load balancer in AWS
Vaibhav Puranik 2012-06-26, 16:27
These are great pointers.
I found some more discussion here:
https://forums.aws.amazon.com/thread.jspa?threadID=33427

I can do the following to keep using the elastic load balancer:

1) Reduce the producer pool size to 1 or 2, because it looks like connections
are sitting idle. My volume does not require that big a pool.
2) Reduce the batch size so that the webapp flushes data to the brokers more
frequently. That is better for us anyway.

I will try both of these options and report back.
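
A minimal sketch of the idle-connection check behind these two options, assuming the problem really is the balancer dropping connections that stay quiet too long. The class name `IdleGuard` and the 45-second margin are hypothetical; nothing here is part of Kafka or Commons Pool. It only records when a pooled connection was last used and reports when it has been idle long enough that it should be reopened before the next send.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper: tracks how long a pooled connection has gone unused. */
final class IdleGuard {
    private final long maxIdleMillis;
    private final LongSupplier clock;   // injectable so the logic can be tested without sleeping
    private long lastUsedMillis;

    IdleGuard(long maxIdleMillis, LongSupplier clock) {
        this.maxIdleMillis = maxIdleMillis;
        this.clock = clock;
        this.lastUsedMillis = clock.getAsLong();
    }

    /** Call after every successful send. */
    void markUsed() {
        lastUsedMillis = clock.getAsLong();
    }

    /** True once the connection has been idle for at least maxIdleMillis. */
    boolean shouldReconnect() {
        return clock.getAsLong() - lastUsedMillis >= maxIdleMillis;
    }

    public static void main(String[] args) {
        // 45 s is an assumed safety margin below the 60 s idle timeout.
        IdleGuard guard = new IdleGuard(45_000L, System::currentTimeMillis);
        System.out.println("reconnect needed: " + guard.shouldReconnect());
    }
}
```

With a smaller pool or smaller batches, each connection is used more often, so a check like this would rarely fire.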

Thank you very much Jun and Niek.

Regards,
Vaibhav

On Tue, Jun 26, 2012 at 8:52 AM, Niek Sanders <[EMAIL PROTECTED]> wrote:

> ELBs will close connections that have had no data going across them for
> 60 seconds.  A reference to this behavior can be found at the
> bottom of this page:
>
> http://aws.amazon.com/articles/1636185810492479
>
> There is currently no way for customers to increase this timeout.  If
> this timeout is in fact the problem, then the alternative is to use
> HAProxy for load balancing instead.
>
> - Niek
>
>
>
>
> On Tue, Jun 26, 2012 at 7:55 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
> > Vaibhav,
> >
> > Does elastic load balancer have any timeouts or quotas that kill existing
> > socket connections? Does client resend succeed (you can configure resend
> > in DefaultEventHandler)?
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Jun 25, 2012 at 6:01 PM, Vaibhav Puranik <[EMAIL PROTECTED]> wrote:
> >
> >> Hi all,
> >>
> >> We are sending our ad impressions to Kafka 0.7.0. I am using async
> >> producers in our web app.
> >> I am pooling Kafka producers with Commons Pool. Pool size is 10;
> >> batch.size is 100.
> >>
> >> We have 3 c1.xlarge instances with Kafka brokers installed behind an
> >> elastic load balancer in AWS.
> >> Every minute we lose some events because of the following exception:
> >>
> >> - Disconnecting from dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092
> >> - Error in handling batch of 64 events
> >> java.io.IOException: Connection timed out
> >>    at sun.nio.ch.FileDispatcher.write0(Native Method)
> >>    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
> >>    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
> >>    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
> >>    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
> >>    at kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:51)
> >>    at kafka.network.Send$class.writeCompletely(Transmission.scala:76)
> >>    at kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
> >>    at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:88)
> >>    at kafka.producer.SyncProducer.send(SyncProducer.scala:87)
> >>    at kafka.producer.SyncProducer.multiSend(SyncProducer.scala:128)
> >>    at kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:52)
> >>    at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:46)
> >>    at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:119)
> >>    at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:98)
> >>    at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:74)
> >>    at scala.collection.immutable.Stream.foreach(Stream.scala:254)
> >>    at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:73)
> >>    at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:43)
> >> - Connected to dualstack.kafka-xyz.us-east-1.elb.amazonaws.com:9092 for producing
> >>
> >> Has anybody faced this kind of timeout before? Does it indicate any
> >> resource misconfiguration? The CPU usage on the brokers is pretty low.
> >> Also, in spite of setting batch size to 100, the failing batch usually
> >> has only 50 to 60 events. Is there any other limit I am hitting?