Kafka >> mail # user >> Occasional batch send errors


Re: Occasional batch send errors
It is highly recommended that Kafka and Zookeeper be deployed on different
boxes. Also make sure they get dedicated disks, separate from log4j and the
OS.

Thanks,
Neha

On Wednesday, April 24, 2013, Karl Kirch wrote:

> So switched to sync producer to see what would happen.
> I still get the connection reset by peer error randomly (I say randomly,
> but seems to be connected to some zookeeper CancelledKeyExceptions), but
> unfortunately it throws an error on the message after the one that didn't
> get sent.
>
> Is that the way it's supposed to work?
>
> Karl
>
> On Apr 23, 2013, at 7:18 PM, Xavier Stevens <[EMAIL PROTECTED]> wrote:
>
> > Usually these types of errors occur because you're not connecting to the
> > proper host:port. Double check your configs, and make sure everything is
> > running and listening on the host:port you think it is.
> >
> > Have you tried using the sync producer to work out your bugs? My guess is
> > the sync producer would fail on the first message rather than failing when
> > the batch is submitted.
> >
> >
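To illustrate the suggestion above, here is a minimal sketch of why a synchronous sender makes failures easier to pin down than a batching one. It is not Kafka's API: `SyncSender`, `BatchingSender`, and the `transport` callable are hypothetical names invented for this example, and the batching sender flushes inline rather than on a background thread as a real async producer would.

```python
from typing import Callable, List

# Hypothetical transport type: takes a list of messages and raises OSError
# (for example ConnectionResetError) when the write fails.
Transport = Callable[[List[str]], None]


class SyncSender:
    """Sends each message immediately, so a failure raises in the caller."""

    def __init__(self, transport: Transport) -> None:
        self.transport = transport

    def send(self, message: str) -> None:
        # Any exception propagates straight to the code that called send().
        self.transport([message])


class BatchingSender:
    """Buffers messages and sends them together once batch_size is reached."""

    def __init__(self, transport: Transport, batch_size: int) -> None:
        self.transport = transport
        self.batch_size = batch_size
        self.buffer: List[str] = []
        self.failed_batches: List[List[str]] = []

    def send(self, message: str) -> None:
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        batch, self.buffer = self.buffer, []
        if not batch:
            return
        try:
            self.transport(batch)
        except OSError:
            # The caller of send() never sees this error; the whole batch
            # is recorded as failed instead of a single message.
            self.failed_batches.append(batch)
```

With the sync sender the exception is tied to one `send()` call, whereas the batching sender can only report that some batch failed, which matches the "Error sending messages" warning in the original report.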
> > On Tue, Apr 23, 2013 at 4:57 PM, Karl Kirch <[EMAIL PROTECTED]> wrote:
> >
> >> Hmmm… that didn't seem to help.
> >> Has anyone else seen this sort of error?
> >>
> >> Karl
> >>
> >>
> >> On Apr 23, 2013, at 5:58 PM, Karl Kirch <[EMAIL PROTECTED]>
> >> wrote:
> >>
> >>> I'm going to try bumping up the "numRetries" key in my producer config.
> >>> Is this a good option in this case?
> >>> I am using the zookeeper connect option, so I'm aware that I may get
> >>> stuck retrying against a failed node, but if it's just a temporary network
> >>> glitch I'll at least get a bit more of a chance to recover.
> >>>
> >>> Thanks,
> >>> Karl
> >>>
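The retry idea in the message above can be sketched generically as a bounded retry loop with exponential backoff. This is an illustration only, not the Kafka 0.7 producer's retry mechanism: `send_with_retries`, its parameters, and the `send` callable are all hypothetical names made up for the example.

```python
import time
from typing import Callable, Sequence


def send_with_retries(
    send: Callable[[Sequence[str]], None],
    batch: Sequence[str],
    retries: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send(batch), retrying on OSError up to `retries` extra times.

    Returns the number of attempts used. Re-raises the last error once
    the retries are exhausted. All names here are hypothetical.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            send(batch)
            return attempt
        except OSError:
            if attempt > retries:
                raise
            # Wait longer after each failure: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

A loop like this only helps with short-lived glitches. If the first write actually reached the broker before the connection dropped, a retry can deliver the batch twice, so consumers may need to tolerate duplicates.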
> >>> On Apr 23, 2013, at 5:35 PM, Karl Kirch <[EMAIL PROTECTED]>
> >>> wrote:
> >>>
> >>>> I occasionally get batch send errors from the stock async
> >>>> producer. This is on a cluster of 3 Kafka (0.7.2) and 3 ZooKeeper nodes.
> >>>> Is there any way to check what happens when those batch errors occur?
> >>>> Or to bump up the retry count? (It looks like it only did a single retry.)
> >>>>
> >>>> I need the speed of the async producer, but it needs to be reliable. I
> >>>> see a handful of these a day, but in a weather alerting system it only
> >>>> takes missing one message to cause a problem, let alone 25 or 100/1000.
> >>>>
> >>>> Here's a stack trace of one of the errors that I'm seeing.
> >>>>
> >>>> 22:23:39.405 [ProducerSendThread-1824508747] WARN k.p.a.DefaultEventHandler - Error sending messages, 0 attempts remaining
> >>>> java.io.IOException: Connection reset by peer
> >>>>     at sun.nio.ch.FileDispatcher.writev0(Native Method) ~[na:1.6.0_24]
> >>>>     at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:33) ~[na:1.6.0_24]
> >>>>     at sun.nio.ch.IOUtil.write(IOUtil.java:125) ~[na:1.6.0_24]
> >>>>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:367) ~[na:1.6.0_24]
> >>>>     at java.nio.channels.SocketChannel.write(SocketChannel.java:360) ~[na:1.6.0_24]
> >>>>     at kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:49) ~[apns-consumer-1.0.jar:na]
> >>>>     at kafka.network.Send$class.writeCompletely(Transmission.scala:73) ~[apns-consumer-1.0.jar:na]
> >>>>     at kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25) ~[apns-consumer-1.0.jar:na]
> >>>>     at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:95) ~[apns-consumer-1.0.jar:na]
> >>>>     at kafka.producer.SyncProducer.send(SyncProducer.scala:94) ~[apns-consumer-1.0.jar:na]
> >>>>     at kafka.producer.SyncProducer.multiSend(SyncProducer.scala:135) ~[apns-consumer-1.0.jar:na]
> >>>>     at kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:58) [apns-consumer-1.0.jar:na]
> >>>>     at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:44) [apns-consumer-1.0.jar:na]
> >>>>     at