|
Yonghui Zhao
2013-03-18, 03:25
Jun Rao
2013-03-18, 16:25
Yonghui Zhao
2013-03-19, 07:29
Yonghui Zhao
2013-03-19, 08:35
Jun Rao
2013-03-19, 16:15
Yonghui Zhao
2013-03-21, 04:26
Jun Rao
2013-03-22, 15:06
Yonghui Zhao
2013-03-22, 15:46
Yonghui Zhao
2013-03-25, 06:14
Neha Narkhede
2013-03-25, 13:49
Yonghui Zhao
2013-03-26, 02:14
Neha Narkhede
2013-03-26, 03:16
Yonghui Zhao
2013-03-26, 03:49
Neha Narkhede
2013-03-26, 15:40
Yonghui Zhao
2013-03-26, 16:48
Neha Narkhede
2013-03-27, 14:52
Yonghui Zhao
2013-03-28, 03:35
Jun Rao
2013-03-28, 04:54
Yonghui Zhao
2013-03-28, 07:23
Jun Rao
2013-03-28, 14:52
Yonghui Zhao
2013-03-28, 15:21
Jun Rao
2013-03-28, 15:25
Yonghui Zhao
2013-03-28, 15:32
Jun Rao
2013-03-29, 04:03
|
-
Connection reset by peer
Yonghui Zhao 2013-03-18, 03:25
In Kafka 0.7.2, I use a producer to send 200 million messages to the Kafka server.

After sending 100 million, this exception happened:

In the producer:

Exception in thread "main" java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.writev0(Native Method)
    at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
    at sun.nio.ch.IOUtil.write(IOUtil.java:182)
    at sun.nio.ch.SocketChannelImpl.write0(SocketChannelImpl.java:383)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:406)
    at java.nio.channels.SocketChannel.write(SocketChannel.java:384)
    at kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:49)
    at kafka.network.Send$class.writeCompletely(Transmission.scala:73)
    at kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:25)
    at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:95)
    at kafka.producer.SyncProducer.send(SyncProducer.scala:94)
    at kafka.producer.SyncProducer.send(SyncProducer.scala:125)
    at kafka.producer.ProducerPool$$anonfun$send$1.apply$mcVI$sp(ProducerPool.scala:114)
    at kafka.producer.ProducerPool$$anonfun$send$1.apply(ProducerPool.scala:100)
    at kafka.producer.ProducerPool$$anonfun$send$1.apply(ProducerPool.scala:100)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:57)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:43)
    at kafka.producer.ProducerPool.send(ProducerPool.scala:100)
    at kafka.producer.Producer.zkSend(Producer.scala:137)
    at kafka.producer.Producer.send(Producer.scala:99)
    at kafka.javaapi.producer.Producer.send(Producer.scala:103)

In the Kafka server:

[2013-03-16 06:59:49,491] ERROR Closing socket for /10.2.201.201 because of error (kafka.network.Processor)
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:456)
    at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:557)
    at kafka.message.FileMessageSet.writeTo(FileMessageSet.scala:102)
    at kafka.server.MessageSetSend.writeTo(MessageSetSend.scala:53)
    at kafka.network.MultiSend.writeTo(Transmission.scala:91)
    at kafka.network.Processor.write(SocketServer.scala:339)
    at kafka.network.Processor.run(SocketServer.scala:216)
    at java.lang.Thread.run(Thread.java:679)

Have you ever seen this exception before? What's the root cause?

Thanks
-
Re: Connection reset by peer
Jun Rao 2013-03-18, 16:25
The error you saw on the broker is for consumer requests, not for producer requests.

For the issues in the producer, are you using a VIP? Is there any firewall between the producer and the broker? The typical "connection reset" issues that we have seen are caused by a load balancer or firewall killing idle connections.

Thanks,

Jun
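If a load balancer or firewall is dropping idle connections, as Jun suggests, one common application-side mitigation is to retry a failed send so that the next attempt goes out on a fresh connection. The following is only a minimal sketch of that idea: `RetryingSend` and its `Attempt` interface are hypothetical names introduced here, not part of Kafka, and the sketch assumes that a new attempt after an `IOException` causes the client to reconnect.

```java
import java.io.IOException;

/** Hypothetical helper (not part of Kafka): retries one send attempt after an IOException. */
final class RetryingSend {

    /** Hypothetical stand-in for a single send attempt, e.g. a wrapper around a producer call. */
    interface Attempt {
        void run() throws IOException;
    }

    /**
     * Runs the attempt up to maxAttempts times, sleeping backoffMillis * attemptNumber
     * after each failure. Returns the number of attempts used, or rethrows the last
     * IOException if every attempt failed.
     */
    static int sendWithRetry(Attempt attempt, int maxAttempts, long backoffMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                attempt.run();
                return i;
            } catch (IOException e) {
                last = e; // e.g. "Connection reset by peer"
                if (i < maxAttempts) {
                    Thread.sleep(backoffMillis * i); // linear backoff before the next try
                }
            }
        }
        throw last;
    }
}
```

Note that a blind retry can send the same message twice if the first attempt actually reached the broker before the connection dropped, so this only fits workloads that tolerate duplicates.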
-
Re: Connection reset by peer
Yonghui Zhao 2013-03-19, 07:29
Thanks Jun.

Now I use a onebox setup to test Kafka. The Kafka server IP in ZK is 127.0.0.1, so the network is not affected by external factors.

The connection reset is not reproduced, but I still find Broken pipe exceptions and a few ZK exceptions.

[2013-03-19 15:23:28,660] INFO Closed socket connection for client /127.0.0.1:51902 which had sessionid 0x13d8152007b002c (org.apache.zookeeper.server.NIOServerCnxn)
[2013-03-19 15:23:28,672] ERROR Unexpected Exception: (org.apache.zookeeper.server.NIOServerCnxn)
java.nio.channels.CancelledKeyException
    at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
    at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)
    at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:418)
    at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1509)
    at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:171)
    at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:135)

[2013-03-19 15:15:58,355] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)
[2013-03-19 15:16:00,161] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)
[2013-03-19 15:16:01,784] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)
[2013-03-19 15:16:04,751] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)
[2013-03-19 15:16:07,734] ERROR Closing socket for /127.0.0.1 because of error (kafka.network.Processor)
java.io.IOException: Broken pipe
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:456)
    at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:557)
    at kafka.message.FileMessageSet.writeTo(FileMessageSet.scala:102)
    at kafka.server.MessageSetSend.writeTo(MessageSetSend.scala:53)
    at kafka.network.MultiSend.writeTo(Transmission.scala:91)
    at kafka.network.Processor.write(SocketServer.scala:339)
    at kafka.network.Processor.run(SocketServer.scala:216)
    at java.lang.Thread.run(Thread.java:679)
-
Re: Connection reset by peer
Yonghui Zhao 2013-03-19, 08:35
The connection reset exception is now reproduced.

[2013-03-19 16:30:45,814] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)
[2013-03-19 16:30:55,253] ERROR Closing socket for /127.0.0.1 because of error (kafka.network.Processor)
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251)
    at sun.nio.ch.IOUtil.read(IOUtil.java:224)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
    at kafka.utils.Utils$.read(Utils.scala:538)
    at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
    at kafka.network.Processor.read(SocketServer.scala:311)
    at kafka.network.Processor.run(SocketServer.scala:214)
    at java.lang.Thread.run(Thread.java:679)
[2013-03-19 16:31:02,476] ERROR Closing socket for /127.0.0.1 because of error (kafka.network.Processor)
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251)
    at sun.nio.ch.IOUtil.read(IOUtil.java:224)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
    at kafka.utils.Utils$.read(Utils.scala:538)
    at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
    at kafka.network.Processor.read(SocketServer.scala:311)
    at kafka.network.Processor.run(SocketServer.scala:214)
    at java.lang.Thread.run(Thread.java:679)
-
Re: Connection reset by peer
Jun Rao 2013-03-19, 16:15
"Connection reset by peer" means the other side of the socket has closed the connection for some reason. Could you provide the error/exception in both the producer and the broker when a produce request fails?

Thanks,

Jun
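To make "the other side of the socket has closed the connection" concrete, here is a small loopback sketch that uses only the JDK. It is an illustration under assumptions, not a reproduction of the Kafka failure: `ResetDemo` is a hypothetical class name, and the sketch relies on the usual behaviour that closing a socket with SO_LINGER set to 0 aborts the connection with a TCP RST rather than closing it gracefully, so the peer's next read fails with a reset error (the exact exception message varies by JDK and platform).

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical demo class: provokes a connection reset on the loopback interface. */
final class ResetDemo {

    /**
     * Returns the IOException the client observes after the peer aborts the
     * connection, or null if the read completed without an error.
     */
    static IOException provokeReset() throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            Socket accepted = server.accept();
            // Linger enabled with a timeout of 0: close() aborts the connection
            // instead of performing the normal graceful shutdown.
            accepted.setSoLinger(true, 0);
            accepted.close();
            try {
                InputStream in = client.getInputStream();
                in.read(); // a graceful close would return -1 here instead of throwing
                return null;
            } catch (IOException e) {
                return e;
            }
        }
    }
}
```

The point of the sketch is only that the error is reported on the side that is still using the socket, while the cause sits with whoever closed it, which is why Jun asks for the logs from both ends.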
-
Re: Connection reset by peer
Yonghui Zhao 2013-03-21, 04:26
Hi Jun,

I didn't find any error in the producer log.

I did another test: first I injected data into the Kafka server, then stopped the producer and started the consumer. The exception still happened, so the exception is not related to the producer.

From the log below, it seems the consumer exception happened first.

*Exceptions in consumers:*

2013/03/21 *12:07:17.940* INFO [SimpleConsumer] [] Reconnect in multifetch due to socket error:
java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:201)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:281)
    at kafka.utils.Utils$.read(Utils.scala:538)
    at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
    at kafka.network.Receive$class.readCompletely(Transmission.scala:55)
    at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
    at kafka.consumer.SimpleConsumer.getResponse(SimpleConsumer.scala:177)
    at kafka.consumer.SimpleConsumer.liftedTree2$1(SimpleConsumer.scala:117)
    at kafka.consumer.SimpleConsumer.multifetch(SimpleConsumer.scala:115)
    at kafka.consumer.FetcherRunnable.run(FetcherRunnable.scala:60)

2013/03/21 *12:07:18.176* INFO [SimpleConsumer] [] Reconnect in multifetch due to socket error:
java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:201)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:281)
    at kafka.utils.Utils$.read(Utils.scala:538)
    at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:67)
    at kafka.network.Receive$class.readCompletely(Transmission.scala:55)
    at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
    at kafka.consumer.SimpleConsumer.getResponse(SimpleConsumer.scala:177)
    at kafka.consumer.SimpleConsumer.liftedTree2$1(SimpleConsumer.scala:117)
    at kafka.consumer.SimpleConsumer.multifetch(SimpleConsumer.scala:115)
    at kafka.consumer.FetcherRunnable.run(FetcherRunnable.scala:60)

*Exceptions in Kafka server:*

[2013-03-21 *12:07:18,128*] ERROR Closing socket for /127.0.0.1 because of error (kafka.network.Processor)
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:456)
    at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:557)
    at kafka.message.FileMessageSet.writeTo(FileMessageSet.scala:102)
    at kafka.server.MessageSetSend.writeTo(MessageSetSend.scala:53)
    at kafka.network.MultiSend.writeTo(Transmission.scala:91)
    at kafka.network.Processor.write(SocketServer.scala:339)
    at kafka.network.Processor.run(SocketServer.scala:216)
    at java.lang.Thread.run(Thread.java:679)

[2013-03-21 *12:07:19,263*] INFO Socket connection established to localhost/127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn)

[2013-03-21 *12:07:18,055*] ERROR Closing socket for /127.0.0.1 because of error (kafka.network.Processor)
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:122)
    at sun.nio.ch.IOUtil.write(IOUtil.java:93)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:352)
    at kafka.server.MessageSetSend.writeTo(MessageSetSend.scala:51)
    at kafka.network.MultiSend.writeTo(Transmission.scala:91)
    at kafka.network.Processor.write(SocketServer.scala:339)
    at kafka.network.Processor.run(SocketServer.scala:216)
    at java.lang.Thread.run(Thread.java:679)
-
Re: Connection reset by peer
Jun Rao 2013-03-22, 15:06
A typical reason for many rebalances is consumer-side GC. If so, you will see logs in the consumer saying something like "expired session" for ZK. Occasional rebalances are fine. Too many rebalances can slow down consumption, and you will need to tune your GC settings.

Thanks,

Jun

On Thu, Mar 21, 2013 at 11:07 PM, Yonghui Zhao <[EMAIL PROTECTED]> wrote:

> Yes, before consumer exception:
>
> 2013/03/21 12:07:17.909 INFO [ZookeeperConsumerConnector] [] 0_lg-mc-db01.bj-1363784482043-f98c7868 *end rebalancing consumer* 0_lg-mc-db01.bj-1363784482043-f98c7868 try #0
> 2013/03/21 12:07:17.911 INFO [ZookeeperConsumerConnector] [] 0_lg-mc-db01.bj-1363784482043-f98c7868 *begin rebalancing consumer* 0_lg-mc-db01.bj-1363784482043-f98c7868 try #0
> 2013/03/21 12:07:17.934 INFO [FetcherRunnable] [] FetchRunnable-0 start fetching topic: sms part: 0 offset: 43667888259 from 127.0.0.1:9093
> 2013/03/21 12:07:17.940 INFO [SimpleConsumer] [] Reconnect in multifetch due to socket error:
> java.nio.channels.*ClosedByInterruptException*
>     at java.nio.channels.spi.*AbstractInterruptibleChannel*.end(AbstractInterruptibleChannel.java:201)
>
> 2013/03/21 12:07:17.978 INFO [ZookeeperConsumerConnector] [] 0_lg-mc-db01.bj-1363784482043-f98c7868 *end rebalancing consumer* 0_lg-mc-db01.bj-1363784482043-f98c7868 try #0
> 2013/03/21 12:07:18.004 INFO [FetcherRunnable] [] FetchRunnable-0 start fetching topic: sms part: 0 offset: 43667888259 from 127.0.0.1:9093
> 2013/03/21 12:07:18.066 INFO [ZookeeperConsumerConnector] [] 0_lg-mc-db01.bj-1363784482043-f98c7868 *begin rebalancing consumer* 0_lg-mc-db01.bj-1363784482043-f98c7868 try #0
> 2013/03/21 12:07:18.176 INFO [SimpleConsumer] [] Reconnect in multifetch due to socket error:
> java.nio.channels.*ClosedByInterruptException*
>     at java.nio.channels.spi.*AbstractInterruptibleChannel*.end(AbstractInterruptibleChannel.java:201)
>
> So you think it is normal? How can we avoid this exception?
>
> I used 4 partitions in kafka, use only 1 partition?
>
> 2013/3/22 Jun Rao <[EMAIL PROTECTED]>
>
> > Do you see any rebalances in the consumer? Each rebalance will interrupt
> > existing fetcher threads first.
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, Mar 21, 2013 at 9:40 PM, Yonghui Zhao <[EMAIL PROTECTED]> wrote:
> >
> > > The application won't shut down the consumer connector. The consumer is
> > > always alive.
> > >
> > > 2013/3/22 Jun Rao <[EMAIL PROTECTED]>
> > >
> > > > If you use the high level consumer, normally ClosedByInterruptException
> > > > happens because the application calls shutdown on the consumer connector.
> > > > Is that the case?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Thu, Mar 21, 2013 at 8:38 PM, Yonghui Zhao <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > No, I use java consumer connector, and set 10 seconds timeout.
> > > > >
> > > > > ConsumerConfig consumerConfig = new ConsumerConfig(props);
> > > > > _consumerConnector = Consumer.createJavaConsumerConnector(consumerConfig);
> > > > > Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
> > > > > topicCountMap.put(_topic, 1);
> > > > > Map<String, List<KafkaStream<Message>>> topicMessageStreams = _consumerConnector.createMessageStreams(topicCountMap);
> > > > > List<KafkaStream<Message>> streams = topicMessageStreams.get(_topic);
> > > > > KafkaStream<Message> KafkaStream = streams.iterator().next();
> > > > > _consumerIterator = KafkaStream.iterator();
> > > > >
> > > > > 2013/3/21 Jun Rao <[EMAIL PROTECTED]>
> > > > >
> > > > > > So, it seems that your consume thread was interrupted and therefore the
> > > > > > socket channel was closed. Are you using SimpleConsumer?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Wed, Mar 20, 2013 at 9:25 PM, Yonghui Zhao <[EMAIL PROTECTED]>
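The link Jun describes between an interrupted fetcher thread and a closed socket channel is standard Java NIO behaviour and can be shown without Kafka. The following is a minimal JDK-only sketch: `InterruptedFetchDemo` is a hypothetical class name, and the "fetcher" thread merely stands in for a consumer thread blocked in a read; it is not Kafka's FetcherRunnable.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical demo: interrupting a thread blocked in SocketChannel.read closes the channel. */
final class InterruptedFetchDemo {

    /** Returns { name of the exception seen by the reader, whether the channel is still open }. */
    static String[] run() throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            SocketChannel client = SocketChannel.open(server.getLocalAddress());
            SocketChannel accepted = server.accept(); // kept open; it never sends data

            AtomicReference<String> seen = new AtomicReference<>("none");
            Thread fetcher = new Thread(() -> {
                try {
                    client.read(ByteBuffer.allocate(16)); // blocks, because no data arrives
                } catch (ClosedByInterruptException e) {
                    seen.set("ClosedByInterruptException");
                } catch (Exception e) {
                    seen.set(e.getClass().getSimpleName());
                }
            });
            fetcher.start();
            Thread.sleep(200);   // let the thread reach the blocking read
            fetcher.interrupt(); // stands in for a rebalance interrupting the fetcher
            fetcher.join(5000);

            boolean stillOpen = client.isOpen();
            accepted.close();
            client.close();
            return new String[] { seen.get(), String.valueOf(stillOpen) };
        }
    }
}
```

Calling `InterruptedFetchDemo.run()` yields `ClosedByInterruptException` and `false`: the interrupt both raises the exception in the reading thread and closes the channel. The peer that is still writing to that connection then sees the close from its own side, which would be consistent with the "Connection reset by peer" and "Broken pipe" errors in the broker log, although the thread does not confirm that step explicitly.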
-
Re: Connection reset by peer
Yonghui Zhao 2013-03-22, 15:46
Thanks Jun! Will tune our GC settings.

Sent from my iPad
-
Re: Connection reset by peer
Yonghui Zhao 2013-03-25, 06:14
Hi Jun,

I used kafka-server-start.sh to start Kafka; there is only one JVM setting, "-Xmx512M".

Do you have any recommended GC settings? Usually our servers have 32GB or 64GB of RAM.
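Before changing GC settings, it can help to check whether GC time in the consumer JVM is actually large enough to matter for the ZK session. The sketch below uses only the standard `java.lang.management` API; `GcPauseCheck` is a hypothetical helper name, and the 6000 ms timeout in the usage note is an illustrative value, not a figure taken from this thread. `getCollectionTime()` reports accumulated collection time, which for concurrent collectors is only a rough proxy for pause time, so treat the result as a hint and confirm with GC logs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: samples cumulative GC time from the running JVM. */
final class GcPauseCheck {

    /** Sum of the accumulated collection time in milliseconds over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 means the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** True if the GC time accumulated between two samples reaches the given session timeout. */
    static boolean exceedsSessionTimeout(long beforeMillis, long afterMillis, long sessionTimeoutMillis) {
        return (afterMillis - beforeMillis) >= sessionTimeoutMillis;
    }
}
```

A possible use is to sample `totalGcMillis()` around each "begin rebalancing" log entry and compare the difference with the configured ZK session timeout, for example `exceedsSessionTimeout(before, after, 6000)`.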
-
Re: Connection reset by peer
Neha Narkhede 2013-03-25, 13:49
For Kafka 0.7 in production at LinkedIn, we use a 3 GB heap, a 256 MB new generation, and the CMS collector with an occupancy threshold of 70%.

Thanks,
Neha
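
Note: on the HotSpot JVMs of that era, the settings described above would typically be written as flags along the lines of -Xms3g -Xmx3g -Xmn256m -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70. That mapping is an assumption; the message gives only the sizes and the collector name. After changing the start script it is easy to check what the JVM actually picked up. A minimal sketch using only the standard management API (the class name JvmGcReport is hypothetical and not part of Kafka):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class JvmGcReport {
    /** Maximum heap the JVM will attempt to use, in MB (roughly reflects -Xmx). */
    public static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Names of the garbage collectors registered in this JVM. */
    public static List<String> collectorNames() {
        List<String> names = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("max heap (MB): " + maxHeapMb());
        System.out.println("collectors:    " + collectorNames());
    }
}
```

Run it with the same flags as the broker or consumer. The reported heap can be slightly below -Xmx, and the collector names depend on the JVM version, so treat the output as a sanity check rather than an exact readback of the flags.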
-
Re: Connection reset by peer
Yonghui Zhao 2013-03-26, 02:14
Any suggestions for the consumer side?
-
Re: Connection reset by peer
Neha Narkhede 2013-03-26, 03:16
That really depends on your consumer application's memory allocation patterns. If it is a thin wrapper over a Kafka consumer, I would imagine you can get away with using CMS for the tenured generation and the parallel collector for the new generation, with a small heap of 1 GB or so.

Thanks,
Neha
-
Re: Connection reset by peer
Yonghui Zhao 2013-03-26, 03:49
Thanks Neha. By the way, have you seen this exception? We didn't restart any service; it happened in the middle of the night.

java.lang.RuntimeException: A broker is already registered on the path /brokers/ids/0. This probably indicates that you either have configured a brokerid that is already in use, or else you have shutdown this broker and restarted it faster than the zookeeper timeout so it appears to be re-registering.
    at kafka.server.KafkaZooKeeper.registerBrokerInZk(KafkaZooKeeper.scala:57)
    at kafka.server.KafkaZooKeeper$SessionExpireListener.handleNewSession(KafkaZooKeeper.scala:100)
    at org.I0Itec.zkclient.ZkClient$4.run(ZkClient.java:472)
    at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
[2013-03-26 02:07:19,155] INFO re-registering broker info in ZK for broker 0 (kafka.server.KafkaZooKeeper)
[2013-03-26 02:07:19,155] INFO Registering broker /brokers/ids/0 (kafka.server.KafkaZooKeeper)
[2013-03-26 02:07:19,611] INFO conflict in /brokers/ids/0 data: 127.0.0.1-1364234839275:127.0.0.1:9093 stored data: 127.0.0.1-1364227372971:127.0.0.1:9093 (kafka.utils.ZkUtils$)
[2013-03-26 02:07:19,611] ERROR Error handling event ZkEvent[New session event sent to kafka.server.KafkaZooKeeper$SessionExpireListener@40f8c9bf] (org.I0Itec.zkclient.ZkEventThread)
java.lang.RuntimeException: A broker is already registered on the path /brokers/ids/0. This probably indicates that you either have configured a brokerid that is already in use, or else you have shutdown this broker and restarted it faster than the zookeeper timeout so it appears to be re-registering.
    at kafka.server.KafkaZooKeeper.registerBrokerInZk(KafkaZooKeeper.scala:57)
    at kafka.server.KafkaZooKeeper$SessionExpireListener.handleNewSession(KafkaZooKeeper.scala:100)
    at org.I0Itec.zkclient.ZkClient$4.run(ZkClient.java:472)
    at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
-
Re: Connection reset by peer
Neha Narkhede 2013-03-26, 15:40
Did you have a GC pause around that time on the server? What are your server's current GC settings?

Thanks,
Neha
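
Note: if GC logging was not enabled at the time (on JVMs of that era, for example via -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps), one rough way to spot long pauses is to measure stalls from inside the process. The sketch below is not part of Kafka: it sleeps for a short fixed interval and records how much longer than requested the sleep actually took. A large overrun suggests the whole JVM was stopped, by GC or something else. The class name PauseDetector, the 100 ms interval, and the 1000 ms threshold are all hypothetical placeholders; the threshold should be chosen relative to your ZK session timeout.

```java
public class PauseDetector {
    /** How far a sleep of expectedMs overran, in ms; never negative. */
    public static long overrunMs(long expectedMs, long actualMs) {
        return Math.max(0L, actualMs - expectedMs);
    }

    /** Sleeps intervalMs for the given number of rounds and returns the worst overrun seen. */
    public static long worstOverrunMs(long intervalMs, int rounds) throws InterruptedException {
        long worst = 0L;
        for (int i = 0; i < rounds; i++) {
            long start = System.nanoTime();
            Thread.sleep(intervalMs);
            long elapsedMs = (System.nanoTime() - start) / 1000000L;
            worst = Math.max(worst, overrunMs(intervalMs, elapsedMs));
        }
        return worst;
    }

    public static void main(String[] args) throws InterruptedException {
        long thresholdMs = 1000L; // placeholder; compare against your ZK session timeout
        long worst = worstOverrunMs(100L, 50);
        System.out.println("worst stall: " + worst + " ms"
                + (worst > thresholdMs ? " (long enough to be worth investigating)" : ""));
    }
}
```

In a real process this loop would run on a daemon thread for the lifetime of the application and log each large stall with a timestamp, so the stalls can be lined up against the "expired session" messages.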
-
Re: Connection reset by peer
Yonghui Zhao 2013-03-26, 16:48
The Kafka server is started with bin/kafka-server-start.sh. No GC settings.
-
Re: Connection reset by peer
Neha Narkhede 2013-03-27, 14:52
The kafka-server-start.sh script doesn't have the mentioned GC settings and heap size configured. However, configuring them is probably a good idea.

Thanks,
Neha
-
Re: Connection reset by peer
Yonghui Zhao 2013-03-28, 03:35
Now I use GC settings like this:

-server -Xms1536m -Xmx1536m -XX:NewSize=128m -XX:MaxNewSize=128m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=70

But it still happened. It seems the Kafka server reconnected to ZK, but the old node was still there, so the Kafka server stopped.
Can the Kafka server retry connecting to ZK?

[2013-03-27 22:15:03,529] INFO Opening socket connection to server localhost/127.0.0.1:2181 (org.apache.zookeeper.ClientCnxn)
[2013-03-27 22:15:03,529] INFO Socket connection established to localhost/127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2013-03-27 22:15:05,855] INFO Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x13da6d94abf00aa, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2013-03-27 22:15:05,942] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2013-03-27 22:15:14,912] INFO conflict in /brokers/ids/0 data: 127.0.0.1-1364393691770:127.0.0.1:9093 stored data: null (kafka.utils.ZkUtils$)
[2013-03-27 22:15:14,942] ERROR Error handling event ZkEvent[New session event sent to kafka.server.KafkaZooKeeper$SessionExpireListener@18f389bc] (org.I0Itec.zkclient.ZkEventThread)
java.lang.RuntimeException: A broker is already registered on the path /brokers/ids/0. This probably indicates that you either have configured a brokerid that is already in use, or else you have shutdown this broker and restarted it faster than the zookeeper timeout so it appears to be re-registering.
    at kafka.server.KafkaZooKeeper.registerBrokerInZk(KafkaZooKeeper.scala:57)
    at kafka.server.KafkaZooKeeper$SessionExpireListener.handleNewSession(KafkaZooKeeper.scala:100)
    at org.I0Itec.zkclient.ZkClient$4.run(ZkClient.java:472)
    at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
[2013-03-27 22:15:33,736] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)
-
Re: Connection reset by peer
Jun Rao 2013-03-28, 04:54
Not sure why the re-registration fails. Are you using ZK 3.3.4 or above?

It seems that your consumer still GCs, which is the root cause. So, you will need to tune the GC settings further. Another way to avoid ZK session timeouts is to increase the session timeout config.

Thanks,
Jun

On Wed, Mar 27, 2013 at 8:35 PM, Yonghui Zhao <[EMAIL PROTECTED]> wrote:
> Now I used GC like this:
>
> -server -Xms1536m -Xmx1536m -XX:NewSize=128m -XX:MaxNewSize=128m
> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
> -XX:CMSInitiatingOccupancyFraction=70
>
> But it still happened. It seems kafka server reconnect with zk, but the
> old node was still there. So kafka server stopped.
> Can kafka server retry to connect with zk?
>
> [2013-03-27 22:15:03,529] INFO Opening socket connection to server localhost/127.0.0.1:2181 (org.apache.zookeeper.ClientCnxn)
> [2013-03-27 22:15:03,529] INFO Socket connection established to localhost/127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2013-03-27 22:15:05,855] INFO Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x13da6d94abf00aa, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
> [2013-03-27 22:15:05,942] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
> [2013-03-27 22:15:14,912] INFO conflict in /brokers/ids/0 data: 127.0.0.1-1364393691770:127.0.0.1:9093 stored data: null (kafka.utils.ZkUtils$)
> [2013-03-27 22:15:14,942] ERROR Error handling event ZkEvent[New session event sent to kafka.server.KafkaZooKeeper$SessionExpireListener@18f389bc] (org.I0Itec.zkclient.ZkEventThread)
> java.lang.RuntimeException: A broker is already registered on the path /brokers/ids/0. This probably indicates that you either have configured a brokerid that is already in use, or else you have shutdown this broker and restarted it faster than the zookeeper timeout so it appears to be re-registering.
>     at kafka.server.KafkaZooKeeper.registerBrokerInZk(KafkaZooKeeper.scala:57)
>     at kafka.server.KafkaZooKeeper$SessionExpireListener.handleNewSession(KafkaZooKeeper.scala:100)
>     at org.I0Itec.zkclient.ZkClient$4.run(ZkClient.java:472)
>     at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
> [2013-03-27 22:15:33,736] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)
-
Re: Connection reset by peer
Yonghui Zhao 2013-03-28, 07:23
I use zookeeper-3.3.4 with Kafka.

The default tickTime is 3 seconds and minSessionTimeout is 6 seconds. I have now changed tickTime to 5 seconds and minSessionTimeout to 10 seconds. But if we change the timeout to a larger value, the case described in the error message, "you have shutdown this broker and restarted it faster than the zookeeper timeout so it appears to be re-registering.", could happen more easily.

Do you think consumer GC will affect the connection between the Kafka server and ZK?
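The relationship between tickTime and the session timeout discussed above can be sketched as follows. This is a minimal illustration, not ZooKeeper's own code. It assumes minSessionTimeout and maxSessionTimeout are left at their documented defaults of 2 and 20 ticks, and that the client requests 6000 ms (an assumption chosen to match the "negotiated timeout = 6000" line in the log). The class and method names are hypothetical.

```java
// Sketch of how a ZooKeeper server bounds a client's requested session timeout.
// Assumption: minSessionTimeout and maxSessionTimeout are at their defaults
// of 2 * tickTime and 20 * tickTime.
final class SessionTimeoutSketch {

    /** Returns the timeout the server would grant, in milliseconds. */
    static int negotiate(int requestedMs, int tickTimeMs) {
        int minMs = 2 * tickTimeMs;
        int maxMs = 20 * tickTimeMs;
        // Clamp the client's request into [minMs, maxMs].
        return Math.max(minMs, Math.min(requestedMs, maxMs));
    }

    public static void main(String[] args) {
        // Client asks for 6000 ms, tickTime is 3000 ms.
        System.out.println(negotiate(6000, 3000));   // prints 6000
        // Same request after raising tickTime to 5000 ms.
        System.out.println(negotiate(6000, 5000));   // prints 10000
        // A much larger request is capped at 20 ticks.
        System.out.println(negotiate(120000, 3000)); // prints 60000
    }
}
```

Under these assumptions, raising tickTime to 5 seconds lifts the granted timeout to 10 seconds even if the client keeps asking for 6 seconds.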
-
Re: Connection reset by peer
Jun Rao 2013-03-28, 14:52
The ZK session timeout only kicks in if you force kill the consumer. Otherwise, the consumer will close its ZK session properly on clean shutdown.

The problem with GC is that if the consumer pauses for a long time, the ZK server won't receive pings from the client and thus can expire a session that still exists.

The best thing to do here is to fix the GC issue, since it may have other implications. To start with, you probably want to enable GC logging and see how long and how frequent your GCs are.

Thanks,
Jun
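GC logging on HotSpot JVMs of that period is usually switched on with flags such as -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:<file>. As a complementary check from inside the process, the standard java.lang.management API exposes per-collector counts and accumulated collection time. The sketch below is only a rough aid: the class name is hypothetical, and the accumulated time is approximate and does not show individual pause lengths, which the GC log does.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Rough in-process view of how often and how long the JVM has collected.
final class GcStatsSketch {

    /** Mean time per collection in ms; 0 when nothing has been collected yet. */
    static double meanCollectionMs(long count, long totalMs) {
        return count <= 0 ? 0.0 : (double) totalMs / count;
    }

    /** One summary line per collector known to this JVM. */
    static String summary() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();  // -1 if the collector does not report it
            long totalMs = gc.getCollectionTime(); // accumulated and approximate
            sb.append(gc.getName())
              .append(": collections=").append(count)
              .append(" totalMs=").append(totalMs)
              .append(" meanMs=").append(meanCollectionMs(count, totalMs))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(summary());
    }
}
```

If the mean or the logged pauses come anywhere near the negotiated session timeout (6000 ms in the log above), session expiry during a pause becomes plausible.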
-
Re: Connection reset by peer
Yonghui Zhao 2013-03-28, 15:21
Thanks Jun.

But I can't understand how consumer GC triggers this Kafka server issue:

java.lang.RuntimeException: A broker is already registered on the path /brokers/ids/0. This probably indicates that you either have configured a brokerid that is already in use, or else you have shutdown this broker and restarted it faster than the zookeeper timeout so it appears to be re-registering.
-
Re: Connection reset by peer
Jun Rao 2013-03-28, 15:25
Do you see lots of ZK session expirations in the broker too? If so, that suggests a GC issue in the broker as well, and you may need to tune the broker's GC too.

Thanks,
Jun
-
Re: Connection reset by peer
Yonghui Zhao 2013-03-28, 15:32
Will check. I just wonder why the broker needs to re-register and why that failed, which stopped the broker service.
-
Re: Connection reset by peer
Jun Rao 2013-03-29, 04:03
Not sure why re-registering in the broker fails. Normally, when the broker registers, the ZK path should already be gone.

Thanks,
Jun
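On the earlier question of whether the broker could retry: the stack trace suggests that in 0.7.2 the exception from registerBrokerInZk simply propagates out of handleNewSession. A retry loop along the following lines is one possible approach. It is a generic sketch, not Kafka code, and it assumes the conflict is transient (the stale node disappears once the old session expires on the ZK server). The class name, the tryRegister callback and the parameters are all hypothetical.

```java
import java.util.function.BooleanSupplier;

// Generic bounded retry around a registration attempt (hypothetical helper).
final class RegisterRetrySketch {

    /**
     * Calls tryRegister until it returns true or maxAttempts is used up,
     * sleeping backoffMs between attempts. Returns the number of attempts
     * made on success, or -1 if all attempts failed or the thread was interrupted.
     */
    static int registerWithRetry(BooleanSupplier tryRegister, int maxAttempts, long backoffMs) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (tryRegister.getAsBoolean()) {
                return attempt;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoffMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt for callers
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in for ZooKeeper: the stale node is gone after two failed attempts.
        int[] staleChecksLeft = {2};
        BooleanSupplier tryRegister = () -> staleChecksLeft[0]-- <= 0;
        System.out.println(registerWithRetry(tryRegister, 5, 10L)); // prints 3
    }
}
```

A real callback would attempt to create the ephemeral node and return false on a node-exists conflict. Whether retrying is safe also depends on confirming that the existing node really belongs to the broker's own expired session.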