Kafka >> mail # user >> Re: Kafka broker problem


Re: Kafka broker problem
Bob,

It seems you are probably reaching the limit of open files on that box. In
Kafka 0.8, we keep the file handles for all segment files open until they
are garbage collected. Depending on the size of your cluster, this number
can be pretty big, a few tens of thousands or so.

Thanks,
Neha
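One way to see how close a broker JVM is to its descriptor limit is the HotSpot-specific `com.sun.management.UnixOperatingSystemMXBean`, which exposes the open and maximum file descriptor counts for the current process. A minimal sketch (the class name `FdCheck` and the 80% warning threshold are illustrative, not from the thread):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdCheck {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // On Unix-like platforms with a HotSpot JVM, the bean can be cast to
        // com.sun.management.UnixOperatingSystemMXBean for FD counts.
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            long open = unix.getOpenFileDescriptorCount();
            long max = unix.getMaxFileDescriptorCount();
            System.out.println("open fds = " + open + ", limit = " + max);
            // Arbitrary threshold: flag when more than 80% of the limit is used.
            if (open > max * 0.8) {
                System.out.println("WARNING: nearing the open-file limit");
            }
        } else {
            System.out.println("FD counts not available on this platform");
        }
    }
}
```

The per-process limit itself comes from the shell's `ulimit -n` (RLIMIT_NOFILE), which is why the same code can fail on one CentOS box but run fine on another host with a higher limit.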
On Thu, Mar 7, 2013 at 3:57 PM, Bob Jervis <[EMAIL PROTECTED]> wrote:

> We have a test cluster running 0.8 that is not behaving properly.  It is
> almost continuously spewing the following exception into its log:
>
>
> 2013-03-07 23:44:17,532 ERROR kafka.network.Processor: Closing socket for /10.10.2.123 because of error
> java.io.IOException: Resource temporarily unavailable
>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>         at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
>         at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
>         at kafka.log.FileMessageSet.writeTo(FileMessageSet.scala:133)
>         at kafka.api.PartitionDataSend.writeTo(FetchResponse.scala:73)
>         at kafka.network.MultiSend.writeTo(Transmission.scala:94)
>         at kafka.network.Send$class.writeCompletely(Transmission.scala:75)
>         at kafka.network.MultiSend.writeCompletely(Transmission.scala:87)
>         at kafka.api.TopicDataSend.writeTo(FetchResponse.scala:128)
>         at kafka.network.MultiSend.writeTo(Transmission.scala:94)
>         at kafka.network.Send$class.writeCompletely(Transmission.scala:75)
>         at kafka.network.MultiSend.writeCompletely(Transmission.scala:87)
>         at kafka.api.FetchResponseSend.writeTo(FetchResponse.scala:223)
>         at kafka.network.Processor.write(SocketServer.scala:318)
>         at kafka.network.Processor.run(SocketServer.scala:211)
>         at java.lang.Thread.run(Thread.java:619)
>
> And our consumer is reporting the following:
>
>
> 2013-03-07 23:46:09,736 INFO kafka.consumer.SimpleConsumer: Reconnect due to socket error:
> java.io.EOFException: Received -1 when reading from channel, socket has likely been closed.
>         at kafka.utils.Utils$.read(Utils.scala:373)
>         at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:67)
>         at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
>         at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
>         at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
>         at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:124)
>         at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:122)
>         at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:161)
>         at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:161)
>         at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:161)
>         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>         at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:160)
>         at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:160)
>         at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:160)
>         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>         at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:159)
>         at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:93)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:50)
> 2013-03-07 23:46:09,740 INFO kafka.consumer.ConsumerFetcherManager: [ConsumerFetcherManager-1362697806347] removing fetcher on topic VTFull-enriched, partition 0
>
> We have several other environments running the same code without error.
>
> The server issuing these log errors is running CentOS, though we have
> both Ubuntu and CentOS environments that work.

Jun Rao 2013-03-08, 06:13