2013-09-24 17:21:47,513 - [kafka-acceptor:Acceptor@153] - Error in acceptor java.io.IOException: Too many open
The obvious fix is to bump up the number of open files but I'm wondering if there is a leak on the Kafka side and/or our application side. We currently have the ulimit set to a generous 4096 but obviously we are hitting this ceiling. What's a recommended value?
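Before raising the limit, it can help to see how close a process actually is to it. A minimal Python sketch, assuming a Linux host with /proc mounted; the helper name fd_usage is made up for illustration and is not part of Kafka:

```python
import os
import resource


def fd_usage(pid="self"):
    """Return (open_fds, soft_limit, hard_limit).

    open_fds is counted from /proc/<pid>/fd (Linux only, and reading another
    user's process needs privileges). The limits are those of the process
    running this script, not of <pid>.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    fd_dir = f"/proc/{pid}/fd"
    # Each entry in /proc/<pid>/fd is one open descriptor (files, sockets, pipes).
    open_fds = len(os.listdir(fd_dir)) if os.path.isdir(fd_dir) else None
    return open_fds, soft, hard


if __name__ == "__main__":
    open_fds, soft, hard = fd_usage()
    print(f"open descriptors: {open_fds}, soft limit: {soft}, hard limit: {hard}")
```

Passing the broker's PID instead of "self" counts the broker's descriptors; watching that number over time is one way to tell a steady load from a leak.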
We are running Rails, and our Unicorn workers connect to our Kafka cluster via round-robin load balancing. We have about 1500 workers, so that would be 1500 connections right there, but they should be split across our 3 nodes. Instead, netstat shows thousands of connections that look like this:
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:22503 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:48398 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.2:29617 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:32444 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:34415 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:56901 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.2:45349 ESTABLISHED
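With 1500 workers spread evenly over 3 nodes, roughly 500 connections per broker would be expected. To compare that with what a broker actually sees, the netstat output can be grouped by remote address. A rough Python sketch, assuming the six-column layout shown above; the function name established_by_client is made up for illustration:

```python
from collections import Counter

# The seven sample lines quoted in this message.
SAMPLE = """\
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:22503 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:48398 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.2:29617 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:32444 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:34415 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.1:56901 ESTABLISHED
tcp 0 0 kafka1.mycompany.:XmlIpcRegSvc ::ffff:10.99.99.2:45349 ESTABLISHED
"""


def established_by_client(netstat_output):
    """Count ESTABLISHED connections per remote IP address."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state.
        if len(fields) < 6 or fields[5] != "ESTABLISHED":
            continue
        # Split "::ffff:10.99.99.1:22503" into host and port at the last colon.
        host, _, _port = fields[4].rpartition(":")
        # Drop the IPv4-mapped IPv6 prefix so the plain IPv4 address remains.
        if host.startswith("::ffff:"):
            host = host[len("::ffff:"):]
        counts[host] += 1
    return counts


if __name__ == "__main__":
    for host, n in established_by_client(SAMPLE).most_common():
        print(host, n)
```

On the sample above this prints 10.99.99.1 with 5 connections and 10.99.99.2 with 2. Run against the full netstat output, the same grouping shows which remote addresses hold the bulk of the connections.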
Has anyone come across this problem before? Is this a 0.7.2 leak, LB misconfiguration… ?
We haven't seen any socket leaks with the java producer. If you have lots of unexplained socket connections in established mode, one possible cause is that the client created new producer instances, but didn't close the old ones.
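The "created new producer instances, but didn't close the old ones" pattern can be illustrated with a small Python sketch. SketchProducer is a hypothetical stand-in that holds a single TCP connection; it is not the Kafka producer API or any real client library, and the point is only that close() runs even when sending fails:

```python
import socket
from contextlib import closing


class SketchProducer:
    """Hypothetical stand-in for a producer: owns exactly one TCP connection."""

    def __init__(self, host, port):
        self.sock = socket.create_connection((host, port))

    def send(self, payload):
        self.sock.sendall(payload)

    def close(self):
        # Releasing the socket is what frees the descriptor on both ends.
        self.sock.close()


def send_batch(host, port, messages):
    """Send messages over one connection and always close it afterwards."""
    # closing() calls producer.close() on exit, including on exceptions,
    # so an abandoned instance never leaves an ESTABLISHED socket behind.
    with closing(SketchProducer(host, port)) as producer:
        for message in messages:
            producer.send(message)
    return producer
```

In a long-lived worker the alternative is to create one producer per process and reuse it, closing it when the worker shuts down.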
Jun

On Wed, Sep 25, 2013 at 6:08 AM, Mark <[EMAIL PROTECTED]> wrote:
No, this is all within the same DC. I think the problem has to do with the LB. We've changed our producers to point directly to a node for testing, and after running it all night I don't see any more connections than there are supposed to be.
Can I ask which LB you are using? We are using A10s.
On Sep 26, 2013, at 6:41 PM, Nicolas Berthet <[EMAIL PROTECTED]> wrote:
We did run into a similar issue here (lots of ESTABLISHED connections on the brokers, but none on the consumers/producers). In our case, it was a firewall issue: connections that were inactive for more than a certain time were silently dropped by the firewall (no TCP RST was sent), and only one side of the connection noticed the drop.
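One general mitigation for silently dropped idle connections (not something described in this thread) is TCP keepalive, which makes the kernel probe idle connections so that a dead peer is eventually detected. A minimal Python sketch, assuming a plain TCP socket; the helper name enable_keepalive and the timing values are illustrative, and the TCP_KEEP* options are only applied on platforms that expose them (Linux does):

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keepalive; timing values are illustrative assumptions."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning options are platform-specific, so guard them.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between unanswered probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes before the connection is declared dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock
```

For this to help, the idle time has to be shorter than the firewall's or load balancer's idle timeout, and whether a given client or broker exposes such a setting depends on the software in use.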
Maybe that helps
Flo

On 2013-09-26 9:41 PM, Nicolas Berthet wrote: