Hi, the mailing list was moved to [EMAIL PROTECTED] a long time ago. Please resubscribe to the new list; this one may be removed in the future.

At 2013-03-28 15:39:07, "Jason Rosenberg" <[EMAIL PROTECTED]> wrote:
How do you know the server had written the lost data to its log? In Kafka 0.7, data can be lost from the producer's or the server's socket buffer. You can verify this by running DumpLogSegments before and after shutdown.
I enabled TRACE logging on the producer, and verified that it successfully wrote out the bytes to the server (but this was after the last log flush on the server, where trace logging was also enabled).
I'm quite certain that the server is receiving the lost data, and then shutting down, before flushing the last few log lines received.
It would seem an easy fix to update KafkaServer.shutdown to close all sockets, do one final log flush that doesn't wait for the flush interval, and then shut down.
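The shutdown fix suggested above hinges on one property of buffered writes: bytes handed to a buffered writer are invisible to everyone else until something forces a flush. A minimal sketch of that (plain Python file I/O, not Kafka's actual Scala code; the filename is made up for illustration):

```python
import os
import tempfile

# Sketch: data written through a buffered writer is invisible on disk until
# it is flushed. A shutdown path that closes sockets first and then does one
# unconditional flush -- instead of waiting for the scheduled flush interval --
# avoids losing this tail of data.
path = os.path.join(tempfile.mkdtemp(), "segment.log")  # hypothetical name
f = open(path, "w")             # buffered by default
f.write("last message before shutdown")

# Before the flush, another reader sees nothing: the bytes sit in the buffer.
before = open(path).read()

f.flush()                       # the "final flush" on shutdown
os.fsync(f.fileno())            # push it past the OS page cache as well
after = open(path).read()
f.close()

print(repr(before))   # ''
print(repr(after))    # 'last message before shutdown'
```

The same ordering argument applies to the server: close the sockets so no new data arrives, then flush unconditionally, then exit.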
On Thu, Mar 28, 2013 at 6:45 AM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
I see now that it does in fact close all logs during LogManager.close(), which deeper in the code flushes the log segments, although it doesn't do so as explicitly as LogManager.flushAllLogs does during the normally scheduled flush interval.
The confusing thing is that I clearly see my message being sent from the producer, e.g.:
Here's the shutdown sequence on the server, which appears to complete before the last message was sent above:
2013-03-28 10:40:38,660 INFO [Thread-27] server.KafkaServer - Shutting down Kafka server
2013-03-28 10:40:38,661 INFO [Thread-27] utils.KafkaScheduler - shutdown scheduler kafka-logcleaner-
2013-03-28 10:40:38,670 INFO [Thread-27] utils.KafkaScheduler - shutdown scheduler kafka-logflusher-
2013-03-28 10:40:38,673 DEBUG [Thread-27] message.FileMessageSet - flush time 2
2013-03-28 10:40:38,673 DEBUG [Thread-27] message.FileMessageSet - flush high water mark:227489
2013-03-28 10:40:38,674 DEBUG [Thread-27] message.FileMessageSet - flush time 0
2013-03-28 10:40:38,674 DEBUG [Thread-27] message.FileMessageSet - flush high water mark:37271451
...
...
2013-03-28 10:40:38,683 INFO [Thread-27] server.KafkaServer - Kafka server shut down completed
So, I am now guessing that the "499 bytes written" message was going to a buffered socket channel, which accepted the bytes but didn't actually send them out at the OS level. No exception is thrown, so I am wondering whether this socket is ever flushed, or whether it just quietly fails to send its data. Thoughts?

Even if I switched to using a broker list (instead of ZooKeeper for broker discovery), it seems that if no exception is thrown on a send to a closed broker, there's no way to manage retries, etc.

Looking deeper, it appears that after I call send on my async producer, there are log messages from ZooKeeper indicating the broker list has changed, but it's too late for the async producer to reroute the message:
2013-03-28 10:40:38,440 DEBUG [...] producer.Producer - Sending message to broker ....
2013-03-28 10:40:38,440 DEBUG [...] producer.ProducerPool - Fetching async producer for broker id: ...
2013-03-28 10:40:38,440 DEBUG [...] producer.ProducerPool - Sending message
<broker shuts down>
2013-03-28 10:40:38,694 DEBUG [ZkClient-EventThread-35-localhost:26101] producer.ZKBrokerPartitionInfo$BrokerTopicsListener - [BrokerTopicsListener] List of brokers changed at /brokers/topics/...
...
...
2013-03-28 10:40:39,089 DEBUG [ProducerSendThread-1585330284] async.ProducerSendThread - 1001 ms elapsed. Queue time reached. Sending..
2013-03-28 10:40:39,089 DEBUG [ProducerSendThread-1585330284] async.ProducerSendThread - Handling 1 events
2013-03-28 10:40:39,089 TRACE [ProducerSendThread-1585330284] async.DefaultEventHandler - Handling event for Topic: ...
2013-03-28 10:40:39,090 TRACE [ProducerSendThread-1585330284] async.DefaultEventHandler - Sending 1 messages with no compression to topic ...
<send still goes to previously shut down broker>
2013-03-28 10:40:39,090 TRACE [ProducerSendThread-1585330284] network.BoundedByteBufferSend - 499 bytes written.
> ...buffered socket channel, that received the bytes, but didn't actually send them out at the os level. No exception is thrown, so I am wondering whether this socket is ever flushed, or if it just quietly fails to send its data. Thoughts?
That's correct. When the producer sends data, it first enters the OS socket buffer on the producer. Once flushed, it is buffered in the server's socket buffer before being read by the server and appended to the log. On producer or server shutdown, messages can be lost because they are never flushed out of the kernel's socket buffers. In 0.7 you will not notice any error. In 0.8, the producer waits to read the response from the server and will either time out or throw a broken pipe exception; in that case, you can choose to retry.
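The key point above is that a successful send() only means the bytes reached a kernel buffer, not that the peer application ever read them. A small sketch (plain Python sockets, not the Kafka producer itself) of how that silent loss looks from the sender's side:

```python
import socket

# Sketch: the "producer" writes into a connected socket pair; the "server"
# side never reads and then closes. send() still reports success, and no
# error is raised on the producer side -- the 0.7 behavior described above.
producer, server = socket.socketpair()

sent = producer.send(b"499 bytes, allegedly")
# send() returned a positive byte count, but the data only reached a
# kernel buffer; the server application never saw it.

server.close()      # server shuts down without ever reading
producer.close()    # the buffered bytes are simply dropped, silently

print("send() returned", sent, "- no exception, nothing delivered")
```

The 0.8 fix is exactly the missing feedback loop: the producer blocks on a response from the broker, so a dead broker surfaces as a timeout or broken pipe instead of a silently "successful" send.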
I think you are referring to yet another issue that only affects the 0.7 ZooKeeper-based producer. If you shut down any broker, messages in the async producer's queue are lost. This is because the producer load-balances and picks a target broker when the message enters the queue, so even if it detects that the broker is gone, it cannot change the target broker for messages already in the queue.
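The enqueue-time binding can be sketched in a few lines (hypothetical names, not Kafka's actual API): once the target broker is stamped onto the queued message, a later membership change cannot help it, whereas deferring the choice to send time uses the current live set.

```python
import queue

# Sketch of the 0.7 problem: the async producer picks the target broker
# when a message is ENQUEUED, so a broker that dies while the message waits
# still receives the doomed send attempt.
live_brokers = {"broker-1", "broker-2"}   # hypothetical broker ids

q = queue.Queue()
q.put(("broker-1", "msg-a"))              # target chosen at enqueue time

live_brokers.discard("broker-1")          # broker-1 shuts down while msg-a waits

target, msg = q.get()                     # dequeue: the target is now stale
stale = target not in live_brokers
print("sending", msg, "to", target, "- stale target:", stale)

# Deferring broker selection to send time avoids the problem: the choice
# is made against the CURRENT live set.
q2 = queue.Queue()
q2.put("msg-b")                           # no target bound yet
target2 = sorted(live_brokers)[0]         # chosen only at send time
print("sending msg-b to", target2)
```

This is only an illustration of the ordering issue; the real 0.8 producer also adds acknowledgments and retries on top of late broker selection.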
Both of these issues are fixed in 0.8.
On Thu, Mar 28, 2013 at 11:44 AM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:
I'll look forward to 0.8 (maybe start playing with the latest version next week).
On Thu, Mar 28, 2013 at 2:18 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote: