Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Kafka, mail # user - log file flushing, flushall on clean shutdown...


+
Jason Rosenberg 2013-03-28, 07:39
+
SuoNayi 2013-03-28, 08:47
+
Neha Narkhede 2013-03-28, 13:45
+
Jason Rosenberg 2013-03-28, 15:12
+
Jun Rao 2013-03-28, 15:31
+
Jason Rosenberg 2013-03-28, 18:45
Copy link to this message
-
Re: log file flushing, flushall on clean shutdown...
Neha Narkhede 2013-03-28, 21:19
>> So, I am guessing now that the 499 bytes written message was going to a
buffered socket channel, that received the bytes, but didn't actually send
them out at the os level.  No exception is thrown, so I am wondering
whether this socket is ever flushed, or if it just quietly fails to send
its data.  Thoughts?

That's correct. When the producer sends some data, it enters the os
socket buffer on the producer. Once it is flushed, it is then buffered
in the server's socket buffer before being read by the server and
appended to the log. On producer or server shutdown, messages can be
lost since they are not flushed from the kernel's socket buffer. In
0.7, you will not notice any error. In 0.8, your producer will wait to
read the response from the server and either timeout or throw a broken
pipe exception. In this case, you can choose to retry.

I think you are referring to yet another issue that only happens on
the 0.7 zookeeper based producer. If you shutdown any broker, messages
in the async producer's queue are lost. This is because, it load
balances and picks a target broker when the message enters the queue.
So even if it can detect the broker is gone, it cannot change the
target broker for the messages in the producer's queue.

Both of these issues are fixed in 0.8.

Thanks,
Neha

On Thu, Mar 28, 2013 at 11:44 AM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:
> Ok, sorry,
>
> I see now that in fact, it does close all logs during LogManager.close(),
> which deeper in the code flushes logSegments.  So it doesn't do so as
> explicitly as in LogManager.flushAllLogs during the normally scheduled
> flush interval.
>
> The confusing thing, is that I clearly see my message being sent from the
> producer, e.g.:
>
> 2013-03-28 10:40:39,090 TRACE [ProducerSendThread-1585330284]
> network.BoundedByteBufferSend - 499 bytes written.
>
> Here's the shutdown sequence on the server, which appears to complete
> before the last message was sent above:
>
> 2013-03-28 10:40:38,660  INFO [Thread-27] server.KafkaServer - Shutting
> down Kafka server
> 2013-03-28 10:40:38,661  INFO [Thread-27] utils.KafkaScheduler - shutdown
> scheduler kafka-logcleaner-
> 2013-03-28 10:40:38,670  INFO [Thread-27] utils.KafkaScheduler - shutdown
> scheduler kafka-logflusher-
> 2013-03-28 10:40:38,673 DEBUG [Thread-27] message.FileMessageSet - flush
> time 2
> 2013-03-28 10:40:38,673 DEBUG [Thread-27] message.FileMessageSet - flush
> high water mark:227489
> 2013-03-28 10:40:38,674 DEBUG [Thread-27] message.FileMessageSet - flush
> time 0
> 2013-03-28 10:40:38,674 DEBUG [Thread-27] message.FileMessageSet - flush
> high water mark:37271451
> ...
> ...
> 2013-03-28 10:40:38,683  INFO [Thread-27] server.KafkaServer - Kafka server
> shut down completed
>
> So, I am guessing now that the 499 bytes written message was going to a
> buffered socket channel, that received the bytes, but didn't actually send
> them out at the os level.  No exception is thrown, so I am wondering
> whether this socket is ever flushed, or if it just quietly fails to send
> its data.  Thoughts?  Even if I switched to use a broker list (instead of
> zk for broker discovery), it seems that if no exception is thrown on the
> send to a closed broker, there's no way to manage retries, etc.
>
>
> Looking deeper, it looks like after I call send on my async producer, there
> are log messages coming from zk indicating the broker list has changed, but
> it's too late for the async producer to reroute the message:
>
> 2013-03-28 10:40:38,440 DEBUG [...] producer.Producer - Sending message to
> broker ....
> 2013-03-28 10:40:38,440 DEBUG [...] producer.ProducerPool - Fetching async
> producer for broker id: ...
> 2013-03-28 10:40:38,440 DEBUG [...] producer.ProducerPool - Sending message
> <broker shuts down>
> 2013-03-28 10:40:38,694 DEBUG [ZkClient-EventThread-35-localhost:26101]
> producer.ZKBrokerPartitionInfo$BrokerTopicsListener -
> [BrokerTopicsListener] List of brokers changed at /brokers/topics/...
> ...

 
+
Jason Rosenberg 2013-03-29, 02:09