Kafka, mail # user - log file flushing, flushall on clean shutdown...


Thread:
  Jason Rosenberg 2013-03-28, 07:39
  SuoNayi 2013-03-28, 08:47
  Neha Narkhede 2013-03-28, 13:45
  Jason Rosenberg 2013-03-28, 15:12
  Jun Rao 2013-03-28, 15:31
  Jason Rosenberg 2013-03-28, 18:45
  Neha Narkhede 2013-03-28, 21:19
Re: log file flushing, flushall on clean shutdown...
Jason Rosenberg 2013-03-29, 02:09
Thanks Neha for the detailed comments.

I'll look forward to 0.8 (maybe start playing with the latest version next
week).

Jason

On Thu, Mar 28, 2013 at 2:18 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:

> >> So, I am guessing now that the 499 bytes written message was going to a
> buffered socket channel, that received the bytes, but didn't actually send
> them out at the os level.  No exception is thrown, so I am wondering
> whether this socket is ever flushed, or if it just quietly fails to send
> its data.  Thoughts?
>
> That's correct. When the producer sends some data, it enters the os
> socket buffer on the producer. Once it is flushed, it is then buffered
> in the server's socket buffer before being read by the server and
> appended to the log. On producer or server shutdown, messages can be
> lost since they are not flushed from the kernel's socket buffer. In
> 0.7, you will not notice any error. In 0.8, your producer will wait to
> read the response from the server and either timeout or throw a broken
> pipe exception. In this case, you can choose to retry.
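The 0.8 behavior described above (the producer waits for the broker's response and either times out or hits a broken pipe) is what makes retries possible on the application side. A minimal sketch of such a retry loop, using a hypothetical `send` callable rather than any real Kafka producer API:

```python
import time

def send_with_retries(send, message, max_retries=3, backoff_s=0.1):
    """Retry a synchronous send that raises on timeout or broken pipe.

    `send` is a hypothetical callable standing in for an 0.8-style
    producer send; a real client has its own API and retry settings.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return send(message)  # blocks until acked, or raises on failure
        except (TimeoutError, BrokenPipeError):
            if attempt == max_retries:
                raise  # out of attempts; surface the error to the caller
            time.sleep(backoff_s * attempt)  # simple linear backoff
```

With a send that fails twice and then succeeds, the message is delivered on the third attempt; in 0.7, by contrast, there is no response to wait on, so a loop like this has nothing to catch.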
>
> I think you are referring to yet another issue that only happens on
> the 0.7 zookeeper based producer. If you shutdown any broker, messages
> in the async producer's queue are lost. This is because it load
> balances and picks a target broker when the message enters the queue.
> So even if it can detect the broker is gone, it cannot change the
> target broker for the messages in the producer's queue.
>
> Both of these issues are fixed in 0.8.
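The queue-routing failure mode of the 0.7 zookeeper-based async producer can be simulated in a few lines. The broker names and the `drain` function here are invented for illustration, not taken from the Kafka codebase:

```python
from collections import deque

def drain(queue, live_brokers):
    """Drain a queue of (broker, message) pairs.

    Mimics the 0.7 async producer: the target broker was fixed when each
    message entered the queue, so messages bound for a broker that has
    since shut down cannot be re-routed and are simply lost.
    """
    delivered, lost = [], []
    while queue:
        broker, msg = queue.popleft()
        (delivered if broker in live_brokers else lost).append(msg)
    return delivered, lost

queue = deque([("broker-1", "m1"), ("broker-2", "m2"), ("broker-1", "m3")])
# broker-2 shuts down after the messages were already enqueued:
delivered, lost = drain(queue, live_brokers={"broker-1"})
```

Here `m2` is lost even though broker-1 is still perfectly reachable, which is exactly why routing has to happen at dequeue time (as in 0.8) to survive a broker shutdown.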
>
> Thanks,
> Neha
>
> On Thu, Mar 28, 2013 at 11:44 AM, Jason Rosenberg <[EMAIL PROTECTED]>
> wrote:
> > Ok, sorry,
> >
> > I see now that in fact it does close all logs during LogManager.close(),
> > which deeper in the code flushes logSegments.  So it just doesn't do so
> > as explicitly as LogManager.flushAllLogs does during the normally
> > scheduled flush interval.
> >
> > The confusing thing is that I clearly see my message being sent from the
> > producer, e.g.:
> >
> > 2013-03-28 10:40:39,090 TRACE [ProducerSendThread-1585330284]
> > network.BoundedByteBufferSend - 499 bytes written.
> >
> > Here's the shutdown sequence on the server, which appears to complete
> > before the last message was sent above:
> >
> > 2013-03-28 10:40:38,660  INFO [Thread-27] server.KafkaServer - Shutting
> > down Kafka server
> > 2013-03-28 10:40:38,661  INFO [Thread-27] utils.KafkaScheduler - shutdown
> > scheduler kafka-logcleaner-
> > 2013-03-28 10:40:38,670  INFO [Thread-27] utils.KafkaScheduler - shutdown
> > scheduler kafka-logflusher-
> > 2013-03-28 10:40:38,673 DEBUG [Thread-27] message.FileMessageSet - flush
> > time 2
> > 2013-03-28 10:40:38,673 DEBUG [Thread-27] message.FileMessageSet - flush
> > high water mark:227489
> > 2013-03-28 10:40:38,674 DEBUG [Thread-27] message.FileMessageSet - flush
> > time 0
> > 2013-03-28 10:40:38,674 DEBUG [Thread-27] message.FileMessageSet - flush
> > high water mark:37271451
> > ...
> > ...
> > 2013-03-28 10:40:38,683  INFO [Thread-27] server.KafkaServer - Kafka
> > server shut down completed
> >
> > So, I am guessing now that the 499 bytes written message was going to a
> > buffered socket channel, which received the bytes but didn't actually
> > send them out at the os level.  No exception is thrown, so I am wondering
> > whether this socket is ever flushed, or if it just quietly fails to send
> > its data.  Thoughts?  Even if I switched to use a broker list (instead of
> > zk for broker discovery), it seems that if no exception is thrown on the
> > send to a closed broker, there's no way to manage retries, etc.
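The guess above matches how TCP behaves: a write that reports N bytes written only means the bytes reached the local kernel buffer, not that the peer ever read them. A small loopback sketch (the 499-byte payload just mirrors the log line; this uses plain sockets, not Kafka code) shows a send "succeeding" even though the peer has already closed:

```python
import socket

def send_after_peer_close(payload=b"x" * 499):
    """Show that send() returning a byte count only means the bytes
    entered the local kernel buffer, not that the peer processed them."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    cli = socket.socket()
    cli.connect(srv.getsockname())
    conn, _ = srv.accept()
    conn.close()  # the "broker" side goes away before we write
    srv.close()
    sent = cli.send(payload)  # no exception: bytes land in the kernel buffer
    cli.close()
    return sent
```

The first send after the peer's close typically succeeds like this; only a later write on the same socket would raise a broken-pipe error once the reset comes back, which is why a single fire-and-forget send gives the producer no signal at all.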
> >
> >
> > Looking deeper, it looks like after I call send on my async producer,
> > there are log messages coming from zk indicating the broker list has
> > changed, but it's too late for the async producer to reroute the message:
> >
> > 2013-03-28 10:40:38,440 DEBUG [...] producer.Producer - Sending message
> > to broker ....
> > 2013-03-28 10:40:38,440 DEBUG [...] producer.ProducerPool - Fetching