Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Kafka, mail # user - Kafka broker not respecting log.roll.hours?


+
Dan Frankowski 2013-04-25, 19:45
+
Jun Rao 2013-04-26, 04:49
+
Dan Frankowski 2013-04-26, 06:13
+
Jun Rao 2013-04-26, 14:40
+
Dan Frankowski 2013-04-26, 15:20
+
Jason Rosenberg 2013-04-26, 16:52
Copy link to this message
-
Re: Kafka broker not respecting log.roll.hours?
Adam Talaat 2013-04-26, 17:33
I don't know how Kafka's rollover algorithm is implemented, but this is
common behavior for other logging frameworks. You would need a separate
watcher/scheduled thread to rollover a log file, even if no events were
coming in. Logback (and probably log4j, by the same author) dispenses with
the watcher thread. Instead, it checks each message as it comes in and
decides whether the message should belong in a new file. If it should, a
rollover of the old file is triggered and the message is deposited in the
new file. But no rollover will occur until a message that belongs in a new
file arrives.

Cheers,
Adam

On Fri, Apr 26, 2013 at 9:52 AM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:

> By the way, is there a reason why 'log.roll.hours' is not documented on the
> apache configuration page:  http://kafka.apache.org/configuration.html ?
>
> It's possible to find this setting (and several other undocumented
> settings) by looking at the source code.  I'm just not sure why the
> complete set of options is not documented on the site (is it meant to be
> experimental?).
>
> Jason
>
>
> On Fri, Apr 26, 2013 at 8:19 AM, Dan Frankowski <[EMAIL PROTECTED]>
> wrote:
>
> > https://issues.apache.org/jira/browse/KAFKA-881
> >
> > Thanks.
> >
> >
> > On Fri, Apr 26, 2013 at 7:40 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
> >
> > > Yes, for low volume topic, the time-based rolling can be imprecise.
> Could
> > > you file a jira and describe your suggestions there? Ideally, we should
> > set
> > > firstAppendTime to the file creation time. However, it doesn't seem you
> > can
> > > get the creation time in java.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
> > > On Thu, Apr 25, 2013 at 11:12 PM, Dan Frankowski <[EMAIL PROTECTED]>
> > > wrote:
> > >
> > > > We have high-volume topics and low-volume topics. The problem occurs
> > more
> > > > often for low-volume topics to be sure.
> > > >
> > > > But if my hypothesis is correct about why it is happening, here is a
> > case
> > > > where rolling is longer than an hour, even on a high volume topic:
> > > >
> > > > - write to a topic for 20 minutes
> > > > - restart the broker
> > > > - wait for 5 days
> > > > - write to a topic for 20 minutes
> > > > - restart the broker
> > > > - write to a topic for an hour
> > > >
> > > > The rollover time was now 5 days, 1 hour, 40 minutes. You can make it
> > as
> > > > long as you want. Does this make sense?
> > > >
> > > > We would like the rollover time to be no more than an hour, even if
> the
> > > > broker is restarted, or the topic is low-volume.
> > > >
> > > > The cleanest way to do that might be to roll over on the hour no
> matter
> > > > when the file was started. That would be too fast sometimes, but
> that's
> > > > fine. A second way would be to embed the first append time in the
> file
> > > > name. A third way (not perfect, but an approximation at least) would
> be
> > > to
> > > > not to write to a segment if firstAppendTime is not defined and the
> > > > timestamp on the file is more than an hour old. There are probably
> > other
> > > > ways.
> > > >
> > > > What say you?
> > > >
> > > >
> > > > On Thu, Apr 25, 2013 at 9:49 PM, Jun Rao <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > That logic in 0.7.2 seems correct. Basically, firstAppendTime is
> set
> > on
> > > > > first append to a log segment. Then, later on, when a new message
> is
> > > > > appended and the elapsed time since firstAppendTime is larger than
> > the
> > > > roll
> > > > > time, a new segment is rolled. Is your data constantly being
> > produced?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > >
> > > > > On Thu, Apr 25, 2013 at 12:44 PM, Dan Frankowski <
> [EMAIL PROTECTED]
> > >
> > > > > wrote:
> > > > >
> > > > > > We are running Kafka 0.7.2. We set log.roll.hours=1. I hoped that
> > > meant
> > > > > > logs would be rolled every hour, or more. Only, sometimes logs
> that
> > > are
> > > > > > many hours (sometimes days) old have more data added to them.

 
+
Dan Frankowski 2013-04-27, 21:37
+
Dan Frankowski 2013-05-02, 21:23
+
Jun Rao 2013-05-03, 15:58