Kafka >> mail # user >> race condition with log flush interval settings...


Re: race condition with log flush interval settings...
Jason,

Could you file a jira so that we can track it?

Thanks,

Jun

On Thu, Mar 28, 2013 at 12:43 AM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:

> It looks like there is a race condition between the settings for the 2
> properties:  log.default.flush.scheduler.interval.ms &
> log.default.flush.interval.ms.  I'm using 0.7.2.
>
> By default, both of these are set to 3000 ms, and the docs recommend
> setting flushInterval to a multiple of the flushSchedulerInterval.
>
> However, the code in LogManager.flushAllLogs (which is scheduled to
> run at a fixed rate using the flushSchedulerInterval property) looks
> like this:
>
>         val timeSinceLastFlush = System.currentTimeMillis - log.getLastFlushedTime
>         var logFlushInterval = config.defaultFlushIntervalMs
>         ....
>         ....
>         if(timeSinceLastFlush >= logFlushInterval)
>           log.flush
>
> So, it will only flush a log if the time since the last flush is at
> least the flush interval.  But log.lastFlushedTime is not set until
> after flushing has completed (which can incur some I/O time).  By
> enabling TRACE logging for this method, I was able to see that with
> the defaults, timeSinceLastFlush was usually about 2998 (which is
> less than the logFlushInterval of 3000), so the flush is skipped on
> that invocation.  Thus, setting flushInterval equal to
> flushSchedulerInterval essentially devolves to an effective
> flushInterval of 2x the flushSchedulerInterval.
>
> So, setting a flushInterval slightly less than the
> flushSchedulerInterval (e.g. 2500) will guarantee that the flush will
> happen on each scheduler invocation.
>
> I'm guessing that it might make sense to change the logic gating the
> flush to something like:
>
>       if(timeSinceLastFlush >= 0.90 * logFlushInterval)
>
> Also, the scheduler probably ought to use a 'fixedDelay' rather than
> a 'fixedRate' schedule.
>
> Jason
>
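
The skipped-flush behavior Jason describes can be reproduced with a
small simulation. This is only a sketch, not Kafka's LogManager code:
it uses a virtual clock instead of a real scheduler, and the names
flushTimes, flushCostMs and gateFactor are hypothetical. The 2 ms flush
cost is an assumption chosen to match the reported timeSinceLastFlush
of about 2998, and the model assumes a flush always finishes well
before the next tick.

```scala
object FlushGateSketch {
  // Simulates a fixed-rate scheduler firing every schedulerIntervalMs and
  // returns the (virtual) times at which a flush was actually started.
  // gateFactor = 1.0 mirrors the quoted check; 0.90 mirrors the proposed one.
  def flushTimes(schedulerIntervalMs: Long,
                 flushIntervalMs: Long,
                 flushCostMs: Long,
                 ticks: Int,
                 gateFactor: Double = 1.0): Seq[Long] = {
    var lastFlushed = 0L
    val started = Seq.newBuilder[Long]
    for (i <- 1 to ticks) {
      val now = i * schedulerIntervalMs          // fixed-rate tick time
      val timeSinceLastFlush = now - lastFlushed
      if (timeSinceLastFlush >= gateFactor * flushIntervalMs) {
        started += now
        // The timestamp is recorded only after the flush has completed.
        lastFlushed = now + flushCostMs
      }
    }
    started.result()
  }

  def main(args: Array[String]): Unit = {
    // Defaults: flush interval equals scheduler interval, every other tick is skipped.
    println(flushTimes(3000, 3000, 2, 6))
    // Workaround: flush interval slightly below the scheduler interval.
    println(flushTimes(3000, 2500, 2, 6))
    // Proposed gate: 0.90 * logFlushInterval.
    println(flushTimes(3000, 3000, 2, 6, gateFactor = 0.90))
  }
}
```

Under these assumptions the first call flushes at 3000, 9000 and 15000
(every second tick), while the other two flush on all six ticks.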

 