Kafka, mail # user - large amount of disk space freed on restart


Re: large amount of disk space freed on restart
Jun Rao 2013-05-24, 03:56
I haven't seen this issue before. We do have ~1K topics in one of the Kafka
clusters at LinkedIn.

Thanks,

Jun
On Thu, May 23, 2013 at 11:05 AM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:

> Yeah, that's what it looks like to me (looking at the code).  So, I'm
> guessing it's some OS-level caching or resource recycling.  Have you ever
> heard of this happening?  One thing that might be different in my usage
> from the norm is a relatively large number of topics (e.g. ~2K topics).
>
> Jason
>
>
> On Thu, May 23, 2013 at 7:14 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
>
> > Jason,
> >
> > Kafka closes the file handles of all deleted files. Otherwise, the
> > broker would quickly run out of file handles.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Wed, May 22, 2013 at 10:17 PM, Jason Rosenberg <[EMAIL PROTECTED]>
> > wrote:
> >
> > > So, does this indicate Kafka (or the JVM itself) is not aggressively
> > > closing file handles of deleted files?  Is there a fix for this?  Or is
> > > there not likely anything to be done?  What happens if the disk fills
> > > up with file handles for phantom deleted files?
> > >
> > > Jason
> > >
> > >
> > > On Wed, May 22, 2013 at 9:50 PM, Jonathan Creasy <[EMAIL PROTECTED]> wrote:
> > >
> > > > It isn't uncommon: if a process has an open file handle on a file
> > > > that is deleted, the space is not freed until the handle is closed.
> > > > So restarting the process that has a handle on the file would also
> > > > cause the space to be freed.
> > > >
> > > > You can troubleshoot that with lsof.
> > > > Normally, I see 2-4 log segments deleted every hour in my brokers.
> > > > I see log lines like this:
> > > >
> > > > 2013-05-23 04:40:06,857  INFO [kafka-logcleaner-0] log.LogManager -
> > > > Deleting log segment 00000000035434043157.kafka from <redacted topic>
> > > >
> > > > However, it seems like if I restart the broker, a massive amount of
> > > > disk space is freed (without a corresponding flood of these "log
> > > > segment deleted" messages).  Is there an explanation for this?  Does
> > > > Kafka keep references to file segments around, and reuse them as
> > > > needed or something?  And then on restart, the references to those
> > > > free segment files are dropped?
> > > >
> > > > Thoughts?
> > > >
> > > > This is with 0.7.2.
> > > >
> > > > Jason
> > > >
> > >
> >
>
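
For anyone who wants to see the behaviour described in the thread (space for a deleted file is held until the last open handle is closed), here is a minimal Python sketch. It has nothing to do with Kafka's own code: the function name and the 4096-byte size are made up for illustration, and it assumes a POSIX system (on Windows, unlinking an open file normally fails).

```python
import os
import tempfile


def demo_deleted_but_open(size=4096):
    """Unlink a file while it is still open and report what the handle sees.

    Returns (path_still_exists, link_count, size_via_handle, bytes_readable).
    Hypothetical helper for illustration only; assumes POSIX semantics.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"x" * size)
        os.unlink(path)                 # removes the directory entry only
        st = os.fstat(fd)               # the inode is still reachable via fd
        os.lseek(fd, 0, os.SEEK_SET)
        readable = len(os.read(fd, size))
        return os.path.exists(path), st.st_nlink, st.st_size, readable
    finally:
        # Only after the last handle is closed can the filesystem
        # reclaim the blocks (which is also what a process restart does).
        os.close(fd)


if __name__ == "__main__":
    print(demo_deleted_but_open())
```

On a typical local Linux filesystem the reported link count is 0 while the data is still fully readable. To look for this state in a running broker, `lsof +L1` lists open files whose link count is below 1; on Linux, entries under /proc/<pid>/fd that point at a path ending in "(deleted)" show the same thing.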