Kafka, mail # user - large amount of disk space freed on restart


Re: large amount of disk space freed on restart
Jay Kreps 2013-07-26, 21:03
Cool, good to know.
On Fri, Jul 26, 2013 at 2:00 PM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:

> Jay,
>
> My only experience so far with this is using XFS.  It appears the XFS
> behavior is evolving, and in fact, we see somewhat different behavior from
> 2 of our CentOS kernel versions in use.  I've been trying to ask questions
> about all this on the XFS.org mailing list, but so far I'm not having much
> luck understanding how the xfs versions correlate to CentOS versions.
>
> Anyway, yes, I think it would definitely be worth trying the solution you
> suggest, which would be to close the file on rotation, and re-open
> read-only.  Or to close files after a few hours of not being accessed.   If
> a patch for one of these approaches can be cobbled together, I'd love to
> test it out on our staging environment.  I'd be willing to experiment with
> such a patch myself, although I'm not 100% sure of all the places to look (but
> might dive in).
>
> XFS appears to have the option of using dynamic, speculative preallocation, in
> which case it progressively doubles the amount of space reserved for a
> file, as the file grows.  It does do this for all open files.  If the file
> is closed, it will then release the preallocated space not in use.  It's
> not clear whether this releasing of space happens immediately on close, and
> whether re-opening the file read-only immediately, will keep it from
> releasing space (still trying to gather more info on that).
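
One way to observe the behavior described above is to compare a file's apparent size with the space actually allocated to it. The following is a minimal sketch, not something from the thread: the helper names `disk_usage` and `preallocation_overhead` are hypothetical, and it assumes a Linux host where `st_blocks` is reported in 512-byte units.

```python
import os
import sys


def disk_usage(path):
    """Return (apparent_bytes, allocated_bytes) for a file."""
    st = os.stat(path)
    # Assumption: st_blocks counts 512-byte units, as it does on Linux.
    return st.st_size, st.st_blocks * 512


def preallocation_overhead(path):
    """Bytes allocated beyond the apparent size (0 for sparse or exact files)."""
    apparent, allocated = disk_usage(path)
    return max(0, allocated - apparent)


if __name__ == "__main__":
    # Usage: python usage.py /path/to/segment.log [more files...]
    for p in sys.argv[1:]:
        apparent, allocated = disk_usage(p)
        print(p, apparent, allocated, preallocation_overhead(p))
```

Running this against the segment files before and after a broker restart should show whether the allocated size drops back toward the apparent size.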
>
> I haven't looked too much at the index files, but those too appear to have
> this behavior (e.g. preallocated size is always on the order of double the
> actual size, until the app is restarted).
>
> Jason
>
>
> On Fri, Jul 26, 2013 at 12:46 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>
> > Interesting.
> >
> > Yes, Kafka keeps all log files open indefinitely. There is no inherent
> > reason this needs to be the case, though; it would be possible to LRU out
> > old file descriptors and close them if they are not accessed for a few
> > hours and then reopen on the first access. We just haven't implemented
> > anything like that.
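
The idea of closing descriptors that have been idle for a while and reopening them on first access can be sketched as follows. This is only an illustration in Python, not Kafka code: the class `IdleHandleCache` and its methods are hypothetical names, and the clock is injectable purely so the behavior can be exercised without waiting hours.

```python
import time
from collections import OrderedDict


class IdleHandleCache:
    """Keep read-only file handles open, closing those idle for too long."""

    def __init__(self, max_idle_seconds, clock=time.monotonic):
        self.max_idle_seconds = max_idle_seconds
        self._clock = clock
        # path -> (handle, last_access); least recently used entry comes first
        self._entries = OrderedDict()

    def get(self, path):
        """Return an open handle, reopening the file if it was evicted."""
        entry = self._entries.pop(path, None)
        handle = entry[0] if entry else open(path, "rb")
        # Re-inserting moves the entry to the most-recently-used end.
        self._entries[path] = (handle, self._clock())
        return handle

    def evict_idle(self):
        """Close handles not accessed within max_idle_seconds; return their paths."""
        now = self._clock()
        closed = []
        while self._entries:
            path, (handle, last_access) = next(iter(self._entries.items()))
            if now - last_access < self.max_idle_seconds:
                break  # everything after this entry was used more recently
            handle.close()
            del self._entries[path]
            closed.append(path)
        return closed

    def open_paths(self):
        return list(self._entries)
```

A background thread could call `evict_idle()` periodically. Whether closing the descriptor alone makes the filesystem release preallocated space is the open question in this thread, so this sketch only covers the descriptor bookkeeping.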
> >
> > It would be good to understand this a little better. Does xfs pre-allocate
> > space for all open files? Perhaps just closing the file on log roll and
> > opening it read-only would solve the issue? Is this at all related to the
> > use of sparse files for the indexes (i.e. RandomAccessFile.setLength(10MB)
> > when we create the index)? Does this affect other filesystems or just xfs?
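
The two mechanisms raised in these questions, a sparse index file created by setting its length up front and closing a segment on roll before reopening it read-only, can be sketched like this. It is a Python analogue rather than the Kafka implementation: `create_sparse_index` and `roll_segment` are hypothetical names, `truncate` stands in for `RandomAccessFile.setLength`, and 10 MB is assumed to mean 10 * 1024 * 1024 bytes.

```python
import os

# Assumption: "10MB" is taken as 10 MiB for this illustration.
INDEX_SIZE = 10 * 1024 * 1024


def create_sparse_index(path, size=INDEX_SIZE):
    """Create a file of the given length without writing any data to it."""
    with open(path, "wb") as f:
        # Extends the file to `size`; on filesystems with sparse-file support
        # the unwritten range typically occupies little or no disk space.
        f.truncate(size)


def roll_segment(active):
    """Close a writable segment file and reopen the same path read-only."""
    path = active.name
    active.flush()
    os.fsync(active.fileno())  # make sure the data is on disk before closing
    active.close()
    return open(path, "rb")
```

Comparing `os.stat(path).st_size` with `os.stat(path).st_blocks * 512` before and after `roll_segment` on an XFS volume would be one way to test whether the close releases the preallocated space, and whether the immediate read-only reopen prevents that.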
> >
> > -Jay
> >
> >
> > On Fri, Jul 26, 2013 at 12:42 AM, Jason Rosenberg <[EMAIL PROTECTED]>
> > wrote:
> >
> > > It looks like xfs will reclaim the preallocated space for a file, after it
> > > is closed.
> > >
> > > Does kafka close a file after it has reached its max size and started
> > > writing to the next log file in sequence?  Or does it keep them all open until
> > > they are deleted, or the server quits (that's what it seems like)?
> > >
> > > I could imagine that it might need to keep log files open, in order to
> > > allow consumers access to them.  But does it keep them open indefinitely,
> > > after there is no longer any data to be written to them, and no consumers
> > > are currently attempting to read from them?
> > >
> > > Jason
> > >
> > >
> > > On Tue, Jul 16, 2013 at 4:32 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
> > >
> > > > Interesting. Yes it will respect whatever setting it is given for new
> > > > segments created from that point on.
> > > >
> > > > -Jay
> > > >
> > > >
> > > > On Tue, Jul 16, 2013 at 11:23 AM, Jason Rosenberg <[EMAIL PROTECTED]>
> > > > wrote:
> > > >
> > > > > Ok,
> > > > >
> > > > > An update on this.  It seems we are using XFS, which is available in newer
> > > > > versions of CentOS.  It definitely does pre-allocate space as a file grows,
> > > > > see:
> > > > >
> > > > >
> > > >
> > >
> >
> http://serverfault.com/questions/406069/why-are-my-xfs-filesystems-suddenly-consuming-more-space-and-full-of-sparse-file