Kafka >> mail # user >> large amount of disk space freed on restart


Re: large amount of disk space freed on restart
Cool, good to know.
On Fri, Jul 26, 2013 at 2:00 PM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:

> Jay,
>
> My only experience so far with this is using XFS.  It appears the XFS
> behavior is evolving, and in fact, we see somewhat different behavior from
> 2 of our CentOS kernel versions in use.  I've been trying to ask questions
> about all this on the XFS.org mailing list, but so far I haven't had much
> luck understanding how XFS versions correlate to CentOS versions.
>
> Anyway, yes, I think it would definitely be worth trying the solution you
> suggest, which would be to close the file on rotation, and re-open
> read-only.  Or to close files after a few hours of not being accessed.   If
> a patch for one of these approaches can be cobbled together, I'd love to
> test it out on our staging environment.  I'd be willing to experiment with
> such a patch myself, although I'm not 100% sure of all the places to look (but
> might dive in).
>
> XFS appears to have the option of using dynamic, speculative preallocation, in
> which case it progressively doubles the amount of space reserved for a
> file, as the file grows.  It does do this for all open files.  If the file
> is closed, it will then release the preallocated space not in use.  It's
> not clear whether this releasing of space happens immediately on close, and
> whether re-opening the file read-only immediately, will keep it from
> releasing space (still trying to gather more info on that).
>
> I haven't looked too much at the index files, but those too appear to have
> this behavior (e.g. preallocated size is always on the order of double the
> actual size, until the app is restarted).
>
> Jason
>
>
> On Fri, Jul 26, 2013 at 12:46 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>
> > Interesting.
> >
> > Yes, Kafka keeps all log files open indefinitely. There is no inherent
> > reason this needs to be the case, though; it would be possible to LRU out
> > old file descriptors and close them if they are not accessed for a few
> > hours and then reopen on the first access. We just haven't implemented
> > anything like that.
> >
> > It would be good to understand this a little better. Does xfs pre-allocate
> > space for all open files? Perhaps just closing the file on log roll and
> > opening it read-only would solve the issue? Is this at all related to the
> > use of sparse files for the indexes (i.e. RandomAccessFile.setLength(10MB)
> > when we create the index)? Does this affect other filesystems or just xfs?
> >
> > -Jay
> >
> >
> > On Fri, Jul 26, 2013 at 12:42 AM, Jason Rosenberg <[EMAIL PROTECTED]>
> > wrote:
> >
> > > It looks like xfs will reclaim the preallocated space for a file, after it
> > > is closed.
> > >
> > > Does kafka close a file after it has reached its max size and started
> > > writing to the next log file in sequence?  Or does it keep them all open
> > > until they are deleted, or the server quits (that's what it seems like)?
> > >
> > > I could imagine that it might need to keep log files open, in order to
> > > allow consumers access to them.  But does it keep them open indefinitely,
> > > after there is no longer any data to be written to them, and no consumers
> > > are currently attempting to read from them?
> > >
> > > Jason
> > >
> > >
> > > On Tue, Jul 16, 2013 at 4:32 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
> > >
> > > > Interesting. Yes it will respect whatever setting it is given for new
> > > > segments created from that point on.
> > > >
> > > > -Jay
> > > >
> > > >
> > > > On Tue, Jul 16, 2013 at 11:23 AM, Jason Rosenberg <[EMAIL PROTECTED]>
> > > > wrote:
> > > >
> > > > > Ok,
> > > > >
> > > > > An update on this.  It seems we are using XFS, which is available in
> > > > > newer versions of CentOS.  It definitely does pre-allocate space as a
> > > > > file grows, see:
> > > > >
> > > > > http://serverfault.com/questions/406069/why-are-my-xfs-filesystems-suddenly-consuming-more-space-and-full-of-sparse-file
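
For anyone who wants to experiment with the "LRU out old file descriptors" idea Jay mentions above, here is a rough sketch in Java. It is not Kafka code: the class name SegmentHandleCache and its methods are made up for illustration, and it evicts by a maximum number of open handles rather than by "not accessed for a few hours". It only shows the shape of closing the least-recently-used handle and reopening read-only on the next access.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch (not part of Kafka): keeps at most maxOpen read-only channels open. */
public class SegmentHandleCache {
    private final int maxOpen;
    private final LinkedHashMap<Path, FileChannel> open;

    public SegmentHandleCache(int maxOpen) {
        this.maxOpen = maxOpen;
        // accessOrder=true: the map keeps entries ordered least-recently-used first.
        this.open = new LinkedHashMap<Path, FileChannel>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Path, FileChannel> eldest) {
                if (size() <= SegmentHandleCache.this.maxOpen) {
                    return false;
                }
                try {
                    // Closing the descriptor is the step that, per the thread,
                    // should let XFS release unused preallocated space.
                    eldest.getValue().close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                return true;
            }
        };
    }

    /** Returns an open read-only channel, reopening the file if it was evicted. */
    public synchronized FileChannel get(Path segment) throws IOException {
        FileChannel ch = open.get(segment); // counts as an access for LRU ordering
        if (ch == null) {
            ch = FileChannel.open(segment, StandardOpenOption.READ);
            open.put(segment, ch); // may evict and close the least-recently-used channel
        }
        return ch;
    }

    public synchronized int openCount() {
        return open.size();
    }

    /** Does not count as an access. */
    public synchronized boolean isOpen(Path segment) {
        return open.containsKey(segment);
    }
}
```

One caveat with this sketch: a caller still holding a channel that gets evicted will find it closed, so a real patch would need reference counting or a retry on ClosedChannelException. Whether closing and immediately reopening read-only keeps XFS from holding the preallocation is still the open question from the thread.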
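
On the sparse index question: the RandomAccessFile.setLength(10MB) call Jay refers to can be reproduced in isolation with something like the following. The class and method names here are invented for the example, and this is only a guess at the general pattern, not the actual Kafka index code. It sets the file length up front and writes a few bytes, so the reported length and the space actually used on disk can be compared afterwards with ls -l and du.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical demo (not Kafka code): pre-size a file the way the thread describes. */
public class SparseIndexDemo {
    static final long TEN_MB = 10L * 1024 * 1024;

    /** Sets the file length up front, writes one int, and returns the reported length. */
    static long createPreSizedIndex(File file, long size) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(size); // extends the file; the contents of the new region are unspecified
            raf.seek(0);
            raf.writeInt(42);    // only these 4 bytes are explicitly written
        }
        return file.length();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("demo", ".index");
        f.deleteOnExit();
        long length = createPreSizedIndex(f, TEN_MB);
        System.out.println(f.getAbsolutePath() + " length=" + length);
        // Compare `ls -l` (apparent size) with `du -k` (blocks in use) on this path.
        // Whether the file ends up sparse depends on the filesystem and kernel.
    }
}
```

Running this on the XFS volumes in question, with the process kept alive and then after it exits, might help separate the effect of the sparse index files from the speculative preallocation on the log segments.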

 