Kafka >> mail # user >> Re: large amount of disk space freed on restart


Re: large amount of disk space freed on restart
Jay,

My only experience with this so far is on XFS.  The XFS behavior appears to
be evolving; in fact, we see somewhat different behavior from two of the
CentOS kernel versions we run.  I've been asking questions about all this
on the XFS.org mailing list, but so far I haven't had much luck working out
how XFS versions correlate with CentOS versions.

Anyway, yes, I think it would definitely be worth trying the solution you
suggest, which is to close the file on rotation and re-open it read-only.
Another option would be to close files after a few hours of not being
accessed.  If a patch for one of these approaches can be cobbled together,
I'd love to test it out on our staging environment.  I'd be willing to
experiment with such a patch myself, although I'm not 100% sure of all the
places to look (but might dive in).
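
To make the "close on rotation, re-open read-only" idea concrete, here is a
rough sketch in Python.  It is not Kafka's log code; the class name
RollingSegment, the max_bytes limit and the file naming are all made up for
illustration.  Whether XFS actually keeps the preallocation trimmed after
the read-only re-open is exactly the open question, so treat this only as
the shape of the change.

```python
import os


class RollingSegment:
    """Hypothetical append-only log made of fixed-size segment files."""

    def __init__(self, directory, max_bytes):
        self.directory = directory
        self.max_bytes = max_bytes      # assumed roll threshold, in bytes
        self.index = 0
        self.sealed = []                # read-only handles of rolled segments
        self._open_active()

    def _open_active(self):
        # Illustrative naming scheme only; not Kafka's segment naming.
        path = os.path.join(self.directory, "segment-%06d.log" % self.index)
        self.active = open(path, "ab")
        self.active_bytes = 0

    def append(self, payload):
        # Roll first if this payload would push the segment past the limit.
        if self.active_bytes > 0 and self.active_bytes + len(payload) > self.max_bytes:
            self.roll()
        self.active.write(payload)
        self.active.flush()
        self.active_bytes += len(payload)

    def roll(self):
        path = self.active.name
        # Close the writable handle so the filesystem sees the file as closed ...
        self.active.close()
        # ... then re-open it read-only for consumers of older data.
        self.sealed.append(open(path, "rb"))
        self.index += 1
        self._open_active()

    def close(self):
        self.active.close()
        for handle in self.sealed:
            handle.close()
```

With max_bytes=10, appending two 6-byte payloads leaves the first segment
closed and re-opened in "rb" mode, and the second payload in a fresh
writable segment.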

XFS appears to have the option of using dynamic, speculative preallocation,
in which case it progressively doubles the amount of space reserved for a
file as the file grows.  It does this for all open files.  If the file is
closed, it then releases the preallocated space not in use.  It's not clear
whether this release happens immediately on close, or whether immediately
re-opening the file read-only would keep it from releasing the space (I'm
still trying to gather more info on that).

I haven't looked too much at the index files, but those too appear to have
this behavior (e.g. preallocated size is always on the order of double the
actual size, until the app is restarted).
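
For anyone who wants to check this on their own brokers, here is a small
Python sketch that compares each file's apparent size with the space
actually allocated to it.  It assumes Linux, where st_blocks is counted in
512-byte units; the function name allocated_vs_apparent is made up, and you
pass in whatever directory holds the log and index files.

```python
import os


def allocated_vs_apparent(directory):
    """Return (name, apparent_bytes, allocated_bytes) per regular file."""
    rows = []
    for entry in sorted(os.scandir(directory), key=lambda e: e.name):
        if not entry.is_file(follow_symlinks=False):
            continue
        st = entry.stat(follow_symlinks=False)
        apparent = st.st_size            # what "ls -l" reports
        allocated = st.st_blocks * 512   # assumption: 512-byte units (Linux)
        rows.append((entry.name, apparent, allocated))
    return rows


def print_report(directory):
    for name, apparent, allocated in allocated_vs_apparent(directory):
        print("%-40s apparent=%12d allocated=%12d" % (name, apparent, allocated))
```

Running print_report on a partition directory before and after a broker
restart should show whether the allocated figure drops back toward the
apparent size.  Small files will always show some excess simply because
allocation is rounded up to whole filesystem blocks.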

Jason
On Fri, Jul 26, 2013 at 12:46 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
 