Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Plain View
Kafka >> mail # user >> large amount of disk space freed on restart


+
Jason Rosenberg 2013-05-23, 04:46
+
Jonathan Creasy 2013-05-23, 04:51
+
Jason Rosenberg 2013-05-23, 05:17
+
Jonathan Creasy 2013-05-23, 05:25
+
Jason Rosenberg 2013-05-23, 05:49
+
Jun Rao 2013-05-23, 14:15
+
Jason Rosenberg 2013-05-23, 18:06
+
Jun Rao 2013-05-24, 03:56
+
Jason Rosenberg 2013-07-14, 08:37
+
Jay Kreps 2013-07-14, 16:45
+
Jason Rosenberg 2013-07-16, 18:23
+
Jay Kreps 2013-07-16, 20:32
+
Jason Rosenberg 2013-07-26, 07:43
+
Jay Kreps 2013-07-26, 16:46
+
Jason Rosenberg 2013-07-26, 21:00
+
Jay Kreps 2013-07-26, 21:03
+
Mike Heffner 2013-09-09, 15:18
+
Jay Kreps 2013-09-09, 17:48
Copy link to this message
-
Re: large amount of disk space freed on restart
Sorry, I forgot to close the loop on my experiences with this....

We solved this issue by setting the 'allocsize' mount option, in the fstab.
 E.g. allocsize=16M.

Jason
On Mon, Sep 9, 2013 at 1:47 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:

> This could certainly be done. It would be slightly involved since you would
> need to implement some kind of file-handle cache for both indexes and log
> files and re-open them on demand when a read occurs. If someone wants to
> take a shot at this, the first step would be to get a design wiki in place
> on how this would work. This is potentially nice to reduce the open file
> count (though open files are pretty cheap).
>
> That said this issue only impacts xfs and it seems to be fixed by that
> setting jonathan found. I wonder if you could give that a try and see if it
> works for you too? I feel dealing with closed files does add a lot of
> complexity so if there is an easy fix I would probably rather avoid it.
>
> -Jay
>
>
> On Mon, Sep 9, 2013 at 8:17 AM, Mike Heffner <[EMAIL PROTECTED]> wrote:
>
> > We are also seeing this problem with version 0.7.1 and logs on an XFS
> > partition. At our largest scale we can frequently free over 600GB of disk
> > usage by simply restarting Kafka. We've examined the `lsof` output from
> the
> > Kafka process and while it does appear to have FDs open for all log files
> > on disk (even those long past read from), it does not have any files open
> > that were previously deleted from disk.
> >
> > Du output agrees that the seen size is much larger than apparent-size
> size:
> >
> > root@kafkanode-1:/raid0/kafka-logs/measures-0# du -h
> > 00000000242666442619.kafka
> > 1.1G 00000000242666442619.kafka
> > root@kafkanode-1:/raid0/kafka-logs/measures-0# du -h --apparent-size
> > 00000000242666442619.kafka
> > 513M 00000000242666442619.kafka
> >
> >
> > Our log size/retention policy is:
> >
> > log.file.size=536870912
> > log.retention.hours=96
> >
> > We tried dropping the caches from the Stack Overflow suggestion (sync;
> echo
> > 3 > /proc/sys/vm/drop_caches) but that didn't seem to clear up the extra
> > space. Haven't had the chance to try remounting with the allocsize
> option.
> >
> > In summary, it would be great if Kafka would close FD's to log files that
> > hadn't been read from for some period of time if it addresses this issue.
> >
> > Cheers,
> >
> > Mike
> >
> > On Fri, Jul 26, 2013 at 5:03 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
> >
> > > Cool, good to know.
> > >
> > >
> > > On Fri, Jul 26, 2013 at 2:00 PM, Jason Rosenberg <[EMAIL PROTECTED]>
> > wrote:
> > >
> > > > Jay,
> > > >
> > > > My only experience so far with this is using XFS.  It appears the XFS
> > > > behavior is evolving, and in fact, we see somewhat different behavior
> > > from
> > > > 2 of our CentOS kernel versions in use.  I've been trying to ask
> > > questions
> > > > about all this on the XFS.org mailing list, but so far, having not
> much
> > > > luck understanding the xfs versioning correlated to CentOS versions.
> > > >
> > > > Anyway, yes, I think it would definitely be worth trying the solution
> > you
> > > > suggest, which would be to close the file on rotation, and re-open
> > > > read-only.  Or to close files after a few hours of not being
> accessed.
> > > If
> > > > a patch for one of these approaches can be cobbled together, I'd love
> > to
> > > > test it out on our staging environment.  I'd be willing to experiment
> > > with
> > > > such a patch myself, although I'm not 100% of all the places to look
> > (but
> > > > might dive in).
> > > >
> > > > Xfs appears to the option of using dynamic, speculative
> preallocation,
> > in
> > > > which case it progressively doubles the amount of space reserved for
> a
> > > > file, as the file grows.  It does do this for all open files.  If the
> > > file
> > > > is closed, it will then release the preallocated space not in use.
> >  It's
> > > > not clear whether this releasing of space happens immediately on

 
+
Mike Heffner 2013-09-09, 21:07
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB