Re: Using df instead of du to calculate datanode space

Although I like the thought of doing things smarter, I'm never ever
going to change core Unix/Linux applications for the sake of a
specific application. Linux handles scripts and binaries completely
differently with regard to security. So how do you know for sure (I
mean 100% sure, not just 99.99999999% sure) that you haven't broken
any other functionality needed to keep your system sane?

Why don't you issue a feature request so this "needless disk io" can
be fixed as part of the base code of Hadoop (instead of breaking the
underlying OS)?
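
To illustrate what such a fix inside Hadoop might look like, here is a
minimal sketch that asks the file system for its used space (as df does)
instead of walking the directory tree (as du does). The class and method
names are hypothetical and are not taken from the Hadoop code base. Note
the assumption: the number covers the whole volume, so it only matches
the datanode's own usage when the data directory sits on a dedicated
partition.

```java
import java.io.File;

public class DfUsage {
    /**
     * Bytes in use on the partition that holds dir, computed the way df
     * does: total size minus unallocated space. No directory walk is
     * needed, so the cost does not grow with the number of block files.
     */
    static long usedBytes(File dir) {
        // getFreeSpace() returns the unallocated bytes on the partition.
        return dir.getTotalSpace() - dir.getFreeSpace();
    }

    public static void main(String[] args) {
        // Hypothetical usage: pass the datanode data directory as argument.
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(dir + " used bytes: " + usedBytes(dir));
    }
}
```

This keeps the change inside the application and leaves the system's du
binary untouched.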


2011/5/21 Edward Capriolo <[EMAIL PROTECTED]>:
> Good job. I brought this up in another thread, but was told it was not a
> problem. Good thing I'm not crazy.
> On Sat, May 21, 2011 at 12:42 AM, Joe Stein wrote:
>> I came up with a nice little hack to trick hadoop into calculating disk
>> usage with df instead of du
>> http://allthingshadoop.com/2011/05/20/faster-datanodes-with-less-wait-io-using-df-instead-of-du/
>> I am running this in production, works like a charm and already
>> seeing benefit, woot!
>> I hope it works well for others too.
>> /*
>> Joe Stein
>> http://www.twitter.com/allthingshadoop
>> */

Best regards,

Niels Basjes