Re: Using df instead of du to calculate datanode space
Hi,

Although I like the idea of doing things smarter, I'm never going to
change core Unix/Linux applications for the sake of one specific
application. Linux handles scripts and binaries completely differently
with regard to security. So how do you know for sure (I mean 100% sure,
not just 99.99999999% sure) that you haven't broken any other
functionality needed to keep your system sane?

Why don't you issue a feature request so this "needless disk io" can
be fixed as part of the base code of Hadoop (instead of breaking the
underlying OS)?
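
To illustrate the disk I/O in question, here is a minimal Python sketch of the two approaches. It is not Hadoop's code; the function names du_style_bytes and df_style_bytes are made up for this example. The du-style function stats every file under a directory, while the df-style function asks the filesystem for its totals in a single call. Note that it sums apparent file sizes rather than allocated blocks, so it only approximates what du reports.

```python
import os
import shutil


def du_style_bytes(root):
    """Sum apparent file sizes under root, one stat call per file (like du)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so symlinks are counted as links, not followed
                total += os.lstat(path).st_size
            except OSError:
                # File vanished or is unreadable; skip it
                pass
    return total


def df_style_bytes(path):
    """Return used bytes of the filesystem holding path, in one call (like df)."""
    return shutil.disk_usage(path).used
```

The two numbers only agree when the partition holds nothing but the directory being measured, which is presumably the case the df approach relies on.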

Niels

2011/5/21 Edward Capriolo <[EMAIL PROTECTED]>:
> Good job. I brought this up in another thread, but was told it was not a
> problem. Good thing I'm not crazy.
>
> On Sat, May 21, 2011 at 12:42 AM, Joe Stein
> <[EMAIL PROTECTED]> wrote:
>
>> I came up with a nice little hack to trick hadoop into calculating disk
>> usage with df instead of du
>>
>>
>> http://allthingshadoop.com/2011/05/20/faster-datanodes-with-less-wait-io-using-df-instead-of-du/
>>
>> I am running this in production, works like a charm and already
>> seeing benefit, woot!
>>
>> I hope it works well for others too.
>>
>> /*
>> Joe Stein
>> http://www.twitter.com/allthingshadoop
>> */
>>
>

--
Kind regards,

Niels Basjes