MapReduce >> mail # user >> Optimizing Disk I/O - does HDFS do anything ?


Re: Optimizing Disk I/O - does HDFS do anything ?
On Tue, Nov 13, 2012 at 1:40 PM, Jay Vyas <[EMAIL PROTECTED]> wrote:
> 1) but I thought that this sort of thing (yes even on linux) becomes
> important when you have large amounts of data - because the way files are
> written can cause issues on highly packed drives.

If you're running any filesystem at 99% full with a workload that
creates or grows files, the filesystem will experience fragmentation.
Don't do that if you want good performance.
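(Not part of the original mail: a quick way to see how close a mount is to that danger zone is `os.statvfs`; this is an illustrative sketch, and the 1% threshold is just an example, not a recommendation from the thread.)

```python
import os

def free_space_pct(path="/"):
    """Return the percentage of filesystem capacity still free at `path`."""
    st = os.statvfs(path)
    # f_bavail: blocks available to unprivileged users; f_blocks: total blocks
    return 100.0 * st.f_bavail / st.f_blocks

# Warn when a data disk is nearly full and fragmentation becomes likely.
if free_space_pct("/") < 1.0:
    print("filesystem over 99% full -- expect fragmentation")
```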

As long as there are a few dozen GB of free space to work with, ext4 on
a modern Linux kernel (2.6.38 or newer) will do a fine job of keeping
files sequential and shouldn't need defrag.

To answer the original question -- HDFS doesn't take any special
measures to enforce defragmentation, but HDFS does follow best
practices to avoid causing fragmentation.
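(Editor's illustration, not HDFS source code: one of those best practices is that a datanode streams each block file with a single writer, appending large buffers sequentially and never seeking backwards, which lets ext4's delayed allocation place the file in contiguous extents. A minimal sketch, with hypothetical names:)

```python
import os
import tempfile

CHUNK = 4 * 1024 * 1024  # append in large 4 MB buffers, one writer per file

def write_sequential(path, total_bytes):
    """Write `total_bytes` of data to `path` strictly append-only."""
    buf = b"\0" * CHUNK
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(CHUNK, total_bytes - written)
            f.write(buf[:n])
            written += n
    return os.path.getsize(path)

# e.g. write a 16 MB "block" file sequentially
path = os.path.join(tempfile.gettempdir(), "blk_demo")
size = write_sequential(path, 16 * 1024 * 1024)
```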

-andy