MapReduce user mailing list: Optimizing Disk I/O - does HDFS do anything ?

Jay Vyas 2012-11-13, 20:30
Bertrand Dechoux 2012-11-13, 21:10
Scott Carey 2012-11-17, 07:27
Jay Vyas 2012-11-13, 21:40
Re: Optimizing Disk I/O - does HDFS do anything ?
On Tue, Nov 13, 2012 at 1:40 PM, Jay Vyas <[EMAIL PROTECTED]> wrote:
> 1) but I thought that this sort of thing (yes even on linux) becomes
> important when you have large amounts of data - because the way files are
> written can cause issues on highly packed drives.

If you're running any filesystem at 99% full with a workload that
creates or grows files, the filesystem will experience fragmentation.
Don't do that if you want good performance.
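As a quick sanity check before a big write job, the usage fraction can be queried programmatically. A minimal Python sketch (the 99% threshold mirrors the scenario above; the function name is just an illustration):

```python
import shutil

def fs_nearly_full(path, threshold=0.99):
    """Return True if the filesystem holding `path` is above the
    given usage fraction (99% here, matching the scenario above)."""
    usage = shutil.disk_usage(path)
    used_fraction = (usage.total - usage.free) / usage.total
    return used_fraction > threshold

# Warn before landing a large write on a nearly-full volume.
if fs_nearly_full("/"):
    print("Filesystem over 99% full -- expect fragmentation on new writes")
```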

As long as there are a few dozen GB of free space to work with, ext4 on
a modern Linux kernel (2.6.38 or newer) will do a fine job of keeping
files sequential and shouldn't need defrag.
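One general technique applications use to help the filesystem pick a contiguous extent is preallocating the file's space in a single request rather than growing it in small appends. This is not something HDFS itself does; it is just a sketch of the technique via POSIX fallocate (Unix-only):

```python
import os
import tempfile

def preallocate(path, size):
    """Reserve `size` bytes for `path` in one request, giving the
    filesystem the chance to choose a single contiguous extent."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.posix_fallocate(fd, 0, size)
    finally:
        os.close(fd)

# Reserve 64 MiB up front instead of growing the file append by append.
path = os.path.join(tempfile.gettempdir(), "prealloc.dat")
preallocate(path, 64 * 1024 * 1024)
print(os.path.getsize(path))  # 67108864
```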

To answer the original question -- HDFS doesn't take any special
measures to defragment files, but it does follow best practices to
avoid causing fragmentation in the first place.
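Those best practices amount to append-only, sequential writes in large chunks, which is exactly the access pattern that lets ext4's delayed allocation lay each block file out contiguously. A rough Python illustration of the pattern (chunk size is made up for the example, not an actual HDFS value):

```python
import io
import os
import tempfile

CHUNK = 1024 * 1024  # write in 1 MiB chunks, never seeking backwards

def sequential_write(path, total_bytes):
    """Append-only, large-buffer writes: the access pattern that lets
    the filesystem's delayed allocation keep the file contiguous."""
    payload = b"\x00" * CHUNK
    with open(path, "wb", buffering=io.DEFAULT_BUFFER_SIZE) as f:
        written = 0
        while written < total_bytes:
            n = min(CHUNK, total_bytes - written)
            f.write(payload[:n])
            written += n
    return os.path.getsize(path)

path = os.path.join(tempfile.gettempdir(), "seq.dat")
print(sequential_write(path, 4 * CHUNK))  # 4194304
```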