Hadoop >> mail # general >> one or more file system


Re: one or more file system
On Mon, Oct 8, 2012 at 8:30 AM, Xiang Hua <[EMAIL PROTECTED]> wrote:
> Hi,
>    We have a 4T disk from a disk array.
>    I want to split 2T*1 into 1T*2, then add them to HDFS, which leads to more
> local storage directories.
>    This time we have 12 local directories (1T). Is it harmful to HDFS
> performance?

Assuming you're running a modern Hadoop on a recent Linux (2.6.38 or
later, or RHEL6):

For best performance you should configure your disk array as JBOD
rather than RAID, then put one ext4 filesystem on each spindle. Do not
put multiple storage directories on a single spindle: that results in
very bad performance and gives no benefit over a single storage
directory per spindle. And do not put multiple spindles under a single
storage directory: that results in poor utilization and bad performance
with no significant benefit.
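
As a rough illustration of the "one storage directory per spindle"
layout above, here is a minimal Python sketch. It builds the
comma-separated directory list that goes into dfs.data.dir (named
dfs.datanode.data.dir in Hadoop 2.x) and flags directories that share a
filesystem. The /data/N/dfs mount points and the function names are
hypothetical, not taken from this thread. Note the limit of the check:
st_dev identifies a filesystem, so it catches two directories on one
filesystem but not two partitions carved out of the same physical disk.

```python
import os
from collections import defaultdict


def group_by_device(dirs, stat=os.stat):
    """Map filesystem device id -> storage directories that live on it."""
    groups = defaultdict(list)
    for d in dirs:
        groups[stat(d).st_dev].append(d)
    return dict(groups)


def shared_devices(dirs, stat=os.stat):
    """Return only the devices that host more than one storage directory."""
    return {dev: ds
            for dev, ds in group_by_device(dirs, stat).items()
            if len(ds) > 1}


def data_dir_value(dirs):
    """Join directories into the comma-separated form HDFS expects."""
    return ",".join(dirs)


if __name__ == "__main__":
    # Hypothetical layout: 12 spindles, one ext4 mount and one directory each.
    dirs = ["/data/%d/dfs" % i for i in range(1, 13)]
    print(data_dir_value(dirs))
    # On a real DataNode you could then run:
    #   print(shared_devices(dirs))
    # An empty dict means no two directories share a filesystem.
```

The stat parameter is only there so the check can be exercised without
real mounts. To catch two partitions on one disk you would have to map
each device back to its parent disk by some other means.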

12 local storage directories will perform just fine assuming you have
enough CPU power to use them.

-andy