HDFS >> mail # user >> Re: What's the best disk configuration for hadoop? SSD's Raid levels, etc?


Re: What's the best disk configuration for hadoop? SSD's Raid levels, etc?
This sounds (with no real evidence) like you are a bit light on memory for
that number of cores.  With little memory per task, map outputs could spill
to disk early, and that would slow things down considerably.
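
As a rough back-of-the-envelope sketch of what I mean (the 8 cores and 24 GB
come from your message; the reserved memory and the one-slot-per-core figure
are assumptions of mine for illustration, not measurements from your cluster):

```python
# Back-of-the-envelope estimate of memory available per task slot.
# Node specs (8 cores, 24 GB) come from the original question; every other
# value is an assumed, illustrative number.

def memory_per_slot_gb(node_ram_gb, reserved_gb, task_slots):
    """Return the RAM (GB) left per task slot after reserving memory
    for the OS and the Hadoop daemons."""
    if task_slots <= 0:
        raise ValueError("task_slots must be positive")
    usable = node_ram_gb - reserved_gb
    if usable < 0:
        raise ValueError("reserved_gb exceeds node_ram_gb")
    return usable / task_slots

NODE_RAM_GB = 24     # from the question
CORES = 8            # from the question
RESERVED_GB = 4      # assumption: OS plus DataNode and TaskTracker daemons
TASK_SLOTS = CORES   # assumption: one task slot per core

per_slot = memory_per_slot_gb(NODE_RAM_GB, RESERVED_GB, TASK_SLOTS)
print(f"{per_slot:.1f} GB per slot")
```

Whatever the real numbers are, the per-slot figure bounds the task heap, and
the map-side sort buffer (io.sort.mb) has to fit inside that heap.  Map output
that does not fit in the buffer is spilled to local disk.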
On Fri, May 10, 2013 at 11:30 PM, David Parks <[EMAIL PROTECTED]> wrote:

> We’ve got a cluster of 10x 8core/24gb nodes, currently with 1 4TB disk (3
> disk slots max), they chug away ok currently, only slightly IO bound on
> average.
>
> I’m going to upgrade the disk configuration at some point (we do need more
> space on HDFS) and I’m thinking about what’s best hardware-wise:
>
> · Would it be wise to use one of the three disk slots for a
> 1TB SSD?  I wouldn’t use it for HDFS, but for map-output and sorting it
> might make a big difference no?
>
> · If I put in either 1 or 2 more 4TB disks for HDFS, should I
> RAID-0 them for speed, or will HDFS balance well across multiple partitions
> on its own?
>
> · Would anyone suggest 3 4TB disks and a RAID-5 configuration
> to guard against disk replacements over the above options?
>
> Dave