Re: Dedicated disk for operating system
Actually I was assuming that the entire cluster participates in the
re-replication.  Replication in Hadoop is done block-wise, not disk-wise.

On Wednesday, August 10, 2011, Rajiv Chittajallu <[EMAIL PROTECTED]> wrote:
> Ted Dunning wrote on 08/10/11 at 10:40:30 -0700:
>>To be specific, taking a 100 node x 10 disk x 2 TB configuration with drive
>>MTBF of 1000 days, we should be seeing drive failures on average once per
>>day.  With 1G ethernet and 30MB/s/node dedicated to re-replication, it will
>>take just over 10 minutes to restore replication of a single drive and will
>>take just over 100 minutes to restore replication of an entire machine.
>
> You are assuming that only one good node is used to restore replication for
> all the blocks on the failed drive, which is very unlikely. With a
> replication factor of 3, you will have at least 2 nodes to choose from
> in the worst case and many more in a standard cluster.
>
> And when you have more spindles (6+), you would probably consider
> using the second GigE port, which is standard on most of the commodity
> gear out there.
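
The figures quoted in this thread can be checked with a little arithmetic. The following is a minimal sketch, not code from either poster: the function names are hypothetical, it assumes decimal units (1 TB = 1,000,000 MB), and it assumes the re-replication work is spread evenly over the participating nodes with no other bottleneck.

```python
def failures_per_day(nodes, disks_per_node, mtbf_days):
    """Expected drive failures per day, assuming independent drives."""
    return nodes * disks_per_node / mtbf_days


def rereplication_minutes(data_tb, participating_nodes, mb_per_s_per_node):
    """Minutes to re-replicate data_tb of lost data.

    Assumes decimal units and that each participating node contributes
    mb_per_s_per_node of bandwidth to the job (a simplifying assumption).
    """
    data_mb = data_tb * 1_000_000
    aggregate_mb_per_s = participating_nodes * mb_per_s_per_node
    return data_mb / aggregate_mb_per_s / 60


if __name__ == "__main__":
    # 100 nodes x 10 disks, drive MTBF of 1000 days
    print(failures_per_day(100, 10, 1000))               # 1.0

    # One 2 TB drive, whole cluster participating at 30 MB/s per node
    print(f"{rereplication_minutes(2, 100, 30):.1f}")    # 11.1

    # One machine (10 x 2 TB), same assumptions
    print(f"{rereplication_minutes(20, 100, 30):.1f}")   # 111.1

    # Same drive if only a single node did all the work
    print(f"{rereplication_minutes(2, 1, 30):.1f}")      # 1111.1
```

The "just over 10 minutes" and "just over 100 minutes" estimates only come out this way when roughly the whole cluster shares the work. With a single node doing the copying, the same drive would take about 18.5 hours. For the machine-failure case, 99 surviving nodes would be slightly more accurate than 100, but the difference is small.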