HDFS >> mail # user >> How to fix a corrupted disk?


Sean Bigdatafun 2010-06-10, 05:13
Re: How to fix a corrupted disk?

On Jun 9, 2010, at 10:13 PM, Sean Bigdatafun wrote:

> I have two questions here about an HDFS cell. Suppose the file that I am interested in is stored on 3 datanodes A, B, and C, and A suddenly crashes. I understand I can still read my file because two copies are still available at that moment. But my question is: which software module is responsible for bringing A back up? (Is there a watchdog server?)
>  

No, there is no watchdog.  Each installation is slightly different, and (almost) every OS provides facilities to guarantee that a daemon is continually running [SMF, launchd, daemontools, etc.].  In most installations, I suspect wetware is used to bring back dead datanode processes so that the cause of the crash can be investigated.
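
As a rough illustration of what such an OS-level facility does, here is a minimal sketch of a restart loop in Python. It is not how SMF, launchd, or daemontools are implemented; the function name `supervise`, its parameters, and `DEMO_CMD` are made up for this example, and the real datanode start command for your installation would go where the demo command is. The restart cap is deliberate: after a few failures the loop gives up so that a human can look at why the process keeps dying.

```python
import subprocess
import sys
import time
from typing import List


def supervise(cmd: List[str], max_restarts: int = 3, delay: float = 5.0) -> List[int]:
    """Run cmd, restarting it after a non-zero exit, at most max_restarts times.

    Returns the exit code of every run, oldest first.
    """
    exit_codes: List[int] = []
    while True:
        # Blocks until the child process exits.
        code = subprocess.call(cmd)
        exit_codes.append(code)
        # Stop on a clean exit, or once the restart budget is used up.
        if code == 0 or len(exit_codes) > max_restarts:
            return exit_codes
        time.sleep(delay)


# Hypothetical stand-in for a crashing daemon: a child that exits with status 3.
DEMO_CMD = [sys.executable, "-c", "raise SystemExit(3)"]
```

Calling `supervise(DEMO_CMD, max_restarts=2, delay=0)` runs the failing command three times in total (one initial run plus two restarts) and then stops.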

> Furthermore, if the disk on server A is totally corrupted (disk failure), what should I do to bring my file back to 3 replicas?

Fix the disk on A and restart the datanode process.

When you have more than 3 datanodes, the namenode will automatically replicate any under-replicated blocks if there is a node that is qualified to do so.  [In other words, if you have a grid large enough to support topology, the namenode will not violate topology just to replicate a block.  It is expected that there are enough nodes in enough racks to not cause policy violations.]
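
To make the "qualified node" idea concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions, not the namenode's actual placement code: the function `pick_replication_target`, the node-to-rack dictionary, and the one-line rack rule (if all surviving replicas sit on one rack and the grid has more than one rack, the new copy must go to another rack) are all invented for illustration.

```python
from typing import Dict, Optional, Set


def pick_replication_target(holders: Set[str], live_racks: Dict[str, str]) -> Optional[str]:
    """Pick a live node to receive a new replica, or None if no node qualifies.

    holders:    nodes recorded as holding the block (may include dead nodes)
    live_racks: live node name -> rack name
    """
    live_holders = {n for n in holders if n in live_racks}
    # A node that already holds a replica can never be the target.
    candidates = sorted(n for n in live_racks if n not in live_holders)
    holder_racks = {live_racks[n] for n in live_holders}
    multi_rack = len(set(live_racks.values())) > 1
    for node in candidates:
        # Toy rack rule (assumption): do not pile every replica onto one rack
        # when the grid has other racks to choose from.
        if multi_rack and len(holder_racks) == 1 and live_racks[node] in holder_racks:
            continue
        return node
    return None


# Exactly 3 datanodes, A is dead: B and C already hold the block, nobody qualifies.
print(pick_replication_target({"A", "B", "C"}, {"B": "r1", "C": "r1"}))
# A fourth node D on another rack qualifies.
print(pick_replication_target({"A", "B", "C"}, {"B": "r1", "C": "r1", "D": "r2"}))
```

In the first call no target exists, which is why a 3-node grid stays under-replicated until A comes back; in the second call D is chosen.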