Re: Is that a good practice?
Eric,

I shut it down at night because the slave server is in my bedroom, and I
use a replication factor of 1 because that is what my CDH install set, so
I accepted it. I will bump it up to 3.
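
A rough sketch of that bump, assuming a CDH3-era Hadoop CLI (the property
goes in hdfs-site.xml and the -setrep flags are standard, but the paths and
values here are illustrative, not from the thread):

    <!-- hdfs-site.xml: default replication for newly written files -->
    <property>
      <name>dfs.replication</name>
      <value>3</value>
    </property>

    # re-replicate files that were already written with fewer copies
    # -R recurses, -w waits until replication completes
    hadoop fs -setrep -R -w 3 /

Note that with only two DataNodes a factor of 3 can never be fully
satisfied; blocks will sit under-replicated until a third node joins.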

But the most important piece of advice you give is "put it into safe mode" -
and that is what I am going to do whenever I am not working on it, because
it is purely my development cluster. I might even shut the daemons down
completely.
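
That routine could look roughly like this (a sketch assuming the stock
Hadoop 0.20/CDH3 scripts; exactly which daemons to stop depends on what the
slave runs):

    # on the master: quiesce HDFS before taking the slave away
    hadoop dfsadmin -safemode enter

    # on the slave: stop its daemons entirely
    hadoop-daemon.sh stop datanode
    hadoop-daemon.sh stop tasktracker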

Thank you,
Mark

On Thu, Mar 3, 2011 at 5:55 PM, Eric Sammer <[EMAIL PROTECTED]> wrote:

> On Thu, Mar 3, 2011 at 6:44 PM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> in my small development cluster I have a combined master/slave node and
>> a slave node,
>> and I shut down the slave node at night. I often see that my HDFS is
>> corrupted, and I have to reformat the name node and to delete the data
>> directory.
>>
>
> Why do you shut down the slave at night? HDFS should only be corrupted if
> you're missing all copies of a block. With a replication factor of 3
> (default) you should have 100% of the data on both nodes (if you only have 2
> nodes). If you've dialed it down to 1, simply starting the slave back up
> should "un-corrupt" HDFS. You definitely don't want to be doing this to HDFS
> regularly (dropping nodes from the cluster and re-adding them) unless you're
> trying to test HDFS' failure semantics.
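
Rather than reformatting, a quick way to check whether any blocks are
actually gone is fsck (a sketch; the flags are standard, the path is
illustrative):

    # report corrupt, missing, and under-replicated blocks
    hadoop fsck / -files -blocks -locations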
>
>> It finally dawns on me that with such a small cluster I had better shut the
>> daemons down, for otherwise they are trying too hard to compensate for the
>> missing node and eventually it goes bad. Is my understanding correct?
>>
>
> It doesn't "eventually go bad." If the NN sees a DN disappear it may start
> re-replicating data to another node. In such a small cluster, maybe there's
> nowhere else to get the blocks from, but I bet you dialed the replication
> factor down to 1 (or have code that writes files with a rep factor of 1 like
> teragen / terasort).
>
> In short, if you're going to shut down nodes like this, put the NN into safe
> mode so it doesn't freak out (which will also make the cluster unusable
> during that time), but there's definitely no need to be reformatting HDFS.
> Just re-introduce the DN you shut down to the cluster.
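
Re-introducing the node per this advice might look like the following (same
CDH3-era assumptions as the sketches above):

    # on the slave: bring the DataNode back
    hadoop-daemon.sh start datanode

    # on the master: let the NameNode resume normal operation
    hadoop dfsadmin -safemode leave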
>
>
>>
>> Thank you,
>> Mark
>>
>
> --
> Eric Sammer
> twitter: esammer
> data: www.cloudera.com
>