Re: Is that a good practice?
Eric,

I shut it down at night because the slave server is in my bedroom, and I
use a replication factor of 1 because that is what my CDH install set up, so
I accepted it. I will bump it up to 3.

But the most important advice you give is "put it into safe mode", and
that is what I am going to do whenever I am not working on it,
because it is purely my development cluster. I might even shut the daemons
down completely.

Thank you,
Mark
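
For anyone following along, the two steps Mark mentions (raising the replication factor, then entering safe mode before taking a node down) could be scripted roughly like this. It is only a sketch: it assumes the `hadoop` launcher is on the PATH, the helper names (`safemode_cmd`, `setrep_cmd`, `run`) are made up for illustration, and it only prints the commands unless `dry_run=False` is passed. The value 2 in the usage is an assumption for a two-DataNode cluster, since a block gets at most one replica per DataNode.

```python
import shutil
import subprocess

# Assumption: the `hadoop` launcher is on PATH, as on a typical CDH node.
HADOOP = "hadoop"


def safemode_cmd(action):
    """Build the argv for `hadoop dfsadmin -safemode <action>`."""
    if action not in ("enter", "leave", "get", "wait"):
        raise ValueError("unknown safemode action: %r" % (action,))
    return [HADOOP, "dfsadmin", "-safemode", action]


def setrep_cmd(path, replication):
    """Build the argv for `hadoop fs -setrep -R <replication> <path>`."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return [HADOOP, "fs", "-setrep", "-R", str(replication), path]


def run(argv, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(argv))
    if dry_run:
        return 0
    if shutil.which(argv[0]) is None:
        raise RuntimeError("%s not found on PATH" % argv[0])
    return subprocess.call(argv)


if __name__ == "__main__":
    # Change replication first: the NameNode rejects namespace changes in safe mode.
    run(setrep_cmd("/", 2))
    # Before shutting the slave down for the night:
    run(safemode_cmd("enter"))
    # After the slave is back up and has reported its blocks:
    run(safemode_cmd("leave"))
```

Note that `-setrep` only changes files that already exist; new files pick up whatever `dfs.replication` is set to in the client configuration.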

On Thu, Mar 3, 2011 at 5:55 PM, Eric Sammer <[EMAIL PROTECTED]> wrote:

> On Thu, Mar 3, 2011 at 6:44 PM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> in my small development cluster I have a master/slave node and a slave
>> node, and I shut down the slave node at night. I often see that my HDFS is
>> corrupted, and I have to reformat the name node and delete the data
>> directory.
>>
>
> Why do you shut down the slave at night? HDFS should only be corrupted if
> you're missing all copies of a block. With a replication factor of 3
> (default) you should have 100% of the data on both nodes (if you only have 2
> nodes). If you've dialed it down to 1, simply starting the slave back up
> should "un-corrupt" HDFS. You definitely don't want to be doing this to HDFS
> regularly (dropping nodes from the cluster and re-adding them) unless you're
> trying to test HDFS' failure semantics.
>
>> It finally dawns on me that with such a small cluster I had better shut the
>> daemons down, for otherwise they are trying too hard to compensate for the
>> missing node and eventually it goes bad. Is my understanding correct?
>>
>
> It doesn't "eventually go bad." If the NN sees a DN disappear it may start
> re-replicating data to another node. In such a small cluster, maybe there's
> nowhere else to get the blocks from, but I bet you dialed the replication
> factor down to 1 (or have code that writes files with a rep factor of 1 like
> teragen / terasort).
>
> In short, if you're going to shut down nodes like this, put the NN into safe
> mode so it doesn't freak out (which will also make the cluster unusable
> during that time), but there's definitely no need to be reformatting HDFS.
> Just re-introduce the DN you shut down to the cluster.
>
>
>>
>> Thank you,
>> Mark
>>
>
> --
> Eric Sammer
> twitter: esammer
> data: www.cloudera.com
>
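
Eric's point about when HDFS reports corruption can be illustrated with a toy model. This is a sketch only: the round-robin placement below is an assumption made for illustration and is not HDFS's real block placement policy, and the function names are hypothetical. The one property it does borrow from HDFS is that a block gets at most one replica per DataNode, so two nodes can hold at most two copies.

```python
def place_blocks(num_blocks, nodes, replication):
    """Assign each block to min(replication, len(nodes)) distinct nodes, round-robin."""
    copies = min(replication, len(nodes))
    placement = {}
    for block in range(num_blocks):
        first = block % len(nodes)
        placement[block] = {nodes[(first + i) % len(nodes)] for i in range(copies)}
    return placement


def missing_blocks(placement, live_nodes):
    """Return the blocks that have no replica on any live node."""
    live = set(live_nodes)
    return sorted(b for b, holders in placement.items() if not (holders & live))


if __name__ == "__main__":
    nodes = ["master", "slave"]

    # Replication factor 1: every block lives on exactly one node.
    rep1 = place_blocks(6, nodes, replication=1)
    print("rep=1, slave down:", missing_blocks(rep1, ["master"]))
    print("rep=1, slave back:", missing_blocks(rep1, nodes))

    # Replication factor 3 on two nodes: each node ends up with every block.
    rep3 = place_blocks(6, nodes, replication=3)
    print("rep=3, slave down:", missing_blocks(rep3, ["master"]))
```

With a factor of 1, the blocks held only by the slave go missing while it is off and come back as soon as it rejoins, which is why reformatting is unnecessary. With a factor of 2 or more on this two-node cluster, nothing goes missing at all.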