-
Re: what will happen when HDFS restarts but with some dead nodes
Nan Zhu 2013-01-30, 03:53
So, can we assume that all blocks are fully replicated when HDFS starts up?
Best,
-- Nan Zhu School of Computer Science, McGill University
On Tuesday, 29 January, 2013 at 10:50 PM, Chen He wrote:
> Hi Nan
>
> Namenode will stay in safemode before all blocks are replicated. During this time, the jobtracker can not see any tasktrackers. (MRv1).
>
> Chen
>
> On Tue, Jan 29, 2013 at 9:04 PM, Nan Zhu <[EMAIL PROTECTED]> wrote:
> > Hi, all
> >
> > I'm wondering if HDFS is stopped, and some of the machines of the cluster are moved, some of the block replication are definitely lost for moving machines
> >
> > when I restart the system, will the namenode recalculate the data distribution?
> >
> > Best,
> >
> > --
> > Nan Zhu
> > School of Computer Science,
> > McGill University
-
Re: what will happen when HDFS restarts but with some dead nodes
Jean-Marc Spaggiari 2013-01-30, 14:58
Hi Nan,
When the Namenode EXITS safemode, you can assume that all blocks ARE fully replicated. If the Namenode is still IN safemode, that means that NOT all blocks are fully replicated.
JM
-
Re: what will happen when HDFS restarts but with some dead nodes
Chen He 2013-01-30, 15:28
That is correct if you do not manually exit NN safemode.
Regards
Chen
-
Re: what will happen when HDFS restarts but with some dead nodes
Nitin Pawar 2013-01-30, 16:39
The following are the configs it looks at. Unless an admin forces it out of safemode, it respects the values below.
dfs.namenode.safemode.threshold-pct (default: 0.999f)
    Specifies the percentage of blocks that should satisfy the minimal replication requirement defined by dfs.namenode.replication.min. Values less than or equal to 0 mean not to wait for any particular percentage of blocks before exiting safemode. Values greater than 1 will make safe mode permanent.

dfs.namenode.safemode.min.datanodes (default: 0)
    Specifies the number of datanodes that must be considered alive before the name node exits safemode. Values less than or equal to 0 mean not to take the number of live datanodes into account when deciding whether to remain in safe mode during startup. Values greater than the number of datanodes in the cluster will make safe mode permanent.

dfs.namenode.safemode.extension (default: 30000)
    Determines extension of safe mode in milliseconds after the threshold level is reached.

On Wed, Jan 30, 2013 at 10:06 PM, Chen He <[EMAIL PROTECTED]> wrote:
> Hi Harsh
>
> I have a question. How does the namenode get out of safemode when data blocks are lost? Only through the administrator? According to my experience, the NN (0.21) stayed in safemode for several days before I manually turned safemode off. There were 2 blocks lost.
>
> Chen
>
> On Wed, Jan 30, 2013 at 10:27 AM, Harsh J <[EMAIL PROTECTED]> wrote:
>> NN does recalculate new replication work to do due to unavailable replicas ("under-replication") when it starts and receives all block reports, but executes this only after out of safemode. When in safemode, across the HDFS services, no mutations are allowed.
>>
>> --
>> Harsh J

--
Nitin Pawar
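
As a rough illustration of how the settings listed above interact, here is a minimal Python sketch of the exit decision as those descriptions state it. The function names and the 1000-block cluster are assumptions made up for this example; this is not the NameNode's actual code, and it does not model dfs.namenode.safemode.extension (the extra wait after the threshold is reached).

```python
def blocks_threshold_met(safe_blocks, total_blocks, threshold_pct=0.999):
    """Block condition only: have enough blocks reached minimal replication?"""
    if threshold_pct <= 0:
        return True   # do not wait for any particular percentage of blocks
    if threshold_pct > 1:
        return False  # described as making safe mode permanent
    return safe_blocks >= threshold_pct * total_blocks


def may_leave_safemode(safe_blocks, total_blocks, live_datanodes,
                       threshold_pct=0.999, min_datanodes=0):
    """Hypothetical helper: True if both documented conditions are satisfied."""
    # min_datanodes <= 0 means the live datanode count is not considered
    datanodes_ok = min_datanodes <= 0 or live_datanodes >= min_datanodes
    return datanodes_ok and blocks_threshold_met(
        safe_blocks, total_blocks, threshold_pct)


# Assumed cluster of 1000 blocks with 2 blocks lost entirely:
print(may_leave_safemode(998, 1000, live_datanodes=10))  # False
# Same cluster with only 1 block lost:
print(may_leave_safemode(999, 1000, live_datanodes=10))  # True
```

Under these assumptions, two lost blocks out of 1000 leave only 99.8% of blocks safe, which is below the 0.999 default, so the NN would stay in safemode until an admin intervenes or the threshold is lowered. On a much larger namespace the same two lost blocks could still satisfy the threshold.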
-
Re: what will happen when HDFS restarts but with some dead nodes
Nan Zhu 2013-01-30, 16:45
I think Chen is asking about lost replication.

So, according to Harsh's reply, while in safe mode the NN will know all blocks that have fewer than 3 replicas (the default setting) but at least 1, and after getting out of safe mode it will issue the actual replication work? I hope I understand it correctly.
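
To make the distinction concrete, here is a small Python sketch of the block categories discussed in this thread. The names classify_block and summarize, the category labels, and the sample replica counts are all hypothetical; the defaults of 1 (minimal replication) and 3 (target replication) are the ones mentioned in the thread.

```python
from collections import Counter


def classify_block(live_replicas, min_replication=1, target_replication=3):
    """Label one block by its number of live replicas (illustrative only)."""
    if live_replicas == 0:
        return "missing"           # all replicas lost; nothing to copy from
    if live_replicas < min_replication:
        return "below-minimal"     # not yet counted toward the safemode threshold
    if live_replicas < target_replication:
        return "under-replicated"  # counts as safe; needs re-replication later
    return "fully-replicated"


def summarize(replica_counts, **kwargs):
    """Count how many blocks fall into each category."""
    return Counter(classify_block(n, **kwargs) for n in replica_counts)


# Five hypothetical blocks with 3, 3, 2, 1 and 0 live replicas:
print(summarize([3, 3, 2, 1, 0]))
# Counter({'fully-replicated': 2, 'under-replicated': 2, 'missing': 1})
```

In this picture, the under-replicated blocks do not keep the NN in safemode; their re-replication is simply deferred until safemode is left. Only blocks below the minimal replication, including missing ones, count against the threshold.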
Best,
-- Nan Zhu School of Computer Science, McGill University
On Wednesday, 30 January, 2013 at 11:39 AM, Harsh J wrote:
> Yes, if there are missing blocks (i.e. all replicas lost), and the block availability threshold is set to its default of 0.999f (99.9% availability required), then NN will not come out of safemode automatically. You can control this behavior by configuring dfs.namenode.safemode.threshold-pct.
>
> --
> Harsh J
-
Re: what will happen when HDFS restarts but with some dead nodes
Bertrand Dechoux 2013-01-30, 17:10
Well, the documentation is more explicit:
Specifies the percentage of blocks that should satisfy the minimal replication requirement defined by dfs.namenode.replication.min.
Which happens to be 1 by default but doesn't need to stay that way.
Regards
Bertrand