Re: live/dead node problem
Rita 2011-03-30, 00:13
What about for 0.21?
Also, where do you set this: in the DataNode configuration or the NameNode's?
It seems the default is set to "3 seconds".
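
For anyone tuning this, a rough sketch of how the two intervals discussed in this thread appear to combine into the "10 minutes" figure. The formula (2 x recheck interval + 10 x heartbeat interval) and the 5-minute recheck default are assumptions based on my reading of the NameNode source, not something confirmed in this thread; the function names are made up for illustration.

```python
# Sketch: how long the NameNode waits before declaring a DataNode dead.
# ASSUMPTION: expiry = 2 * recheck interval + 10 * heartbeat interval.
# ASSUMPTION: the recheck interval defaults to 5 minutes (in milliseconds).

DEFAULT_RECHECK_INTERVAL_MS = 5 * 60 * 1000  # assumed default
DEFAULT_HEARTBEAT_INTERVAL_S = 3             # the "3 seconds" default noted above


def dead_node_timeout_ms(recheck_interval_ms=DEFAULT_RECHECK_INTERVAL_MS,
                         heartbeat_interval_s=DEFAULT_HEARTBEAT_INTERVAL_S):
    """Return the heartbeat expiry window in milliseconds (hypothetical helper)."""
    return 2 * recheck_interval_ms + 10 * 1000 * heartbeat_interval_s


def is_considered_dead(last_contact_s, timeout_ms):
    """True once the 'last contact' age (seconds) exceeds the expiry window."""
    return last_contact_s * 1000 > timeout_ms


if __name__ == "__main__":
    default_ms = dead_node_timeout_ms()
    print("default expiry (minutes):", default_ms / 60000.0)

    # Example of a more aggressive setting: 15 s recheck, 1 s heartbeat.
    tuned_ms = dead_node_timeout_ms(recheck_interval_ms=15000,
                                    heartbeat_interval_s=1)
    print("tuned expiry (seconds):", tuned_ms / 1000.0)
```

If the assumed formula holds, the defaults give about 10.5 minutes, which would match the delay reported in the original message, and lowering the recheck interval has by far the larger effect.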
On Tue, Mar 29, 2011 at 5:37 PM, Ravi Prakash <[EMAIL PROTECTED]> wrote:
> I set these parameters for quickly discovering live / dead nodes.
> For 0.20: heartbeat.recheck.interval
> For 0.22: dfs.namenode.heartbeat.recheck-interval and dfs.heartbeat.interval
> On 3/29/11 10:24 AM, "Michael Segel" <[EMAIL PROTECTED]> wrote:
> When the NameNode doesn't see a heartbeat for 10 minutes, it then
> recognizes that the node is down.
> Per the Hadoop online documentation:
> "Each DataNode sends a Heartbeat message to the NameNode periodically.
> A network partition can cause a subset of DataNodes to lose
> connectivity with the NameNode. The NameNode detects this condition by
> the absence of a Heartbeat message. The NameNode marks DataNodes
> without recent Heartbeats as dead and does not forward any new IO
> requests to them. Any data that was registered to a dead DataNode is
> not available to HDFS any more. DataNode death may cause the
> replication factor of some blocks to fall below their specified value.
> The NameNode constantly tracks which blocks need to be replicated and
> initiates replication whenever necessary. The necessity for
> re-replication may arise due to many reasons: a DataNode may become
> unavailable, a replica may become corrupted, a hard disk on a
> DataNode may fail, or the replication factor of a file may be increased."
> I was trying to find out if there's an hdfs-site parameter that could be
> set to decrease this time period, but wasn't successful.
> > Date: Tue, 29 Mar 2011 08:13:43 -0400
> > Subject: live/dead node problem
> > From: [EMAIL PROTECTED]
> > To: [EMAIL PROTECTED]
> > Hello All,
> > Is there a parameter or procedure to check more aggressively for a
> > live/dead node? Despite my killing the Hadoop process, I still see the
> > node listed as active for more than 10 minutes on the "Live Nodes" page.
> > Fortunately, the last contact value increments.
> > Using branch-0.21, 0985326
> > --
> > --- Get your facts first, then you can distort them as you please.--