Re: live/dead node problem
What about 0.21?

Also, where do you set this: in the DataNode configuration or the NameNode's?
It seems the default is set to "3 seconds".
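
For reference, here is a rough sketch of how the two settings quoted below combine into the roughly 10 minute delay. It assumes the NameNode declares a DataNode dead after 2 x (recheck interval) + 10 x (heartbeat interval), and that the recheck interval defaults to 5 minutes; both are assumptions to verify against the source of your release. The function name is made up for illustration, and only the 3 second heartbeat default comes from this thread.

```python
def dead_node_timeout_ms(recheck_interval_ms=300_000, heartbeat_interval_s=3):
    """Estimate how long the NameNode waits before marking a DataNode dead.

    Assumed formula (check your Hadoop version):
        2 * recheck interval + 10 * heartbeat interval
    recheck_interval_ms  : heartbeat.recheck.interval (0.20) or
                           dfs.namenode.heartbeat.recheck-interval (0.22), in ms;
                           the 5 minute default here is an assumption
    heartbeat_interval_s : dfs.heartbeat.interval, in seconds (default 3)
    """
    return 2 * recheck_interval_ms + 10 * 1000 * heartbeat_interval_s


if __name__ == "__main__":
    # Assumed defaults: 5 min recheck, 3 s heartbeat
    print(dead_node_timeout_ms() / 60_000, "minutes")
    # A more aggressive, hypothetical setting: 15 s recheck, 3 s heartbeat
    print(dead_node_timeout_ms(15_000, 3) / 1000, "seconds")
```

If that formula holds, lowering only the heartbeat interval barely changes the delay; the recheck interval dominates, so that is the one to reduce.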

On Tue, Mar 29, 2011 at 5:37 PM, Ravi Prakash <[EMAIL PROTECTED]> wrote:

>  I set these parameters for quickly discovering live / dead nodes.
>
> For 0.20: heartbeat.recheck.interval
> For 0.22: dfs.namenode.heartbeat.recheck-interval and dfs.heartbeat.interval
>
> Cheers,
> Ravi
>
>
> On 3/29/11 10:24 AM, "Michael Segel" <[EMAIL PROTECTED]> wrote:
>
>
>
> Rita,
>
> When the NameNode doesn't see a heartbeat for 10 minutes, it then
> recognizes that the node is down.
>
> Per the Hadoop online documentation:
> "Each DataNode sends a Heartbeat message to the NameNode periodically. A
> network partition can cause a subset of DataNodes to lose connectivity
> with the NameNode. The NameNode detects this condition by the absence of
> a Heartbeat message. The NameNode marks DataNodes without recent
> Heartbeats as dead and does not forward any new IO requests to them. Any
> data that was registered to a dead DataNode is not available to HDFS any
> more. DataNode death may cause the replication factor of some blocks to
> fall below their specified value. The NameNode constantly tracks which
> blocks need to be replicated and initiates replication whenever
> necessary. The necessity for re-replication may arise due to many
> reasons: a DataNode may become unavailable, a replica may become
> corrupted, a hard disk on a DataNode may fail, or the replication factor
> of a file may be increased."
>
> I was trying to find out if there's an hdfs-site parameter that could be
> set to decrease this time period, but wasn't successful.
>
> HTH
>
> -Mike
>
>
> ----------------------------------------
> > Date: Tue, 29 Mar 2011 08:13:43 -0400
> > Subject: live/dead node problem
> > From: [EMAIL PROTECTED]
> > To: [EMAIL PROTECTED]
> >
> > Hello All,
> >
> > Is there a parameter or procedure to check more aggressively for a
> > live/dead node? Despite me killing the hadoop process, I see the node
> > active for more than 10+ minutes in the "Live Nodes" page. Fortunately,
> > the last contact increments.
> >
> >
> > Using, branch-0.21, 0985326
> >
> > --
> > --- Get your facts first, then you can distort them as you please.--
>
>
>
--
--- Get your facts first, then you can distort them as you please.--