Ah, more manual work! :(
You guys never have a JVM die "just because"? I just had a DN's JVM die the
other day "just because", with no obvious cause. Restarting it brought it
back to life, and everything recovered smoothly. Had some automated tool done
the restart for me, I'd have been even happier.
But I'll have to take your advice. :(
Does anyone else have a different opinion?
Actually, is anyone using any such tools and *not* seeing problems when
they kick in and do their job of restarting dead processes?
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/
----- Original Message ----
> From: Brian Bockelman <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Tue, January 4, 2011 8:43:46 AM
> Subject: Re: monit? daemontools? jsvc? something else?
> I'll second this opinion. Although there are some tools in life that need to
> be actively managed like this (and even then, sometimes management tools can be
> set to be too aggressive, making a bad situation terrible), HDFS is not one.
> If the JVM dies, you likely need a human brain to log in and figure out what's
> wrong - or just keep that node dead.
> On Jan 3, 2011, at 10:40 PM, Allen Wittenauer wrote:
> > On Jan 3, 2011, at 2:22 AM, Otis Gospodnetic wrote:
> >> I see over on http://search-hadoop.com/?q=monit+daemontools that people *do*
> >> use tools like monit and daemontools (and a few other ones) to revive
> >> Hadoop processes when they die.
> > I'm not a fan of doing this for Hadoop processes, even TaskTrackers and
> > DataNodes. The processes generally die for a reason, usually indicating that
> > something is wrong with the box. Restarting those processes may potentially