Hadoop >> mail # user >> monit? daemontools? jsvc? something else?

Re: monit? daemontools? jsvc? something else?
Ah, more manual work! :(

You guys never have a JVM die "just because"? I just had a DN's JVM die the
other day with no obvious cause. Restarting it brought it back to life, and
everything recovered smoothly. Had some automated tool done the restart for
me, I'd have been even happier.

But I'll have to take your advice. :(

Does anyone else have a different opinion?
Is anyone actually using any such tools and *not* seeing problems when
they kick in and do their job of restarting dead processes?

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/

----- Original Message ----
> From: Brian Bockelman <[EMAIL PROTECTED]>
> Sent: Tue, January 4, 2011 8:43:46 AM
> Subject: Re: monit? daemontools? jsvc? something else?
> I'll second this opinion. Although there are some tools in life that need to
> be actively managed like this (and even then, sometimes management tools can be
> set to be too aggressive, making a bad situation terrible), HDFS is not one.
> If the JVM dies, you likely need a human brain to log in and figure out what's
> wrong - or just keep that node dead.
> Brian
> On Jan 3, 2011, at 10:40 PM, Allen Wittenauer wrote:
> >
> > On Jan 3, 2011, at 2:22 AM, Otis Gospodnetic wrote:
> >> I see over on http://search-hadoop.com/?q=monit+daemontools that people *do*
> >> use tools like monit and daemontools (and a few other ones) to revive
> >> Hadoop processes when they die.
> >>
> >
> > I'm not a fan of doing this for Hadoop processes, even TaskTrackers and
> > DataNodes. The processes generally die for a reason, usually indicating that
> > something is wrong with the box. Restarting those processes may potentially
> > hide issues.