So Allen, what do you use to monitor those processes/nodes?
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/
----- Original Message ----
> From: Allen Wittenauer <[EMAIL PROTECTED]>
> To: "<[EMAIL PROTECTED]>" <[EMAIL PROTECTED]>
> Sent: Wed, January 5, 2011 11:54:22 PM
> Subject: Re: monit? daemontools? jsvc? something else?
> On Jan 5, 2011, at 7:57 PM, Lance Norskog wrote:
> > Isn't this what Ganglia is for?
> Ganglia does metrics, not monitoring.
> > On 1/5/11, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
> >> On Jan 4, 2011, at 10:29 PM, Otis Gospodnetic wrote:
> >>> Ah, more manual work! :(
> >>> You guys never have JVM die "just because"? I just had a DN's JVM die the
> >>> other day "just because and with no obvious cause". Restarting it brought it
> >>> back to life, everything recovered smoothly. Had some automated tool done the
> >>> restart for me, I'd be even happier.
> >> In the case of Hadoop, no. There has usually been at least a core dump,
> >> message in syslog, message in datanode log, etc, etc. [You *do* have
> >> enabled, right?]
> >> We also have in place a monitor that checks the # of active nodes. If it
> >> falls below a certain percentage, then we get alerted and check on them en
> >> masse. Worrying about one or two nodes going down probably means you need
> >> more nodes. :D
> > --
> > Lance Norskog
> > [EMAIL PROTECTED]
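
For illustration only, the "alert when the # of active nodes falls below a certain percentage" check that Allen describes could be sketched roughly as below. This is not his actual monitor: the function names, the 90% default threshold, and the example counts are all assumptions made up for this sketch, and how the live and total node counts are obtained is deliberately left to the caller.

```python
# Hypothetical sketch of a cluster-level "percentage of active nodes" check.
# Nothing here comes from the thread except the idea itself: alert on the
# fraction of live nodes, not on individual node failures.


def live_fraction(live_nodes: int, total_nodes: int) -> float:
    """Return the fraction of expected nodes that are currently active."""
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    if not 0 <= live_nodes <= total_nodes:
        raise ValueError("live_nodes must be between 0 and total_nodes")
    return live_nodes / total_nodes


def should_alert(live_nodes: int, total_nodes: int,
                 min_live_fraction: float = 0.90) -> bool:
    """Return True when the active fraction drops below the threshold.

    min_live_fraction is an assumed default; pick whatever percentage
    matches how much spare capacity the cluster has.
    """
    return live_fraction(live_nodes, total_nodes) < min_live_fraction


if __name__ == "__main__":
    # Made-up counts: two dead nodes out of 100 is not worth a page ...
    print(should_alert(live_nodes=98, total_nodes=100))  # False
    # ... but fifteen dead nodes is.
    print(should_alert(live_nodes=85, total_nodes=100))  # True
```

The point of the design is the one made in the thread: a single dead DataNode is tolerated by the cluster, so the check only fires when enough nodes are gone that someone should look at them en masse.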