Re: monit? daemontools? jsvc? something else?
Otis Gospodnetic 2011-01-06, 08:39
----- Original Message ----
> From: Allen Wittenauer <[EMAIL PROTECTED]>
> > You guys never have a JVM die "just because"? I just had a DN's JVM die the
> > other day "just because", with no obvious cause. Restarting it brought it
> > back to life, and everything recovered smoothly. Had some automated tool done
> > the restart for me, I'd be even happier.
> In the case of Hadoop, no. There has usually been at least a core dump,
> message in syslog, message in datanode log, etc, etc. [You *do* have cores
Hm, "cores enabled"... what do you mean by that? Are you referring to the JVM
heap dump option (-XX:+HeapDumpOnOutOfMemoryError)? If not, I'm all
> We also have in place a monitor that checks the # of active nodes. If it
> falls below a certain percentage, then we get alerted and check on them en
> masse. Worrying about one or two nodes going down probably means you need
> more nodes. :D
That's probably right. :)
So what do you use for monitoring the # of active nodes?
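
To make the question concrete, the kind of check described above (alert when the share of live nodes drops below some percentage) could be sketched roughly as follows. This is only an illustration, not what anyone in this thread actually runs: the function names, the 90% threshold, and the node counts are all made up, and how the live-node count is obtained is deliberately left to the caller.

```python
def live_fraction(live_nodes, expected_nodes):
    """Return the fraction of expected nodes that are currently live."""
    if expected_nodes <= 0:
        raise ValueError("expected_nodes must be positive")
    return live_nodes / expected_nodes


def should_alert(live_nodes, expected_nodes, min_fraction=0.9):
    """Return True when the live fraction is below the alert threshold.

    min_fraction=0.9 is an arbitrary example value, not a recommendation.
    """
    return live_fraction(live_nodes, expected_nodes) < min_fraction


if __name__ == "__main__":
    # Hypothetical counts; a real monitor would collect these from the cluster.
    for live in (95, 80):
        status = "ALERT" if should_alert(live, expected_nodes=100) else "ok"
        print(f"{live}/100 live: {status}")
```

The point of thresholding on a fraction rather than on individual nodes is the one made above: one or two dead nodes stay below the noise floor, and only a cluster-wide drop pages someone.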
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/