Hadoop, mail # user - monit? daemontools? jsvc? something else?


Re: monit? daemontools? jsvc? something else?
Otis Gospodnetic 2011-01-06, 08:39
Hi,
----- Original Message ----
> From: Allen Wittenauer <[EMAIL PROTECTED]>

> > You guys never have JVM die "just because"? I just had a DN's JVM die the
> > other day "just because and with no obvious cause". Restarting it brought it
> > back to life, everything recovered smoothly. Had some automated tool done the
> > restart for me, I'd be even happier.
>
> In the case of Hadoop, no. There has usually been at least a core dump,
> message in syslog, message in datanode log, etc, etc. [You *do* have cores
> enabled, right?]

Hm, "cores enabled"... what do you mean by that? Are you referring to the JVM
heap dump argument (-XX:+HeapDumpOnOutOfMemoryError)? If not, I'm all
eyes/ears!

> We also have in place a monitor that checks the # of active nodes. If it
> falls below a certain percentage, then we get alerted and check on them en
> masse. Worrying about one or two nodes going down probably means you need
> more nodes. :D
>

That's probably right. :)
So what do you use for monitoring the # of active nodes?
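
To make sure I understand the check you describe, here is a minimal sketch of
the threshold logic as I read it. Everything in it is an assumption on my part:
the function names, the 0.9 default threshold, and the idea that the live and
total node counts are obtained elsewhere (the sketch does not query Hadoop).

```python
def live_fraction(live_nodes: int, total_nodes: int) -> float:
    """Return the fraction of nodes currently reporting as live."""
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    return live_nodes / total_nodes


def should_alert(live_nodes: int, total_nodes: int,
                 min_fraction: float = 0.9) -> bool:
    """Alert only when the live fraction drops below the threshold.

    min_fraction is a hypothetical default, not a value from this thread.
    """
    return live_fraction(live_nodes, total_nodes) < min_fraction


if __name__ == "__main__":
    # Hypothetical counts; a real monitor would fetch these from the cluster.
    print(should_alert(85, 100))
    print(should_alert(95, 100))
```

With a check like this, one or two dead nodes in a large cluster would not
page anyone; only a drop below the chosen percentage would.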

Thanks,
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/