HBase >> mail # user >> managing 5-10 servers


Re: managing 5-10 servers
So you have 20 nodes for the StumbleUpon link redirection service?

Are there any blog posts that go over the setup and what sort of read/write
traffic it gets?  Is there a memcached layer that sits in front?

On Tue, Nov 23, 2010 at 4:44 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:

> I wish I could do a dump of my memory into an ops guide to HBase, but
> currently I don't think there's such a writeup.
>
> What can go wrong... again it depends on your type of usage. With a
> MR-heavy cluster, it's usually very easy to drive the IO wait through
> the roof and then you'll end up with GC pauses >60 secs caused by CPU
> starvation. Here's a recent example we got when a big Mahout job was
> running:
>
> 2010-11-19T18:25:31.173-0800: [GC [ParNew: 114456K->13056K(118016K),
> 103.8190010 secs] 4624541K->4535473K(7154944K), 104.7165690 secs]
> [Times: user=4.45 sys=2.02, real=104.72 secs]
>
> The trained eye will quickly see that something very bad happened on
> that cluster. Indeed, during the post-mortem we saw that the machine had
> somehow started swapping, which is the Worst Thing Ever (tm) that can
> happen to a machine that runs Java processes. Make sure that your
> memory usage always stays under your total memory, even when all the
> mappers and reducers are using their heap to the fullest. And then
> double-check that (which it seems we didn't do).
>
> On a cluster that serves web traffic, and thus must not be MRed
> against, you get the "usual" stuff like bad disks and operator errors.
>
> J-D
>
> On Tue, Nov 23, 2010 at 1:31 PM, S Ahmed <[EMAIL PROTECTED]> wrote:
> > Are there any writeups on what things to look for?
> >
> > What are some of the things that usually go wrong? Or is that an unfair
> > question :)
> >
> > On Tue, Nov 23, 2010 at 4:22 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
> >
> >> Constant hand-holding, no; constant monitoring, yes. Do set up Ganglia
> >> and preferably Nagios. Then it depends on what you're planning to do with
> >> your cluster... here we have 2x 20 machines in production. The one
> >> that serves live traffic is pretty much doing its own thing by itself
> >> (although I keep a Ganglia tab open on a second monitor), and the
> >> other one is used strictly for MapReduce, on which our internal users
> >> have developed a habit of running very destructive jobs. But to be
> >> fair, it's probably the users that need support the most ;)
> >>
> >> J-D
> >>
> >> On Tue, Nov 23, 2010 at 1:14 PM, S Ahmed <[EMAIL PROTECTED]> wrote:
> >> > Hi,
> >> >
> >> > How much of a guru do you have to be to keep, say, 5-10 servers humming?
> >> >
> >> > I'm a 1-man shop, and I dream of developing a web application, and scaling
> >> > will be a core part of the application.
> >> >
> >> > Is it feasible for a 1-man operation to manage a 5-10 server HBase cluster?
> >> > Is it something that requires hand-holding and constant monitoring, or does
> >> > it tend to be hands-off?
> >> >
> >>
> >
>
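
Two of the checks J-D describes in the thread can be sketched in a few lines of Python: spotting GC entries whose wall-clock ("real") time exceeds a threshold, and confirming that the configured heaps fit in physical memory. This is only an illustration, not anything posted to the list. The function names, the 60-second default threshold, and the heap figures in the usage note are all assumptions; only the sample log entry comes from the message above.

```python
import re

# Matches the trailing "[Times: user=... sys=..., real=... secs]" part of a
# GC log entry shaped like the one quoted in the thread. Other JVM versions
# or flags may format this differently (assumption: this exact layout).
TIMES_RE = re.compile(
    r"\[Times: user=(?P<user>[\d.]+) sys=(?P<sys>[\d.]+), "
    r"real=(?P<real>[\d.]+) secs\]"
)


def find_long_pauses(log_text, threshold_secs=60.0):
    """Return (user, sys, real) tuples for entries whose real time exceeds the threshold."""
    pauses = []
    for match in TIMES_RE.finditer(log_text):
        user = float(match.group("user"))
        sys_time = float(match.group("sys"))
        real = float(match.group("real"))
        if real > threshold_secs:
            pauses.append((user, sys_time, real))
    return pauses


def memory_headroom_mb(total_ram_mb, daemon_heaps_mb, task_slots, task_heap_mb):
    """RAM left if every daemon and every map/reduce slot fills its heap.

    A negative result means the machine can be pushed into swap. Heap sizes
    understate real JVM memory use, so treat the result as optimistic.
    """
    return total_ram_mb - sum(daemon_heaps_mb) - task_slots * task_heap_mb


# The log entry from the thread, joined onto one line.
SAMPLE = (
    "2010-11-19T18:25:31.173-0800: [GC [ParNew: 114456K->13056K(118016K), "
    "103.8190010 secs] 4624541K->4535473K(7154944K), 104.7165690 secs] "
    "[Times: user=4.45 sys=2.02, real=104.72 secs]"
)

if __name__ == "__main__":
    for user, sys_time, real in find_long_pauses(SAMPLE):
        # Real time far above user + sys means the collector was mostly
        # waiting (for CPU or swapped-out pages) rather than working.
        print(f"pause {real:.2f}s, on-CPU {user + sys_time:.2f}s")
```

For the sample entry, the collector spent about 6.5 seconds on CPU during a pause of nearly 105 seconds, which is the mismatch a "trained eye" would notice. As a made-up budget example, `memory_headroom_mb(16384, [8192, 1024, 1024], 8, 1024)` comes out negative, meaning that hypothetical 16 GB machine is overcommitted once all eight task slots fill their heaps.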