Accumulo, mail # dev - GSOC: Monitor Improvements


Re: GSOC: Monitor Improvements
Keith Turner 2013-04-22, 17:50
On Mon, Apr 22, 2013 at 12:42 PM, Supun Kamburugamuva <[EMAIL PROTECTED]> wrote:

> Great.. we could certainly introduce the graph Mike and Keith have
> mentioned.
>

I mentioned that it would be useful to display info collected from clients.
Tracing already collects this info. The graph Mike mentioned may be
useful for displaying trace info, maybe with one plot per trace field.
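
For reference, the chart Mike describes below (timestamps on x, latency on y, color by request count), combined with a log-scale latency axis, could be binned roughly as follows. This is only a minimal sketch in Python, not anything from the Monitor or the tracing code: the function names, the 60 second time bins, and the four buckets per decade are all assumptions made for illustration.

```python
import math
from collections import Counter

def heatmap_bins(samples, time_bin_s=60, buckets_per_decade=4):
    """Count requests per (time bin, log-scale latency bucket).

    samples: iterable of (timestamp_seconds, latency_seconds) pairs.
    Returns a Counter keyed by (time_bin_index, latency_bucket_index);
    the count is what the color gradient of the chart would encode.
    All names and defaults here are hypothetical.
    """
    counts = Counter()
    for ts, latency in samples:
        if latency <= 0:
            continue  # a log scale cannot represent zero or negative latencies
        x = int(ts // time_bin_s)
        # Log-scale bucket index, so one very slow operation does not
        # squash all the small latencies into a single row.
        y = math.floor(math.log10(latency) * buckets_per_decade)
        counts[(x, y)] += 1
    return counts

def bucket_lower_bound(y, buckets_per_decade=4):
    """Lower latency bound (in seconds) of latency bucket y."""
    return 10 ** (y / buckets_per_decade)

# Hypothetical data: 100 fast requests and 3 very slow ones in the same minute.
samples = [(i % 60, 0.02) for i in range(100)] + [(10, 300.0), (20, 300.0), (30, 300.0)]
counts = heatmap_bins(samples)
```

With this made-up data the fast requests and the three slow ones land in two separate cells of the same time column, so the small slow cluster stays visible instead of disappearing into an average.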
>
> Supun..
>
>
> On Mon, Apr 22, 2013 at 12:02 PM, Keith Turner <[EMAIL PROTECTED]> wrote:
>
> > On Mon, Apr 22, 2013 at 11:42 AM, Mike Drob <[EMAIL PROTECTED]> wrote:
> >
> > > Adding on to the comment about summaries, averages, and outliers. If,
> for
> > > some reason, you end up with a two-hump population, then simply showing
> > > averages will mask the split and lose a lot of valuable information. It
> > is
> > > often valuable to know that a particular set of users or servers are
> > > experiencing degraded performance while the rest of the ecosystem is
> > > healthy.
> > >
> > > This isn't something that shows up in a regular time series because the
> > > secondary population is usually very small compared to the total
> > > population. There was a graph for request latency of a service that I
> saw
> > > once that I really wish I could find again, maybe somebody on the list
> > will
> > > be able to chime in - It had timestamps on the x-axis, latency on the
> y,
> > > and each (x,y) point was colored on a gradient representing how many
> > > requests were fulfilled at time x with latency y. This chart made it
> > > immediately easy to see that most data points fit a normal distribution
> > > with a low mean, but there was also a cluster at the top for some
> reason.
> > >
> >
> >
> > That sounds really cool.  Maybe the y-axis/latency could be log scale.
> > Inevitably a 3004 second operation will finish and obscure the
> > smaller latencies.
> >
> > Sometimes it's more useful to sample this type of info from the clients
> > rather than tablet servers. A tablet server may report low latencies, but
> > all clients using it may experience high latencies because of a network
> > issue. We could certainly consider making the client code report this info.
> >
> >
> > >
> > > I'd love to see that type of chart show up for tablet servers (probably
> > not
> > > as useful for tables).
> > >
> > > Mike
> > >
> > >
> > > On Mon, Apr 22, 2013 at 9:05 AM, Eric Newton <[EMAIL PROTECTED]>
> > > wrote:
> > >
> > > > Another thing to consider is scale.  On large clusters (many hundreds
> > of
> > > > nodes), more data is not helpful for visualization.  Instead,
> > summaries,
> > > > averages and outliers are important.
> > > >
> > > > For example, if one node is consistently slow, it is better to know
> > that
> > > > than to see one graph with low numbers in a sea of graphs.
> > > >
> > > > If the monitor collects information using JMX, collection time for
> each
> > > > node would be a good thing to know, too.
> > > >
> > > > -Eric
> > > >
> > > >
> > > > On Sun, Apr 21, 2013 at 10:00 PM, Josh Elser <[EMAIL PROTECTED]>
> > > wrote:
> > > >
> > > > > Supun,
> > > > >
> > > > > Yup, very much so. Having a way to consume any and all metrics via
> > JMX
> > > > > would simplify things for any consumers (internal or external).
> > > > >
> > > > >
> > > > >
> > > > > On 04/21/2013 02:15 PM, Supun Kamburugamuva wrote:
> > > > >
> > > > >> Hi Josh,
> > > > >>
> > > > > >> Thanks for the suggestions. I'll incorporate these into the proposal.
> > > > >>
> > > > > >> Another area I would like to work on is JMX. There is a Jira that says to
> > > > > >> switch the Monitor calls from Thrift to JMX (Accumulo 694). Do you think
> > > > > >> this is a good addition to the Monitor?
> > > > >>
> > > > >> Thanks,
> > > > >> Supun..
> > > > >>
> > > > >>
> > > > >> On Sun, Apr 21, 2013 at 1:45 PM, Josh Elser <[EMAIL PROTECTED]
> >
> > > > wrote:
> > > > >>
> > > > >>  Supun,
> > > > >>>
> > > > >>> Looks good! Can I make some suggestions/comments?
> >