Accumulo >> mail # dev >> GSOC: Monitor improvements - draft proposal


Supun Kamburugamuva 2013-04-29, 14:46
Josh Elser 2013-04-30, 02:26
Supun Kamburugamuva 2013-04-30, 14:31

Re: GSOC: Monitor improvements - draft proposal
I've submitted the proposal to Google.

http://www.google-melange.com/gsoc/proposal/review/google/gsoc2013/supun06/1

Thanks,
Supun..
On Tue, Apr 30, 2013 at 10:31 AM, Supun Kamburugamuva <[EMAIL PROTECTED]> wrote:

> Hi Josh,
>
> Thanks for the detailed feedback. I've integrated your suggestions into the
> document.
>
> For a third party monitoring tool I would like to use Zabbix. But I may
> have to do more research on this one. For now I'll leave it with Zabbix.
>
> I couldn't find a library that helps with JMX development. I guess most
> people use the Java API directly. Anyways I'll do more research on this one
> and try to find one if possible.
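
For readers unfamiliar with "using the Java API directly", here is a minimal sketch of what that can look like with only the JDK's javax.management classes. It is not Accumulo code: the class names, the ObjectName domain "example.monitor", and the two statistics are hypothetical and chosen only for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class TabletServerStatsExample {

    // Standard MBean convention: the management interface is named
    // after the implementing class with an "MBean" suffix.
    public interface TabletStatsMBean {
        long getQueryCount();
        long getEntriesReturned();
    }

    // Hypothetical statistics holder (an assumption, not an Accumulo class).
    public static class TabletStats implements TabletStatsMBean {
        private final AtomicLong queryCount = new AtomicLong();
        private final AtomicLong entriesReturned = new AtomicLong();

        // Record one query and the number of entries it returned.
        public void recordQuery(long entries) {
            queryCount.incrementAndGet();
            entriesReturned.addAndGet(entries);
        }

        @Override
        public long getQueryCount() {
            return queryCount.get();
        }

        @Override
        public long getEntriesReturned() {
            return entriesReturned.get();
        }
    }

    public static void main(String[] args) throws Exception {
        // Register the bean with the JVM's built-in MBean server.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("example.monitor:type=TabletStats");
        TabletStats stats = new TabletStats();
        server.registerMBean(stats, name);

        stats.recordQuery(25);
        stats.recordQuery(17);

        // Read the values back the way a JMX client would, by attribute name.
        System.out.println(server.getAttribute(name, "QueryCount"));
        System.out.println(server.getAttribute(name, "EntriesReturned"));
    }
}
```

Once a bean like this is registered, a JMX-capable tool such as jconsole can browse the same attributes, which is what makes a third-party monitor possible without a custom protocol.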
>
> Thanks for the JavaScript library suggestions. d3.js seems impressive.
>
> I've updated the timeline and added a deliverable section. Hope this helps
> a bit. Let me know if it needs further improvements. For each phase my plan
> is to create a patch with the changes.
>
> Thanks,
> Supun..
>
>
>
>
> On Mon, Apr 29, 2013 at 10:26 PM, Josh Elser <[EMAIL PROTECTED]> wrote:
>
>> Supun,
>>
>> Thanks for the draft! Some feedback -- hopefully it's useful for your
>> proposal in addition to giving you a better understanding of how Accumulo
>> is typically run.
>>
>> "These servers perform different functionalities"
>>
>> Actually, most servers in an Accumulo cluster are identical to one
>> another: most are running a TabletServer, and in <1.5, a Logger. The
>> exceptions are the Master, Monitor, Tracer and GarbageCollector. Master,
>> monitor and gc are typically run on the same node (monitor and gc are
>> rather lightweight). Running a tracer on every TabletServer is probably
>> overkill, but, again, this is another lightweight process, so not outside
>> the realm of possibilities.
>>
>> "Create a JMX API for Monitor to gather statistics"
>>
>> Any plans to include an example 3rd-party monitor that takes advantage of
>> the internal change from Thrift to JMX? If so, which? I could see this
>> being very useful for your own verification and validation, not to mention
>> for 3rd parties (people other than yourself).
>>
>> "Table Graphs"
>>
>> I'd be rather interested to see how the amount of data being returned by
>> a TabletServer correlates with query rate. It would be a neat plot to see
>> how RFile index size and size of each key-value returned corresponds with
>> query rate. Maybe it would be cool to have the ability to let users create
>> composite graphs?
>>
>> "Trace Visualization"
>>
>> Not a lot to really see here. Currently you get some rudimentary
>> information about how long it took to determine which files to delete, and
>> how long deleting them took (I think). It would be nice to see this broken
>> down by table, and include file size and other file metadata.
>>
>> "Server Status Information"
>>
>> I remember hearing that someone had done some work to actually pop a
>> shell in the monitor when authenticated over HTTPS. Another cool feature
>> might be to actually have some greater insight into a node (perhaps using
>> JMX calls that we wouldn't want publicly available) when properly
>> authenticated? I'm thinking about being able to view the list of running
>> scans on a node... being able to introspect the actual scan options/data,
>> ranges being run, etc.
>>
>> "Mock Stats Collector"
>>
>> I would put money that this will pay off in spades as you move forward
>> testing things.
>>
>> Some more high-level things...
>>
>> * Any thought/preference on the JMX library you would want to use?
>> * Re: JavaScript, might want to look at DataTables (jQuery-based), d3.js,
>> and/or nvd3. Lots of options here, but licensing can be a concern. Glad you
>> thought about that already.
>>
>> "Deliverables and Timeline"
>>
>> I'd try to rethink your timeline a bit; it comes off very waterfall-y to
>> me. The biggest red-flag to me is the "write documentation" as your last
>> phase. Coming from experience, this doesn't work 95% of the time. Something
>> else always comes up, takes longer, w/e and suddenly you have some code
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: [EMAIL PROTECTED];  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com
Josh Elser 2013-05-03, 14:41