Re: Request for mentor for project on ACCUMULO-1197
Over in HBase-land, we have some basic Dapper-like tracing capabilities in
trunk today [1], based on a library called htrace [2].  We've been using it
to break down, analyze, and eventually improve our mean time to recovery on
server failures.  I'm certain that commits with this purpose in mind will be
coming in the next month or so.  A follow-up to this would likely instrument
the HDFS data path if we get to the point where more delays are due to HDFS
than HBase.

It is, however, more of a toolkit currently: it dumps the trace info as
JSON to local files, doesn't have a "tracer" application like Accumulo does,
and doesn't yet instrument all of the interesting internal processes inside
HBase and HDFS.  There is some chatter, however, about integrating this
with Zipkin (and using HBase as a backend for storage of traces) for trace
visualization and analysis. [3]
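
To make the "toolkit" description a bit more concrete, here is a rough sketch of what Dapper-style tracing that dumps spans as JSON to a local file could look like. This is not htrace's API and not Cloudtrace's; the names `Span`, `JsonFileTracer`, `demo`, and the JSON field names are all hypothetical and chosen only for illustration. It assumes a single thread and one JSON object per line.

```python
import json
import random
import time
from contextlib import contextmanager


class Span:
    """One timed unit of work; spans sharing a trace_id form one trace tree."""

    def __init__(self, trace_id, description, parent_id=None):
        self.trace_id = trace_id
        self.span_id = random.getrandbits(64)  # hypothetical id scheme
        self.parent_id = parent_id             # None marks the root span
        self.description = description
        self.start_ms = None
        self.stop_ms = None

    def to_dict(self):
        return {
            "trace_id": self.trace_id,
            "span_id": self.span_id,
            "parent_id": self.parent_id,
            "description": self.description,
            "start_ms": self.start_ms,
            "stop_ms": self.stop_ms,
        }


class JsonFileTracer:
    """Appends each finished span to a local file as one JSON line."""

    def __init__(self, path):
        self.path = path
        self._stack = []  # currently open spans, innermost last

    @contextmanager
    def span(self, description):
        parent = self._stack[-1] if self._stack else None
        # A child inherits its parent's trace id; a root span starts a new trace.
        trace_id = parent.trace_id if parent else random.getrandbits(64)
        current = Span(trace_id, description, parent.span_id if parent else None)
        current.start_ms = int(time.time() * 1000)
        self._stack.append(current)
        try:
            yield current
        finally:
            self._stack.pop()
            current.stop_ms = int(time.time() * 1000)
            with open(self.path, "a") as out:
                out.write(json.dumps(current.to_dict()) + "\n")


def demo(path):
    """Trace a nested operation and return the spans read back from the file."""
    tracer = JsonFileTracer(path)
    with tracer.span("recover-region"):
        with tracer.span("replay-log"):
            pass  # the real work being timed would go here
    with open(path) as src:
        return [json.loads(line) for line in src]
```

Calling `demo("/tmp/trace.json")` (any writable path) leaves two JSON lines in the file, the inner span first because it finishes first. What a sketch like this lacks is exactly what the paragraph above lists: a collector or "tracer" service to gather the files, and any visualization on top.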

[1] http://hbase.apache.org/book/tracing.html
[2] https://github.com/cloudera/htrace
[3] https://github.com/twitter/zipkin
On Fri, Jul 12, 2013 at 5:52 PM, Keith Turner <[EMAIL PROTECTED]> wrote:

> On Thu, Jul 11, 2013 at 11:47 AM, Ajay Bhat <[EMAIL PROTECTED]> wrote:
>
> > Thanks Keith. I am looking into it now.
> >
> > Accumulo has implemented Dapper-style tracing and called it Cloudtrace. I
> > am not aware of the ins and outs of this and would like to learn about it
> > in more depth. Can anyone help out here?
> >
>
> Cloudtrace is an implementation of the ideas mentioned in the Dapper paper.
>  It's only used by Accumulo at this point.   The reason I mentioned the
> HBase work is so you could assess the status of that work.  You should
> determine if any other work besides the HBase and Accumulo efforts exists.
>
> I can think of a few ways to tackle this problem.
>
>  1. Modify HDFS to use Cloudtrace.  Cloudtrace is currently implemented on
> top of Thrift, but HDFS does not use Thrift.
>  2. Modify HDFS and Accumulo to use HBase tracing.
>  3. Modify HDFS to support hooks for tracing.  Make Cloudtrace use these
> hooks.
>  4. Create a completely new tracing system.
>
> What are your thoughts on this project?  What do you hope to accomplish?
> What is your timeframe?
>
>
> > I have checked out the HBase trunk also.
> >
> >  Some additional info about ASF ICFOSS: The ASF-ICFOSS Programme [
> > http://icfoss.org/mentor.html] was conducted by Luciano Resende [
> > http://people.apache.org/~lresende] on June 21-23rd, 2013 and as part of
> > the Programme, students are encouraged to take up an Apache JIRA as a
> > project.
> >
> >
> > On Thu, Jul 11, 2013 at 7:27 PM, Keith Turner <[EMAIL PROTECTED]> wrote:
> >
> > > Some HBase guys were also looking into doing something w/ tracing.  I am
> > > not sure what has been done, but it would be useful to look into that.  I
> > > linked HBASE-6449 to the Accumulo ticket.
> > >
> > >
> > > On Wed, Jul 10, 2013 at 8:41 PM, Ajay Bhat <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I had attended the ASF-ICFOSS workshop held in June.
> > > > http://community.apache.org/mentoringprogramme-icfoss-pilot.html
> > > >
> > > > I'd like to take up this issue as a project.
> > > > https://issues.apache.org/jira/browse/ACCUMULO-1197
> > > > Could someone in the community act as a mentor for this project? The
> > > > process of selection is very similar to GSoC.
> > > >
> > > > I've checked out the source code and used Accumulo. I hope to have made a
> > > > formal proposal by Saturday, July 13th. I'd be happy to get some advice on
> > > > how I could get started on it, and also any minor issues I could work on in
> > > > the interim.
> > > >
> > > > Regards,
> > > > Ajay
> > > >
> > >
> >
>

--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [EMAIL PROTECTED]