Chukwa >> mail # dev >> What constitute a successful project?


Re: What constitute a successful project?
Hi Eric,
Sorry to interject at this point, but I cannot let you use my name and
Netflix together for Chukwa.
I designed Chukwa and I'm the main architect behind it, correct.

However Netflix is NOT running CHUKWA but HONU.
Honu is a stream-based data collection system that runs at scale at Netflix
and other places.
I designed Honu when I was at Netflix, and Honu does not use CHUKWA code
anymore.
The Honu code is a complete rewrite done by me and only me, and that's the
reason why Honu scales to more than 60 billion events/day.
People are still using the name Chukwa because it was the name I used for
my first presentation.
I changed the name to Honu when I started the complete rewrite, and you
are aware of that.

I'm the architect of both, so there are some similarities, but
Chukwa will never be Honu and I cannot let people think that they are the
same.

I'll ask Kurt to update his presentation and to use the correct name: HONU
and not CHUKWA.

You can read more about Honu:
Here: http://www.slideshare.net/jboulon/hadoop-summit-2010-honu
or here:
http://www.slideshare.net/jboulon/cloud-connect-2012-big-data-netflix

Sorry, Eric, and next time you use my work, please verify your sources or
I'll have to take a more active role.

/Jerome Boulon
[EMAIL PROTECTED]
On Thu, Nov 29, 2012 at 10:39 PM, Eric Yang <[EMAIL PROTECTED]> wrote:

> Hi Jason,
>
> IBM is using the Chukwa agent as the base of the monitoring component for
> BigInsights.  The monitoring system shares the same design principles, but
> has been custom built for BigInsights.  We wrote some generic adaptors to
> collect data from SNMP, JMX, and REST, and we are currently seeking
> approval from IBM to contribute them back to open source.  BigInsights is
> IBM's distribution of Apache Hadoop.  We use it to monitor Hadoop and
> related technologies, and Chukwa is reliable and works well for us.
>
> With raw time series metrics and logs available to correlate events
> together, the Chukwa approach is definitely better than plain Ganglia and
> Nagios.  With the Nagios and Ganglia combination, you only get facts after
> irreversible events have happened, such as the jobtracker no longer
> responding or an HBase region server dying.  With raw data collected and
> analyzed, we can prevent irreversible events from happening.  For example,
> a problematic job can be terminated before it grows out of control.
>
> Netflix has a number of presentations talking about how they use Chukwa to
> stream data to EC2.  The most recent presentation is here:
>
> http://cdn.oreillystatic.com/en/assets/1/event/85/Netflix_s%20Evolving%20Data%20Science%20Architecture%20Presentation.pdf
>
> regards,
> Eric
>
> On Thu, Nov 29, 2012 at 5:54 AM, Dai, Jason <[EMAIL PROTECTED]> wrote:
>
> > Eric and the team,
> >
> > First, let me provide a little background about us. We at Intel have been
> > using Chukwa for building HiTune (a Hadoop performance analyzer,
> > https://github.com/intel-hadoop/hitune), and one of our key team members,
> > Jie Huang, was recently accepted as a Chukwa committer (unfortunately she
> > has been out sick since late September and has not been as active in the
> > Chukwa community as we would like).
> >
> > IMO, a key question for the Chukwa project is how to grow the
> > community, and I believe an active developer community is driven by
> > active users.  It is unclear to me at this moment who is using Chukwa in
> > their daily work, what it is being used for, and how it can play an
> > important role in its target domain. I would suggest that people on the
> > list share their usage as the first step - How are you using Chukwa? Do
> > you think Chukwa is a good solution that can attract new users for that
> > specific problem?
> >
> > As a starter, I'll share our usage:
> > 1)      We have been using Chukwa to collect and aggregate performance
> > metrics from Hadoop clusters, so that our tool HiTune can analyze the
> > performance of Hadoop applications.
> > 2)      And as we outlined in CHUKWA-665, we have a prototype that uses
