Chukwa >> mail # user >> Agent and collector


Re: Agent and collector
Yes, the agent-->collector path is HTTP.

This was done precisely to allow load balancers. I don't know how
well tested that configuration is, though. I think most sites had Chukwa
itself do the load balancing by specifying multiple collectors.
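
To make the multiple-collector behaviour concrete, here is a minimal Python sketch of the policy described in this thread: pick a collector at random, keep using it, and switch only when a send fails. It is an illustration, not Chukwa's actual code. The names `CollectorSelector` and `post_chunk` are hypothetical, and `send` stands in for whatever HTTP client does the real POST.

```python
import random


class CollectorSelector:
    """Hypothetical helper: sticks with one collector until it fails."""

    def __init__(self, collector_urls, rng=None):
        if not collector_urls:
            raise ValueError("need at least one collector URL")
        self._urls = list(collector_urls)
        self._rng = rng or random.Random()
        # Initial choice is random, as the docs quoted in this thread describe.
        self._current = self._rng.choice(self._urls)

    @property
    def current(self):
        return self._current

    def mark_failed(self):
        """Switch to a different collector, chosen at random, if one exists."""
        others = [u for u in self._urls if u != self._current]
        if others:
            self._current = self._rng.choice(others)
        return self._current


def post_chunk(selector, chunk, send, max_attempts=3):
    """Deliver one chunk, failing over between collectors.

    `send(url, chunk)` is supplied by the caller (an assumption of this
    sketch) and must raise OSError when the collector cannot be reached.
    Returns the URL that accepted the chunk.
    """
    for _ in range(max_attempts):
        url = selector.current
        try:
            send(url, chunk)
            return url
        except OSError:
            selector.mark_failed()
    raise RuntimeError("no collector accepted the chunk")
```

With a single load-balancer URL in the list, `mark_failed` has nothing to switch to, so every retry goes back to the same address and the choice of collector is left entirely to the balancer.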

There is a notion of end-to-end reliability: the so-called
asynchronous ack mechanism. It's off by default and hasn't been tried
much in production. See
http://www.usenix.org/events/lisa10/tech/full_papers/Rabkin.pdf for
the detailed design of it.
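
This message does not spell out how the mechanism works, so the following is only a generic Python sketch of the underlying idea: the agent holds on to each chunk until it learns the chunk was durably stored, not merely received, and resends anything left unconfirmed for too long. It is not taken from Chukwa or from the paper. `PendingChunks`, its methods, and the 60-second default are all hypothetical.

```python
import time


class PendingChunks:
    """Hypothetical agent-side buffer: keeps chunks until storage is confirmed."""

    def __init__(self, resend_after_seconds=60.0, clock=time.monotonic):
        self._resend_after = resend_after_seconds
        self._clock = clock  # injectable so the timeout logic can be tested
        self._pending = {}   # chunk_id -> (chunk, time it was last sent)

    def sent(self, chunk_id, chunk):
        """Record that a chunk was handed to a collector (or resent)."""
        self._pending[chunk_id] = (chunk, self._clock())

    def confirm_stored(self, chunk_id):
        """Forget a chunk once it is known to be on stable storage.

        Returns True if the chunk was still pending.
        """
        return self._pending.pop(chunk_id, None) is not None

    def due_for_resend(self):
        """Return (chunk_id, chunk) pairs that stayed unconfirmed too long."""
        now = self._clock()
        return [
            (chunk_id, chunk)
            for chunk_id, (chunk, sent_at) in self._pending.items()
            if now - sent_at >= self._resend_after
        ]

    def __len__(self):
        return len(self._pending)
```

The point of the design is that a plain HTTP success only says the collector received the chunk. A failure that happens afterwards, such as a full disk or a lost Hadoop connection, shows up as a missing confirmation, and the timeout turns that into a resend.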

--Ari

On Fri, Jul 29, 2011 at 11:04 AM, T. A. Smooth <[EMAIL PROTECTED]> wrote:
> Hello, I am checking out Chukwa. I have a few questions I was hoping the
> mailing list could answer :-)
>
> 1) Do Chukwa agents communicate with collectors over HTTP? Or some other
> protocol?
>
> The agent configuration makes me believe that:
> http://incubator.apache.org/chukwa/docs/r0.4.0/admin.html#Configuration
>
> 2) From the docs it seems an agent will pick a collector at random and then
> use that collector until there is a problem communicating with it. How do
> you think the agent/collector would act if they had a load balancer between
> them? For example, the agent configuration would have just one URL:
> http://collector-loadbalancer.example.com:8080/
>
> The load balancer would have one or more collectors behind it, saving the
> chunks they receive to disk or Hadoop.
>
> 3) Does Chukwa have any “end-to-end” reliability features for message
> delivery? For example, a collector may receive the chunk from the agent but
> have a problem writing it to the data store (e.g., disk full or the
> connection to Hadoop down). Will the agent be notified that the chunk was
> not processed, and why, and be told to cache the missed message on disk?
>
> Thanks for the info!
>
> -tp-

--
Ari Rabkin [EMAIL PROTECTED]
UC Berkeley Computer Science Department