Chukwa, mail # user - multiple threads/HttpConnector from ChukwaAgent


Re: multiple threads/HttpConnector from ChukwaAgent
Corbin Hoenes 2012-01-26, 21:27
Eric,

We use Chukwa for log aggregation from our web servers, and it powers our analytics pipeline.  It's been super useful and solid, but we are running into a bit of a problem.  I was hoping to split my data stream: create a realtime pipeline with HBase while still streaming into HDFS for batch MR processing.

I am running some simple calculations on incoming pageviews and want to update HBase using counters.  This is slow right now, since I really only have one servlet processing my chunks in my demo environment.  Without the realtime HBase counters in the pipeline, data flows a couple of orders of magnitude faster.  I was hoping that with smaller chunks and many more collector servlets I could make it scale better, but right now it slows down the data stream too much.

We use only 3 collectors in production and they handle the traffic well, but adding more would give us more concurrent HBase write capacity.  I was hoping there was a knob to allow more concurrent chunk writing.

On Jan 26, 2012, at 1:03 PM, Eric Yang wrote:

> Hi Corbin,
>
> This is by design.  We concatenate all data streams into an in-memory
> queue on the agent and establish only one HTTP connection to the
> collector.  This is for horizontal scalability, so that we can support
> more machines.  At the same time, it also ensures that the agent can
> write more data per HTTP POST, reducing the overhead of HTTP headers
> and connection handshakes.
>
> regards,
> Eric
>
> On Thu, Jan 26, 2012 at 11:51 AM, Corbin Hoenes <[EMAIL PROTECTED]> wrote:
>> I am trying to do some real-time processing of the data coming into my Chukwa pipeline and noticed that, using a single agent, I don't seem to be getting very many servlets handling the requests. Peeking at the ChukwaAgent code, it looks like the agents are limited to a single HttpConnector.
>>
>> Is this by design or am I off-base in my analysis of how it works?
>>
>>
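
For anyone following the thread, the design Eric describes above (all data streams feeding one in-memory queue on the agent, drained over a single HTTP connection with many chunks per post) can be pictured roughly as below. This is only a toy sketch in Python, not Chukwa's actual code: the names SingleConnectionSender, max_chunks_per_post, and run_demo are made up for illustration, and the "post" is just a callback standing in for the HTTP request to the collector.

```python
import queue


class SingleConnectionSender:
    """Toy model of one agent: many producers, one queue, one sender.

    Hypothetical illustration only; this is not the Chukwa API.
    """

    def __init__(self, post, max_chunks_per_post=100):
        self.chunks = queue.Queue()      # shared in-memory queue (thread-safe)
        self.post = post                 # stand-in for the single HTTP post
        self.max_chunks_per_post = max_chunks_per_post

    def add(self, chunk):
        """Called by any data stream to enqueue a chunk."""
        self.chunks.put(chunk)

    def drain_once(self):
        """Send up to max_chunks_per_post queued chunks in one post.

        Returns the number of chunks sent (0 if the queue was empty).
        """
        batch = []
        while len(batch) < self.max_chunks_per_post:
            try:
                batch.append(self.chunks.get_nowait())
            except queue.Empty:
                break
        if batch:
            self.post(batch)             # one post carries the whole batch
        return len(batch)


def run_demo():
    """Enqueue 7 chunks, drain with a batch limit of 3, return batch sizes."""
    posts = []
    sender = SingleConnectionSender(posts.append, max_chunks_per_post=3)
    for i in range(7):
        sender.add(f"chunk-{i}")
    while sender.drain_once():
        pass
    return [len(batch) for batch in posts]
```

With a single sender like this, the collector only ever sees one request at a time from a given agent, which matches the behaviour described in the original question: the number of busy servlets tracks the number of agents, not the volume of data per agent.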