Chukwa user mailing list - Supercharging Chukwa


Eric Fiala 2010-08-13, 16:11
Re: Supercharging Chukwa
Ariel Rabkin 2010-08-13, 17:26
There are two knobs that, together, throttle the agent processes.

These are httpConnector.maxPostSize and httpConnector.minPostInterval.

The maximum configured agent bandwidth is the ratio of the two
(maxPostSize divided by minPostInterval).  I would try reducing the
min post interval.  The defaults are, if I remember right, something
like 2 MB / 5 seconds = 400 KB/sec.  You can crank that down a long
way; nothing should explode even if you set it to 1 ms.
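
A minimal sketch of what this might look like in
conf/chukwa-agent-conf.xml (the file name and the default values in
the comments are assumptions; the property names are the ones above):

    <configuration>
      <property>
        <name>httpConnector.maxPostSize</name>
        <!-- bytes per HTTP post; assumed 2 MB default, left as-is -->
        <value>2097152</value>
      </property>
      <property>
        <name>httpConnector.minPostInterval</name>
        <!-- milliseconds between posts; lowered from the assumed 5000 default -->
        <value>1000</value>
      </property>
    </configuration>

That gives each agent a ceiling of roughly maxPostSize /
minPostInterval = 2 MB / 1 sec = 2 MB/sec, five times the default.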

--Ari

On Fri, Aug 13, 2010 at 9:11 AM, Eric Fiala <[EMAIL PROTECTED]> wrote:
> Hello all,
> We would like to bring our production Chukwa (0.3.0) infrastructure to the
> next level.
> Currently, we have 5 machines generating 400GB per day (80GB in a
> single log per machine).
> The agents use CharFileTailingAdaptorUTF8.  Of note,
> chukwaAgent.fileTailingAdaptor.maxReadSize has been upped to 4000000.
>  We've left httpConnector.maxPostSize at the default.
> The agents are sending to 3 chukwa-collectors which are simply gateways into
> HDFS (one also handles demux/processing - but this doesn't appear to be the
> wall... yet).  The agents have all three collectors listed in their conf.
> We are hitting a wall somewhere: the whole 400GB does work its way
> into our repos over the course of the day, but during peaks we fall
> upwards of 1-2 hours behind between a record being written to the
> tailed log and it hitting hdfs://chukwa/logs as a .chukwa file.
> Further, we have observed that hdfs://chukwa/logs in our setup does
> not fill faster than 2GB per 5-minute period, whether we use 2 chukwa
> collectors or 3 (rough arithmetic below).  This is all the more
> discouraging given that foreseeable growth will take us to over
> ~575GB per day.
> The machines are definitely not load bound.  We have noticed that
> chukwa was built with low resource utilization in mind - one thought
> is that if this could be tweaked, we could probably push more data
> through faster.
> We have toyed with changing the default Xmx or similar values, but we
> don't want to start turning too many knobs before consulting the
> experts; considering all the pieces involved, that's probably wise.
> Scaling out is also an option, but I'm determined to squeeze 10x or
> more of our current throughput out of these multicore machines.
> Any suggestions are welcome,
> Thanks.
> EF
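
For scale, rough arithmetic from the figures above (approximate,
using 1 GB = 10^9 bytes):

    400 GB/day / 86400 sec  ~ 4.6 MB/sec   average ingest today
    2 GB / 300 sec          ~ 6.7 MB/sec   observed ceiling into hdfs://chukwa/logs
    575 GB/day / 86400 sec  ~ 6.7 MB/sec   projected average load

The average is already about two-thirds of the observed ceiling, so
sustained peaks above ~6.7 MB/sec explain a growing backlog, and the
projected growth leaves essentially no headroom.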

--
Ari Rabkin [EMAIL PROTECTED]
UC Berkeley Computer Science Department
Eric Yang 2010-08-16, 16:47
Eric Fiala 2010-08-17, 00:17