Flume >> mail # user >> Re: increase load on tier2 flume agents


Jan Van Besien 2013-11-07, 17:35
increase load on tier2 flume agents
Hi,

I have a two-tier Flume setup. Tier 1 consists of agents that accept incoming
requests (HTTP source) and put them on (large) file channels. Tier 2
does a lot of processing on these events with custom interceptors and uses
a custom sink to store the results in a custom data store. These tier 2
agents use a (small) memory channel.

The tier 2 interceptors and the data store are mostly I/O bound.

I am struggling to saturate the tier 2 agents. They are slower than
they should be, mostly for reasons unrelated to Flume.

However, assume that I would like my tier 2 agents to process more
events in parallel. What would be the appropriate way to do this?

Do I need multiple Avro sinks on the tier 1 agents that all point to the same
tier 2 Avro source? I tried this, and it does seem to increase the number
of threads on the tier 2 agent that are actually processing events.
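
To make the question concrete, here is a minimal sketch of what such a tier 1
configuration might look like. The agent name (tier1), the component names,
the host name tier2-host, the ports and the batch size are all placeholders
for illustration, not my actual values. Each sink gets its own sink runner
thread, so three sinks draining one file channel should give three concurrent
Avro connections to the same tier 2 source.

```properties
# Hypothetical tier 1 agent: one HTTP source, one file channel,
# three Avro sinks that all point at the same tier 2 Avro source.
tier1.sources = http-in
tier1.channels = file-ch
tier1.sinks = avro-1 avro-2 avro-3

tier1.sources.http-in.type = http
tier1.sources.http-in.port = 8080
tier1.sources.http-in.channels = file-ch

tier1.channels.file-ch.type = file

# The three sinks are identical except for their names. They are NOT
# placed in a sink group, so each one runs in its own thread.
tier1.sinks.avro-1.type = avro
tier1.sinks.avro-1.channel = file-ch
tier1.sinks.avro-1.hostname = tier2-host
tier1.sinks.avro-1.port = 4545
tier1.sinks.avro-1.batch-size = 100

tier1.sinks.avro-2.type = avro
tier1.sinks.avro-2.channel = file-ch
tier1.sinks.avro-2.hostname = tier2-host
tier1.sinks.avro-2.port = 4545
tier1.sinks.avro-2.batch-size = 100

tier1.sinks.avro-3.type = avro
tier1.sinks.avro-3.channel = file-ch
tier1.sinks.avro-3.hostname = tier2-host
tier1.sinks.avro-3.port = 4545
tier1.sinks.avro-3.batch-size = 100
```

Since the interceptors run on the source side of the tier 2 agent, more
concurrent incoming connections would presumably mean more interceptor chains
running in parallel, which would explain the extra threads I am seeing there.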

Is this the way to do it, or not?

thanks,
Jan
