Flume user mailing list

Re: Avro sink to source is too slow
Just a quick update: I found two issues that were slowing Flume down.
1. Replicating into 3 file channels from the Avro source slowed down the
acceptance of Flume events; it takes up to 5-10 times longer than writing to
one channel. So I'm now trying to change the collector's configuration to 1
file channel, plus a spooldir source that will read from the
collector's file system into a memory channel for replication.
2. More disturbing, I see many disconnections in the Avro sink-source
pair while the source-side Flume agent (i.e. the collector) is doing full
GCs, and the full GCs were quite long (~15 seconds). Switching the JVM to a
low-pause collector (i.e. G1) solved this issue as well.

BTW, regarding Mike's question above:
What is the correct way to have multiple threads drain a channel?
I thought the correct way is simply to attach multiple sinks to
the same channel, without any sink groups. Is that correct?
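
For concreteness, the setup I mean looks roughly like the sketch below. The agent and component names (a1, c1, k1, k2), the host name and the port are placeholders, not my real configuration. As far as I understand, a sink that is not part of a sink group is driven by its own runner thread, which is why I expect this to give two drain threads, but that is my assumption and part of what I am asking.

```properties
# Hypothetical names: agent a1, channel c1, sinks k1 and k2.
a1.channels = c1
a1.sinks = k1 k2

a1.channels.c1.type = file

# Both sinks take from the same channel c1.
a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = collector.example.com
a1.sinks.k1.port = 4545

a1.sinks.k2.type = avro
a1.sinks.k2.channel = c1
a1.sinks.k2.hostname = collector.example.com
a1.sinks.k2.port = 4545

# Deliberately no a1.sinkgroups entry: the sinks are not grouped.
```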

On Tue, Oct 1, 2013 at 11:10 PM, Roshan Naik <[EMAIL PROTECTED]> wrote:

> My thoughts... You have 4 sinks draining the same channel and each has a
> batch size of 1000. Since they will contend on the same channel and,
> *assuming* events are evenly distributed among the sinks, there is potential
> for some starvation in the sinks, as their batch sizes may not be reached
> until about 4 batches are inserted by the source. I don't know if there is
> a good rule of thumb here.
> Try these:
> - See if a sink batch size of 250 helps.
> - Use a single Avro sink with a batch size of 1k instead of 4.
> - Replace the Avro sink with the null sink on the first agent and take
> a measurement. It would be good to ensure the spool source is not the
> bottleneck.
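
The first and third suggestions above could be tried with configuration fragments along these lines. This is only a sketch: the agent, channel and sink names (a1, c1, k1) are hypothetical, and the two options are alternatives to be measured separately, not combined.

```properties
# Option A: keep the Avro sinks but lower each sink's batch size to 250.
# (Repeat for every Avro sink on the agent.)
a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.batch-size = 250

# Option B: temporarily replace the Avro sink with a null sink, which
# discards events, to measure how fast the source and channel run on
# their own. Uncomment these lines and comment out Option A to test.
# a1.sinks.k1.type = null
# a1.sinks.k1.channel = c1
```

If throughput with the null sink is not much higher than with the Avro sink, the bottleneck is more likely the spool source or the channel than the Avro hop.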