Flume >> mail # user >> Flume latency issue


Karthikeyan Muthukumarasa... 2012-09-27, 13:55
Re: Flume latency issue
MK,
Which version of Flume are you using?

This design sounds reasonable to me.

Regarding your question about the 20-second latency, that should not happen.
Can you please confirm that you have observed this? If you have, please
attach your flume.conf and we can take a look. But it should be nearly
instantaneous - the batch sizes are controlled by the client or sink, and
in general the sink takes as much as it can from the channel, up to the
batch size, but if it takes less we continue immediately regardless. The
only time we "back off" is when the channel is empty *and* no events were
taken in the current batch - then the sink runner goes into an exponential
backoff.
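
To illustrate the behavior described above, here is a minimal Java sketch.
It is not Flume's actual sink runner code: the class name, the method names,
and the delay constants are all assumptions made up for illustration. It only
models the rule that a sink takes up to the batch size, continues immediately
if it took anything, and backs off exponentially only when it took nothing.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

public class SinkBackoffSketch {
    // Hypothetical values for illustration; not Flume's actual defaults.
    static final long INITIAL_BACKOFF_MS = 100;
    static final long MAX_BACKOFF_MS = 5_000;

    /** Takes up to batchSize events; returns fewer if the channel runs dry. */
    static List<String> takeBatch(Queue<String> channel, int batchSize) {
        List<String> batch = new ArrayList<>();
        while (batch.size() < batchSize) {
            String event = channel.poll();
            if (event == null) {
                break; // channel is empty, deliver what we have
            }
            batch.add(event);
        }
        return batch;
    }

    /**
     * Delay before the next take attempt: zero if anything was taken,
     * otherwise the previous delay doubled, capped at MAX_BACKOFF_MS.
     */
    static long nextDelayMs(int eventsTaken, long previousDelayMs) {
        if (eventsTaken > 0) {
            return 0; // partial or full batch: continue immediately
        }
        if (previousDelayMs == 0) {
            return INITIAL_BACKOFF_MS; // first empty take
        }
        return Math.min(previousDelayMs * 2, MAX_BACKOFF_MS);
    }

    public static void main(String[] args) {
        Queue<String> channel = new ArrayDeque<>(Arrays.asList("e1", "e2", "e3"));
        long delay = 0;
        for (int i = 0; i < 5; i++) {
            List<String> batch = takeBatch(channel, 2);
            delay = nextDelayMs(batch.size(), delay);
            System.out.println("took " + batch.size()
                    + " events, next delay " + delay + " ms");
        }
    }
}
```

With a batch size of 2 and three queued events, the first two takes return 2
and 1 events with no delay. Only the following empty takes introduce a delay,
which doubles each time (100, 200, 400 ms in this sketch).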

Regards,
Mike

On Thu, Sep 27, 2012 at 6:55 AM, Karthikeyan Muthukumarasamy <
[EMAIL PROTECTED]> wrote:

> Hi,
> In my project, various applications and 3PPs write logs to their
> separate log files.
> There are two limitations with this:
> - the structure of the log messages is different in each log file
> - the log messages are in different files and I can't get a single
> time-sorted display of all log messages, which is important in some debug
> situations
>
> As a solution to this problem, I intend to:
> - use separate flume sources to tail various log files in the system
> - have interceptors for each type of flume source and convert all log
> messages to a common structure
> - all flume sinks will write to a localhost avro port
> - a separate flume source will read from the avro port on localhost
> - there will be a fan-out logic to post the data from that source to
> multiple channels
> - each channel is connected to a separate sink, such as a JMX sink, an
> HBase sink, etc.
>
> First of all, is this kind of usage of flume acceptable and is there
> anything I need to specifically take care of?
>
> I also notice that the consolidated avro source, which reads data from the
> avro port, receives data only in blocks from each source, with a latency of
> around 20 seconds. Is it possible to reduce this latency so that the
> consolidated avro source receives all events instantaneously, as they are
> logged to their log files?
>
> Thanks in advance!
> MK
>
>
Karthikeyan Muthukumarasa... 2012-09-28, 04:23
Mike Percy 2012-09-28, 22:01