Yes, I am using the memory channel. The boxes have 16 GB of RAM, and we're running
with an 8 GB heap each.
Each memory channel has a capacity of 1 million events (4 sinks, so 4 million in total),
and the transaction size is 10K per sink.
The batch size is also set to 10K. We've played with these values, but the
issue persists.
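For reference, here is a rough sketch of what those settings look like in an agent properties file. The names a1, c1 and k1 and the HDFS path are placeholders, not our real ones, and I'm assuming the batch size refers to the HDFS sink's hdfs.batchSize. Each box has four channel/sink pairs along these lines:

```properties
# Placeholder agent/component names (a1, c1, k1); not the real production names.
a1.channels = c1
a1.sinks = k1

# Memory channel: holds up to 1 million events, 10K events per transaction.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000000
a1.channels.c1.transactionCapacity = 10000

# HDFS sink draining the channel in batches of 10K events.
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
# Placeholder path, for illustration only.
a1.sinks.k1.hdfs.path = /flume/events
a1.sinks.k1.hdfs.batchSize = 10000
```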
Can you elaborate on what these issues are and what exactly takes place?
A step-by-step deconstruction of the problem would really help me
understand what's going on.
On Fri, Oct 4, 2013 at 11:57 AM, Hari Shreedharan <[EMAIL PROTECTED]> wrote:
> Are you using the Memory Channel on the agents? We do know that there
> might be some issues when the memory channel is used when the heap is
> pretty large. We are working to resolve it.
> On Friday, October 4, 2013 at 5:34 AM, Bhaskar V. Karambelkar wrote:
> We have a client JVM process that uses the Flume client SDK
> (NettyAvroRpcClient) to push events to a Flume source, from which they
> ultimately land in HDFS.
> In production we're still on Flume 1.3, and one thing we find consistently
> is that under heavy load the client JVM hangs. We've narrowed it down to
> the Flume client SDK.
> I suspect that a long GC pause in the Flume agent causes disconnects in
> the Avro client, which can lead to client JVM hangs.
> We're getting events at a rate of about 25,000/sec, which are
> distributed across 8 clients, and they in turn forward them to 24 Flume
> sources (6 boxes with 4 sources each). Each source writes to HDFS
> (i.e. 24 HDFS sinks as well).
> I tried switching the Flume agents' GC to G1, which sort of helped: earlier
> the client JVM hangs were about 5 mins apart, and now they're about 10 mins
> apart, so there is some progress.
> The question is how to completely eliminate these hangs. The hang is so bad
> that I can't even get the JVM to do a thread dump, so I have no way to
> investigate what caused the JVM to hang.
> Could upgrading to 1.4 and using the Thrift source help?