Re: ExecSource->MemoryChannel->AvroSink->AvroSource->FileChannel->HDFSSink throughput question
Perfect.
Again, thank you so much for your time. :)
The timeout increase bought me some time, but it still ended up with the
Exception.  I love the multiple sinks idea... I should have thought of that
:)

Chris
On Mon, Feb 4, 2013 at 8:22 PM, Juhani Connolly <
[EMAIL PROTECTED]> wrote:

>  Hey
>
>
> On 02/02/2013 01:40 AM, Chris Neal wrote:
>
> Thanks for the help Juhani :)  I'll take a look with Ganglia and see what
> things look like.
>
>  Any thoughts on keeping the ExecSource.batchSize,
> MemoryChannel.transactionCapacity, AvroSink.batch-size, and
> HDFSSink.batchSize the same?
>
>   It's not really important, so long as the avro batch size is less than
> or equal to the channel transaction capacity. The HDFS sink's batch size is
> independent of them both.
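>
> For illustration (the agent and component names below are just
> placeholders, not from your actual config), the relationship is simply:
>
>   # batch sizes should not exceed the channel's transactionCapacity;
>   # the HDFS sink's hdfs.batchSize on the second tier is independent of these
>   agent1.sources.execSrc.batchSize = 1000
>   agent1.channels.memCh.transactionCapacity = 1000
>   agent1.sinks.avroSink.batch-size = 1000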
>
>
>   I looked at the MemoryChannel code, and noticed that there is a timeout
> parameter passed to doCommit(), where the exception is being thrown.  Just
> for fun, I increased it from the default to 10 seconds, and now things are
> running smoothly with the same config as before.  It's been running for
> about 24 hours now.  A step in the right direction anyway! :)
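>
> (For reference, I believe that timeout corresponds to the memory channel's
> keep-alive setting, so the same change can be made in the config alone.
> A sketch, with a placeholder channel name:
>
>   agent1.channels.memCh.type = memory
>   agent1.channels.memCh.keep-alive = 10
>
> The default is 3 seconds.)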
>
>
> If that fixed it, it sounds like your data is just very bursty and
> sometimes gets fed in faster than it's drained out. The solution to that
> would be either to enlarge your temporary buffer (the mem channel), to
> throttle the incoming data (probably not possible), or to increase drain
> speed (more sinks running in parallel).
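>
> A sketch of that last option, with placeholder names: several avro sinks
> can take from the same channel, each running in its own thread.
>
>   agent1.sinks = avroSink1 avroSink2
>   agent1.sinks.avroSink1.type = avro
>   agent1.sinks.avroSink1.channel = memCh
>   agent1.sinks.avroSink1.hostname = collector-host
>   agent1.sinks.avroSink1.port = 4141
>   agent1.sinks.avroSink2.type = avro
>   agent1.sinks.avroSink2.channel = memCh
>   agent1.sinks.avroSink2.hostname = collector-host
>   agent1.sinks.avroSink2.port = 4141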
>
>
>  Thanks again.
> Chris
>
> On Thu, Jan 31, 2013 at 8:12 PM, Juhani Connolly <
> [EMAIL PROTECTED]> wrote:
>
>>  Hi Chris,
>>
>> The most likely cause of that error is that the sinks are draining
>> events more slowly than your sources are feeding in fresh data. Over time
>> this will fill up the capacity of your memory channel, which will then start
>> refusing additional put requests.
>>
>> You can confirm this by connecting with jmx or ganglia.
>>
>> If the write is extremely bursty, it's possible that it's just
>> temporarily going over the sink consumption rate, and increasing the
>> channel capacity could work. Otherwise, increasing the avro batch size, or
>> adding additional avro sinks (more threads) may also help. I think that
>> setting up ganglia monitoring and looking at the incoming and outgoing
>> event counts and channel fill states helps a lot in diagnosing these
>> bottlenecks; you should look into doing that.
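>>
>> (One way to turn that on, as a rough sketch with a placeholder gmond
>> host/port, is via JVM properties when starting the agent:
>>
>>   flume-ng agent -n agent1 -f flume.conf \
>>     -Dflume.monitoring.type=ganglia \
>>     -Dflume.monitoring.hosts=ganglia-host:8649
>>
>> Plain JMX also works, with the usual -Dcom.sun.management.jmxremote flags.)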
>>
>>
>> On 02/01/2013 02:01 AM, Chris Neal wrote:
>>
>> Hi all.
>>
>>  I need some thoughts on sizing/tuning of the above (common) route in
>> FlumeNG to maximize throughput.  Here is my setup:
>>
>>  *Source JVM (ExecSource/MemoryChannel/AvroSink):*
>> -Xmx4g
>> -Xms4g
>> -XX:MaxDirectMemorySize=256m
>>
>>  Number of ExecSources in config:  124 (yes, it's a ton.  Can't do
>> anything about it :)  The write rate to the source files is fairly fast and
>> bursty.
>>
>>  ExecSource.batchSize = 1000
>> (so, when all 124 tail -F instances get 1000 events, they all dump to the
>> memory channel)
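>>
>>  (Each of those sources is configured roughly like this, with placeholder
>> path and names:
>>
>>   agent1.sources.src1.type = exec
>>   agent1.sources.src1.command = tail -F /var/log/app/some.log
>>   agent1.sources.src1.batchSize = 1000
>>   agent1.sources.src1.channels = memCh
>>
>> repeated 124 times with different names and commands.)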
>>
>>  MemoryChannel.capacity = 1000000
>> MemoryChannel.transactionCapacity = 1000
>> (somewhat unclear on what this is.  Docs say "The number of events stored
>> in the channel per transaction", but what is a "transaction" to a
>> MemoryChannel?)
>>
>>  AvroSink.batchSize = 1000
>>
>>  *Destination JVM (AvroSource/FileChannel/HDFSSink)*
>> (Cluster of two JVMs on two servers, each configured the same as per
>> below)
>> -Xms2g
>> -Xmx2g
>> -XX:MaxDirectMemorySize is not defined, so whatever the default is
>>
>>  AvroSource.threads = 64
>> FileChannel.transactionCapacity = 1000
>> FileChannel.capacity = 32000000
>> HDFSSink.batchSize = 1000
>> HDFSSink.threadPoolSize = 64
>>
>>  With this configuration, in about 5 minutes, I get the common Exception:
>>
>>  "Space for commit to queue couldn't be acquired Sinks are likely not
>> keeping up with sources, or the buffer size is too tight"
>>
>>  on the Source JVM.  It is nowhere near the 4g max, rather only at