Flume, mail # user - Lock contention in FileChannel


Re: Lock contention in FileChannel
Pankaj Gupta 2013-08-14, 03:16
I did try increasing the number of FileChannels. At 2 FileChannels per disk,
performance seemed to be 25% better. At 4 FileChannels per disk, performance
dropped below even that of 1 FileChannel per disk. I will try increasing the
dataDirs tomorrow.
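
For reference, the dataDirs change Brock suggests in the quoted reply below might look something like this for ch1. This is only a sketch: the three subdirectory names are made up for illustration, and the other lines are copied from the config quoted further down.

```properties
# Sketch only: the data1..data3 subdirectories are hypothetical names.
agent1.channels.ch1.type = FILE
agent1.channels.ch1.checkpointDir = /flume1/checkpoint
# dataDirs takes a comma-separated list of directories
agent1.channels.ch1.dataDirs = /flume1/data1,/flume1/data2,/flume1/data3
```

The remaining ch1 settings (maxFileSize, capacity, transactionCapacity) would stay as they are, and ch2 would get the same treatment under /flume2.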
On Tue, Aug 13, 2013 at 8:06 PM, Brock Noland <[EMAIL PROTECTED]> wrote:

> dataDirs is a comma-separated list. Try 3-4 directories and then rerun the
> same test.
> On Aug 13, 2013 9:58 PM, "Pankaj Gupta" <[EMAIL PROTECTED]> wrote:
>
>> Both disks were at around 15-25%.
>>
>>
>> On Tue, Aug 13, 2013 at 7:54 PM, Brock Noland <[EMAIL PROTECTED]> wrote:
>>
>>> Gotcha. When you run the test, what is the disk utilization percentage?
>>> iostat can be used for this.
>>> On Aug 13, 2013 9:47 PM, "Pankaj Gupta" <[EMAIL PROTECTED]> wrote:
>>>
>>>> Those are the boxes we want to collect data from. They run flume and
>>>> send data through their avro sinks to the avro source on this box. We are
>>>> getting data at a pretty good rate and the problem is in fact that the
>>>> events don't drain from the FileChannel fast enough and the channel fill
>>>> percentage keeps getting higher.
>>>>
>>>>
>>>> On Tue, Aug 13, 2013 at 7:41 PM, Brock Noland <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> What is sending the events to the avro source?
>>>>> On Aug 13, 2013 9:34 PM, "Pankaj Gupta" <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Here's the config:
>>>>>> # define channels, one for each disk
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> agent1.channels.ch1.type = FILE
>>>>>> agent1.channels.ch1.checkpointDir = /flume1/checkpoint
>>>>>> agent1.channels.ch1.dataDirs = /flume1/data
>>>>>> agent1.channels.ch1.maxFileSize = 375809638400
>>>>>> agent1.channels.ch1.capacity = 75000000
>>>>>> agent1.channels.ch1.transactionCapacity = 4000
>>>>>>
>>>>>> agent1.channels.ch2.type = FILE
>>>>>> agent1.channels.ch2.checkpointDir = /flume2/checkpoint
>>>>>> agent1.channels.ch2.dataDirs = /flume2/data
>>>>>> agent1.channels.ch2.maxFileSize = 375809638400
>>>>>> agent1.channels.ch2.capacity = 75000000
>>>>>> agent1.channels.ch2.transactionCapacity = 4000
>>>>>>
>>>>>>
>>>>>>
>>>>>> # Define an Avro source named avroSource1
>>>>>> # Each sink can connect to only one channel.
>>>>>> # Connect it to channel ch1. Load balance it to 2 avroSinks
>>>>>>
>>>>>>
>>>>>> agent1.sources.avroSource1.channels = ch1
>>>>>> agent1.sources.avroSource1.type = avro
>>>>>> agent1.sources.avroSource1.bind = 0.0.0.0
>>>>>> agent1.sources.avroSource1.port = <port>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> agent1.sinks.avroSink1-1-1.type = avro
>>>>>> agent1.sinks.avroSink1-1-1.channel = ch1
>>>>>> agent1.sinks.avroSink1-1-1.hostname = <hostname>
>>>>>> agent1.sinks.avroSink1-1-1.port = <port>
>>>>>> agent1.sinks.avroSink1-1-1.connect-timeout = 300000
>>>>>> agent1.sinks.avroSink1-1-1.batch-size = 4000
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> agent1.sinks.avroSink1-2-1.type = avro
>>>>>> agent1.sinks.avroSink1-2-1.channel = ch1
>>>>>> agent1.sinks.avroSink1-2-1.hostname = <hostname>
>>>>>> agent1.sinks.avroSink1-2-1.port = <port>
>>>>>> agent1.sinks.avroSink1-2-1.connect-timeout = 300000
>>>>>> agent1.sinks.avroSink1-2-1.batch-size = 4000
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> agent1.sinks.avroSink1-3-1.type = avro
>>>>>> agent1.sinks.avroSink1-3-1.channel = ch1
>>>>>> agent1.sinks.avroSink1-3-1.hostname = <hostname>
>>>>>> agent1.sinks.avroSink1-3-1.port = <port>
>>>>>> agent1.sinks.avroSink1-3-1.connect-timeout = 300000
>>>>>> agent1.sinks.avroSink1-3-1.batch-size = 4000
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> agent1.sinks.avroSink1-4-1.type = avro
>>>>>> agent1.sinks.avroSink1-4-1.channel = ch1
>>>>>> agent1.sinks.avroSink1-4-1.hostname = <hostname>
>>>>>> agent1.sinks.avroSink1-4-1.port = <port>
>>>>>> agent1.sinks.avroSink1-4-1.connect-timeout = 300000
>>>>>> agent1.sinks.avroSink1-4-1.batch-size = 4000
>>>>>>
>>>>>>
>>>>>>
>>>>>> # Add the sink groups; load-balance between each group of sinks which
>>>>>> # round robin between different hops
>>>>>> agent1.sinkgroups.group1.sinks = avroSink1-1-1 avroSink1-2-1
Pankaj Gupta | Software Engineer

*BrightRoll, Inc. *| Smart Video Advertising | www.brightroll.com
United States | Canada | United Kingdom | Germany
We're hiring<http://newton.newtonsoftware.com/career/CareerHome.action?clientId=8a42a12b3580e2060135837631485aa7>
!