Flume >> mail # user >> Lock contention in FileChannel


Re: Lock contention in FileChannel
dataDirs is a comma-separated list. Try 3-4 directories and then run the
same test.
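
A minimal sketch of that suggestion, applied to the ch1 definition quoted below. The directory names data1 through data3 are hypothetical (only /flume1/data appears in the original config), and all other settings are copied unchanged:

```properties
# Hypothetical layout: three data directories instead of one.
# Paths are illustrative assumptions, not taken from the original setup.
agent1.channels.ch1.type = FILE
agent1.channels.ch1.checkpointDir = /flume1/checkpoint
agent1.channels.ch1.dataDirs = /flume1/data1,/flume1/data2,/flume1/data3
agent1.channels.ch1.maxFileSize = 375809638400
agent1.channels.ch1.capacity = 75000000
agent1.channels.ch1.transactionCapacity = 4000
```

The same change would apply to ch2 with its own directories. Whether this relieves the contention here is something the repeated test would have to show.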
On Aug 13, 2013 9:58 PM, "Pankaj Gupta" <[EMAIL PROTECTED]> wrote:

> Both disks were at around 15-25%.
>
>
> On Tue, Aug 13, 2013 at 7:54 PM, Brock Noland <[EMAIL PROTECTED]> wrote:
>
>> Gotcha. When you run the test, what is the disk utilization percentage?
>> iostat can be used for this.
>> On Aug 13, 2013 9:47 PM, "Pankaj Gupta" <[EMAIL PROTECTED]> wrote:
>>
>>> Those are the boxes we want to collect data from. They run flume and
>>> send data through their avro sinks to the avro source on this box. We are
>>> getting data at a pretty good rate and the problem is in fact that the
>>> events don't drain from the FileChannel fast enough and the channel fill
>>> percentage keeps getting higher.
>>>
>>>
>>> On Tue, Aug 13, 2013 at 7:41 PM, Brock Noland <[EMAIL PROTECTED]> wrote:
>>>
>>>> What is sending the events to the avro source?
>>>> On Aug 13, 2013 9:34 PM, "Pankaj Gupta" <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Here's the config:
>>>>> # define channels, one for each disk
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> agent1.channels.ch1.type = FILE
>>>>> agent1.channels.ch1.checkpointDir = /flume1/checkpoint
>>>>> agent1.channels.ch1.dataDirs = /flume1/data
>>>>> agent1.channels.ch1.maxFileSize = 375809638400
>>>>> agent1.channels.ch1.capacity = 75000000
>>>>> agent1.channels.ch1.transactionCapacity = 4000
>>>>>
>>>>> agent1.channels.ch2.type = FILE
>>>>> agent1.channels.ch2.checkpointDir = /flume2/checkpoint
>>>>> agent1.channels.ch2.dataDirs = /flume2/data
>>>>> agent1.channels.ch2.maxFileSize = 375809638400
>>>>> agent1.channels.ch2.capacity = 75000000
>>>>> agent1.channels.ch2.transactionCapacity = 4000
>>>>>
>>>>>
>>>>>
>>>>> # Define an Avro source named avroSource1
>>>>> # Each sink can connect to only one channel.
>>>>> # Connect it to channel ch1. Load balance it to 2 avroSinks
>>>>>
>>>>>
>>>>> agent1.sources.avroSource1.channels = ch1
>>>>> agent1.sources.avroSource1.type = avro
>>>>> agent1.sources.avroSource1.bind = 0.0.0.0
>>>>> agent1.sources.avroSource1.port = <port>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> agent1.sinks.avroSink1-1-1.type = avro
>>>>> agent1.sinks.avroSink1-1-1.channel = ch1
>>>>> agent1.sinks.avroSink1-1-1.hostname = <hostname>
>>>>> agent1.sinks.avroSink1-1-1.port = <port>
>>>>> agent1.sinks.avroSink1-1-1.connect-timeout = 300000
>>>>> agent1.sinks.avroSink1-1-1.batch-size = 4000
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> agent1.sinks.avroSink1-2-1.type = avro
>>>>> agent1.sinks.avroSink1-2-1.channel = ch1
>>>>> agent1.sinks.avroSink1-2-1.hostname = <hostname>
>>>>> agent1.sinks.avroSink1-2-1.port = <port>
>>>>> agent1.sinks.avroSink1-2-1.connect-timeout = 300000
>>>>> agent1.sinks.avroSink1-2-1.batch-size = 4000
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> agent1.sinks.avroSink1-3-1.type = avro
>>>>> agent1.sinks.avroSink1-3-1.channel = ch1
>>>>> agent1.sinks.avroSink1-3-1.hostname = <hostname>
>>>>> agent1.sinks.avroSink1-3-1.port = <port>
>>>>> agent1.sinks.avroSink1-3-1.connect-timeout = 300000
>>>>> agent1.sinks.avroSink1-3-1.batch-size = 4000
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> agent1.sinks.avroSink1-4-1.type = avro
>>>>> agent1.sinks.avroSink1-4-1.channel = ch1
>>>>> agent1.sinks.avroSink1-4-1.hostname = <hostname>
>>>>> agent1.sinks.avroSink1-4-1.port = <port>
>>>>> agent1.sinks.avroSink1-4-1.connect-timeout = 300000
>>>>> agent1.sinks.avroSink1-4-1.batch-size = 4000
>>>>>
>>>>>
>>>>>
>>>>> # Add the sink groups; load-balance between each group of sinks which round robin between different hops
>>>>> agent1.sinkgroups.group1.sinks = avroSink1-1-1 avroSink1-2-1 avroSink1-3-1 avroSink1-4-1
>>>>> agent1.sinkgroups.group1.processor.type = load_balance
>>>>> agent1.sinkgroups.group1.processor.selector = ROUND_ROBIN
>>>>> agent1.sinkgroups.group1.processor.backoff = true
>>>>>
>>>>>
>>>>> #End of set
>>>>>
>>>>> # Define an Avro source named avroSource2
>>>>> # Each sink can connect to only one channel.
>>>>> # Connect it to channel ch2. Load balance it to 2 avroSinks