Flume >> mail # user >> Lock contention in FileChannel


Re: Lock contention in FileChannel
Gotcha. When you run the test, what is the disk utilization percentage?
iostat can be used for this.
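
For reference, iostat's extended output includes a %util column per device. On Linux the same figure can be roughly approximated from two samples of /proc/diskstats, where the 13th whitespace-separated field is the cumulative time in milliseconds the device spent doing I/O. The sketch below assumes that layout; the two sample lines and the 5-second interval are made-up illustration values, not measurements from this thread.

```python
def parse_io_ms(diskstats_line):
    """Return (device, ms spent doing I/O) from one /proc/diskstats line."""
    fields = diskstats_line.split()
    # fields[2] is the device name, fields[12] is cumulative ms doing I/O
    return fields[2], int(fields[12])


def util_percent(before_line, after_line, interval_s):
    """Approximate disk utilization (%) between two samples of one device."""
    dev_a, ms_a = parse_io_ms(before_line)
    dev_b, ms_b = parse_io_ms(after_line)
    if dev_a != dev_b:
        raise ValueError("samples are for different devices")
    # Busy time during the interval divided by the interval length
    return 100.0 * (ms_b - ms_a) / (interval_s * 1000.0)


# Hypothetical samples taken 5 seconds apart (values invented for illustration)
BEFORE = "8 0 sda 1000 0 8000 500 2000 0 16000 900 0 12000 1400"
AFTER = "8 0 sda 1100 0 8800 550 2600 0 20800 1200 0 16500 1750"

if __name__ == "__main__":
    print(util_percent(BEFORE, AFTER, 5))
```

A value that stays close to 100 for the disks backing /flume1 and /flume2 would suggest the channel is disk-bound rather than limited by locking.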
On Aug 13, 2013 9:47 PM, "Pankaj Gupta" <[EMAIL PROTECTED]> wrote:

> Those are the boxes we want to collect data from. They run flume and send
> data through their avro sinks to the avro source on this box. We are
> getting data at a pretty good rate and the problem is in fact that the
> events don't drain from the FileChannel fast enough and the channel fill
> percentage keeps getting higher.
>
>
> On Tue, Aug 13, 2013 at 7:41 PM, Brock Noland <[EMAIL PROTECTED]> wrote:
>
>> What is sending the events to the avro source?
>> On Aug 13, 2013 9:34 PM, "Pankaj Gupta" <[EMAIL PROTECTED]> wrote:
>>
>>> Here's the config:
>>> # define channels, one for each disk
>>>
>>>
>>>
>>>
>>> agent1.channels.ch1.type = FILE
>>> agent1.channels.ch1.checkpointDir = /flume1/checkpoint
>>> agent1.channels.ch1.dataDirs = /flume1/data
>>> agent1.channels.ch1.maxFileSize = 375809638400
>>> agent1.channels.ch1.capacity = 75000000
>>> agent1.channels.ch1.transactionCapacity = 4000
>>>
>>> agent1.channels.ch2.type = FILE
>>> agent1.channels.ch2.checkpointDir = /flume2/checkpoint
>>> agent1.channels.ch2.dataDirs = /flume2/data
>>> agent1.channels.ch2.maxFileSize = 375809638400
>>> agent1.channels.ch2.capacity = 75000000
>>> agent1.channels.ch2.transactionCapacity = 4000
>>>
>>>
>>>
>>> # Define an Avro source named avroSource1
>>> # Each sink can connect to only one channel.
>>> # Connect it to channel ch1. Load balance it to 2 avroSinks
>>>
>>>
>>> agent1.sources.avroSource1.channels = ch1
>>> agent1.sources.avroSource1.type = avro
>>> agent1.sources.avroSource1.bind = 0.0.0.0
>>> agent1.sources.avroSource1.port = <port>
>>>
>>>
>>>
>>>
>>> agent1.sinks.avroSink1-1-1.type = avro
>>> agent1.sinks.avroSink1-1-1.channel = ch1
>>> agent1.sinks.avroSink1-1-1.hostname = <hostname>
>>> agent1.sinks.avroSink1-1-1.port = <port>
>>> agent1.sinks.avroSink1-1-1.connect-timeout = 300000
>>> agent1.sinks.avroSink1-1-1.batch-size = 4000
>>>
>>>
>>>
>>>
>>> agent1.sinks.avroSink1-2-1.type = avro
>>> agent1.sinks.avroSink1-2-1.channel = ch1
>>> agent1.sinks.avroSink1-2-1.hostname = <hostname>
>>> agent1.sinks.avroSink1-2-1.port = <port>
>>> agent1.sinks.avroSink1-2-1.connect-timeout = 300000
>>> agent1.sinks.avroSink1-2-1.batch-size = 4000
>>>
>>>
>>>
>>>
>>> agent1.sinks.avroSink1-3-1.type = avro
>>> agent1.sinks.avroSink1-3-1.channel = ch1
>>> agent1.sinks.avroSink1-3-1.hostname = <hostname>
>>> agent1.sinks.avroSink1-3-1.port = <port>
>>> agent1.sinks.avroSink1-3-1.connect-timeout = 300000
>>> agent1.sinks.avroSink1-3-1.batch-size = 4000
>>>
>>>
>>>
>>>
>>> agent1.sinks.avroSink1-4-1.type = avro
>>> agent1.sinks.avroSink1-4-1.channel = ch1
>>> agent1.sinks.avroSink1-4-1.hostname = <hostname>
>>> agent1.sinks.avroSink1-4-1.port = <port>
>>> agent1.sinks.avroSink1-4-1.connect-timeout = 300000
>>> agent1.sinks.avroSink1-4-1.batch-size = 4000
>>>
>>>
>>>
>>> #Add the sink groups; load-balance between each group of sinks which
>>> # round robin between different hops
>>> agent1.sinkgroups.group1.sinks = avroSink1-1-1 avroSink1-2-1 avroSink1-3-1 avroSink1-4-1
>>> agent1.sinkgroups.group1.processor.type = load_balance
>>> agent1.sinkgroups.group1.processor.selector = ROUND_ROBIN
>>> agent1.sinkgroups.group1.processor.backoff = true
>>>
>>>
>>> #End of set
>>>
>>> # Define an Avro source named avroSource2
>>> # Each sink can connect to only one channel.
>>> # Connect it to channel ch2. Load balance it to 2 avroSinks
>>>
>>>
>>> agent1.sources.avroSource2.channels = ch2
>>> agent1.sources.avroSource2.type = avro
>>> agent1.sources.avroSource2.bind = 0.0.0.0
>>> agent1.sources.avroSource2.port = <port>
>>>
>>>
>>>
>>>
>>> agent1.sinks.avroSink2-1-1.type = avro
>>> agent1.sinks.avroSink2-1-1.channel = ch2
>>> agent1.sinks.avroSink2-1-1.hostname = <hostname>
>>> agent1.sinks.avroSink2-1-1.port = <port>
>>> agent1.sinks.avroSink2-1-1.connect-timeout = 300000
>>> agent1.sinks.avroSink2-1-1.batch-size = 4000
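
One thing worth checking in a config like the one quoted above is that no sink's batch-size exceeds the transactionCapacity of the channel it drains, since a sink takes its whole batch inside a single channel transaction. Here both are 4000, so the quoted settings look consistent. The following is only a rough sketch of such a check: the helper names are hypothetical, the parser handles plain `key = value` lines only (not the full Java properties syntax), and SAMPLE is a three-line excerpt of the quoted config.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def oversized_batches(props, agent="agent1"):
    """List sinks whose batch-size exceeds their channel's transactionCapacity.

    Each entry is (sink, batch_size, channel, transaction_capacity).
    """
    prefix = agent + ".sinks."
    suffix = ".batch-size"
    problems = []
    for key, value in props.items():
        if key.startswith(prefix) and key.endswith(suffix):
            sink = key[len(prefix):-len(suffix)]
            channel = props.get(prefix + sink + ".channel")
            cap = props.get(f"{agent}.channels.{channel}.transactionCapacity")
            if cap is not None and int(value) > int(cap):
                problems.append((sink, int(value), channel, int(cap)))
    return problems


# Excerpt of the quoted config (one channel, one sink)
SAMPLE = """
agent1.channels.ch1.transactionCapacity = 4000
agent1.sinks.avroSink1-1-1.channel = ch1
agent1.sinks.avroSink1-1-1.batch-size = 4000
"""

if __name__ == "__main__":
    print(oversized_batches(parse_properties(SAMPLE)))
```

An empty list means every sink's batch fits in its channel's transaction; sinks with no explicit batch-size or whose channel has no explicit transactionCapacity are simply skipped by this sketch.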