Flume >> mail # user >> Lock contention in FileChannel


Pankaj Gupta 2013-08-13, 23:13
Hari Shreedharan 2013-08-13, 23:39
Pankaj Gupta 2013-08-14, 00:01
Hari Shreedharan 2013-08-14, 00:14
Brock Noland 2013-08-14, 00:51
Pankaj Gupta 2013-08-14, 02:06

Hari Shreedharan 2013-08-14, 02:18
Re: Lock contention in FileChannel
Can you share your conf file?
On Aug 13, 2013 9:19 PM, "Hari Shreedharan" <[EMAIL PROTECTED]>
wrote:

>  Even though the writes are done per batch, they don't go to disk
> right away - commits are the only operations that actually cause an fsync,
> which is when writes actually reach disk.
>
>
> Thanks,
> Hari
>
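Hari's point above - per-event writes only land in the OS page cache, and the commit-time fsync is what makes the whole batch durable - can be sketched with plain java.nio. This is a minimal standalone sketch with made-up file names, not Flume's actual code:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class BatchCommitSketch {
    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("channel-", ".log");
        int written = 0;
        try (FileChannel ch = FileChannel.open(log, StandardOpenOption.WRITE)) {
            // Per-event writes: these return quickly because the bytes
            // only reach the OS page cache, not the disk itself.
            for (int i = 0; i < 100; i++) {
                written += ch.write(ByteBuffer.wrap(("event-" + i + "\n").getBytes()));
            }
            // Commit: one fsync makes the whole batch durable at once,
            // instead of paying for a disk flush per event.
            ch.force(false);
        }
        System.out.println(written);
        Files.delete(log);
    }
}
```

Calling force(false) after every write instead would serialize each event behind a disk flush, which is exactly the overhead the write/commit split avoids.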
> On Tuesday, August 13, 2013 at 7:06 PM, Pankaj Gupta wrote:
>
> Looking at the code it seems like the lock and the I/O are done per event
> and not per batch. Is that correct? If so, there is a lot of overhead per
> event. The throughput I'm seeing is 1-1.5 MB/s per disk, which is way below
> the sequential read/write capacity of the disk (easily over 50 MB/s). Adding
> more sinks doesn't help; they just block waiting for the queue to become
> free. CPU usage is 20%, and there is enough RAM for the page cache that no
> read is going to disk. The queue seems to be the bottleneck. What throughput
> should I expect per disk?
>
>
> On Tue, Aug 13, 2013 at 5:51 PM, Brock Noland <[EMAIL PROTECTED]> wrote:
>
> The lock is per file. Adding more directories to the channel will cause
> more files to be created. Of course you'll need additional disks behind
> those directories to see any performance increase.
>
>
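Brock's suggestion maps onto the FileChannel configuration roughly as follows. The agent and channel names here are hypothetical, and the directory paths are placeholders; `checkpointDir` and `dataDirs` are the actual FileChannel property names:

```
# Hypothetical agent "agent" and channel "fc". Spreading dataDirs
# across physically separate disks gives each directory its own log
# file - and therefore its own lock - so writes can proceed in parallel.
agent.channels.fc.type = file
agent.channels.fc.checkpointDir = /flume/checkpoint
agent.channels.fc.dataDirs = /disk1/flume/data,/disk2/flume/data,/disk3/flume/data
```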
> On Tue, Aug 13, 2013 at 7:14 PM, Hari Shreedharan <
> [EMAIL PROTECTED]> wrote:
>
>  Yes, I/O is done inside locks to avoid multiple takes and puts getting
> written out at the same time. Even though Java makes sure the writes are
> serialized, Flume still needs to keep track of some counters etc., so the
> lock is required. Note that the lock you are talking about is in the
> LogFile class, which represents a single file - so even if the write is
> inside that lock (which is also inside that class itself), it does not
> cause any contention across files, because the lock is just preventing two
> I/O operations on the same file from happening at the same time.
>
>
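The "one lock per file" behavior Hari describes can be sketched outside Flume. This is a toy Writer class (not Flume's actual LogFile): each writer synchronizes on its own monitor to protect its own position counter, so threads appending to different files never contend - only concurrent appends to the same file serialize.

```java
// Toy stand-in for a per-file writer; not Flume's LogFile class.
class Writer {
    private long position = 0; // the counter the per-writer lock protects

    // synchronized on this Writer only: appends to *other* writers
    // (i.e. other files) are unaffected.
    synchronized long append(int len) {
        long off = position;
        position += len;
        return off;
    }
}

public class PerFileLockSketch {
    public static void main(String[] args) throws InterruptedException {
        Writer a = new Writer();
        Writer b = new Writer();
        // Two threads, each hammering its own writer: no shared lock,
        // so neither blocks the other.
        Thread t1 = new Thread(() -> { for (int i = 0; i < 1000; i++) a.append(10); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 1000; i++) b.append(10); });
        t1.start(); t2.start();
        t1.join(); t2.join();
        // append(0) just reads back each writer's final position.
        System.out.println(a.append(0) + " " + b.append(0));
    }
}
```

Contention only appears when multiple sink threads drain the same file - which is why adding data directories (and disks), rather than adding sinks, is what raises throughput.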
> Thanks,
> Hari
>
> On Tuesday, August 13, 2013 at 5:01 PM, Pankaj Gupta wrote:
>
> It seems like some I/O is done inside the lock, which means that the time
> to take the lock is proportional to the time spent on I/O, and thus it
> becomes a problem. I apologize in advance if I am wrong, but the call stack
> and behavior I'm seeing seem to suggest that. Specifically, it seems that
> we do a write while inside take:
> "SinkRunner-PollingRunner-LoadBalancingSinkProcessor" prio=10 tid=0x00007f857338c800 nid=0x404a runnable [0x00007f821b2f1000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.NativeThread.current(Native Method)
>         at sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:27)
>         at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:194)
>         - locked <0x00000005190ec998> (a java.lang.Object)
>         at org.apache.flume.channel.file.LogFile$Writer.write(LogFile.java:247)
>         at org.apache.flume.channel.file.LogFile$Writer.take(LogFile.java:212)
>         - locked <0x0000000519111590> (a org.apache.flume.channel.file.LogFileV3$Writer)
>         at org.apache.flume.channel.file.Log.take(Log.java:550)
>         at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:499)
>         at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
>         at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:95)
>         at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:330)
>         at org.apache.flume.sink.LoadBalancingSinkProcessor.process(LoadBalancingSinkProcessor.java:154)
>         at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:662)
>
>
>
> On Tue, Aug 13, 2013 at 4:39 PM, Hari Shreedharan <
> [EMAIL PROTECTED]> wrote:
>
> Since the channel is designed to make sure that events are not duplicated
> to multiple sinks, and to protect against corruption due to concurrency
> issues, we do need the locking in the channel's flume event queue. It

Pankaj Gupta 2013-08-14, 02:33
Brock Noland 2013-08-14, 02:41
Pankaj Gupta 2013-08-14, 02:46
Brock Noland 2013-08-14, 02:54
Pankaj Gupta 2013-08-14, 02:57
Brock Noland 2013-08-14, 03:06
Pankaj Gupta 2013-08-14, 03:16
Brock Noland 2013-08-14, 03:30
Pankaj Gupta 2013-08-14, 18:57
Pankaj Gupta 2013-08-14, 19:12
Pankaj Gupta 2013-08-14, 19:34
Hari Shreedharan 2013-08-14, 19:43
Pankaj Gupta 2013-08-14, 19:59
Pankaj Gupta 2013-08-15, 06:04
Hari Shreedharan 2013-08-14, 19:04
Pankaj Gupta 2013-08-14, 02:16