Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Flume >> mail # user >> Lock contention in FileChannel


+
Pankaj Gupta 2013-08-13, 23:13
+
Hari Shreedharan 2013-08-13, 23:39
Copy link to this message
-
Re: Lock contention in FileChannel
It seems like some i/o is done inside the lock, which means that time for
taking a lock is proportional to the time for i/o and thus it becomes a
problem. I apologize in advance if I am wrong but the call stack and
behavior I'm seeing seems to suggest that. Specifically, it seems that we
do a write while inside take:
"SinkRunner-PollingRunner-LoadBalancingSinkProcessor" prio=10
tid=0x00007f857338c800 nid=0x404a runnable [0x00007f821b2f1000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.NativeThread.current(Native Method)
        at sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:27)
        at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:194)
        - locked <0x00000005190ec998> (a java.lang.Object)
        at
org.apache.flume.channel.file.LogFile$Writer.write(LogFile.java:247)
        at
org.apache.flume.channel.file.LogFile$Writer.take(LogFile.java:212)
        - locked <0x0000000519111590> (a
org.apache.flume.channel.file.LogFileV3$Writer)
        at org.apache.flume.channel.file.Log.take(Log.java:550)
        at
org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:499)
        at
org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
        at
org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:95)
        at
org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:330)
        at
org.apache.flume.sink.LoadBalancingSinkProcessor.process(LoadBalancingSinkProcessor.java:154)
        at
org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:662)

On Tue, Aug 13, 2013 at 4:39 PM, Hari Shreedharan <[EMAIL PROTECTED]
> wrote:

> Since the channel is designed to make sure that events are not duplicated
> to multiple sinks, and to protect against corruption due to concurrency
> issues, we do not need the locking in the channel's flume event queue. It
> is unlikely that locking is what is causing performance issues because the
> channel is heavily I/O bound. If you take a series of thread dumps, you
> will probably see that those threads are moving forward and the ones
> reading/writing from/to disk are the ones which are slower. These locks are
> unlikely to hit performance much.
>
> Thanks,
> Hari
>
> On Tuesday, August 13, 2013 at 4:13 PM, Pankaj Gupta wrote:
>
> Hi,
>
> Spent some more time debugging issues with FileChannel. The problem seems
> to lock contention reading from FlumeEventQueue:
>
> I see a lot of threads like this:
> "SinkRunner-PollingRunner-LoadBalancingSinkProcessor" prio=10
> tid=0x00007f857b378800 nid=0x404d waiting for monitor entry
> [0x00007f821afee000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at
> org.apache.flume.channel.file.FlumeEventQueue.removeHead(FlumeEventQueue.java:117)
>         - waiting to lock <0x0000000518ee4c90> (a
> org.apache.flume.channel.file.FlumeEventQueue)
>         at
> org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:492)
>         at
> org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
>         at
> org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:95)
>         at
> org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:330)
>         at
> org.apache.flume.sink.LoadBalancingSinkProcessor.process(LoadBalancingSinkProcessor.java:154)
>         at
> org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:662)
>
>
> I have two file channels and 8 Avro Sinks per file channel. I added more
> sinks because they weren't draining fast enough. It seems like they send
> the batch then wait for an ack before sending again, thus sends are not
> pipelined and having more sinks seemed like a good way of getting some
> parallelism.
>
> Here's the full stack trace:
> 2013-08-13 15:30:32
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.13-b02 mixed mode):
*P* | (415) 677-9222 ext. 205 *F *| (415) 677-0895 | [EMAIL PROTECTED]

Pankaj Gupta | Software Engineer

*BrightRoll, Inc. *| Smart Video Advertising | www.brightroll.com
United States | Canada | United Kingdom | Germany
We're hiring<http://newton.newtonsoftware.com/career/CareerHome.action?clientId=8a42a12b3580e2060135837631485aa7>
!
+
Hari Shreedharan 2013-08-14, 00:14
+
Brock Noland 2013-08-14, 00:51
+
Pankaj Gupta 2013-08-14, 02:06
+
Hari Shreedharan 2013-08-14, 02:18
+
Brock Noland 2013-08-14, 02:22
+
Pankaj Gupta 2013-08-14, 02:33
+
Brock Noland 2013-08-14, 02:41
+
Pankaj Gupta 2013-08-14, 02:46
+
Brock Noland 2013-08-14, 02:54
+
Pankaj Gupta 2013-08-14, 02:57
+
Brock Noland 2013-08-14, 03:06
+
Pankaj Gupta 2013-08-14, 03:16
+
Brock Noland 2013-08-14, 03:30
+
Pankaj Gupta 2013-08-14, 18:57
+
Pankaj Gupta 2013-08-14, 19:12
+
Pankaj Gupta 2013-08-14, 19:34
+
Hari Shreedharan 2013-08-14, 19:43
+
Pankaj Gupta 2013-08-14, 19:59
+
Pankaj Gupta 2013-08-15, 06:04
+
Pankaj Gupta 2013-08-18, 04:43
+
Hari Shreedharan 2013-08-14, 19:04
+
Pankaj Gupta 2013-08-14, 02:16