Flume >> mail # user >> Lock contention in FileChannel

Re: Lock contention in FileChannel
It seems like some I/O is done inside the lock, which means that the time a
thread waits to take the lock is proportional to the time spent on I/O, and
thus the lock becomes a problem. I apologize in advance if I am wrong, but the
call stack and behavior I'm seeing seem to suggest that. Specifically, it
seems that we do a write while inside take:
"SinkRunner-PollingRunner-LoadBalancingSinkProcessor" prio=10
tid=0x00007f857338c800 nid=0x404a runnable [0x00007f821b2f1000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.NativeThread.current(Native Method)
        at sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:27)
        at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:194)
        - locked <0x00000005190ec998> (a java.lang.Object)
        - locked <0x0000000519111590> (a
        at org.apache.flume.channel.file.Log.take(Log.java:550)
        at java.lang.Thread.run(Thread.java:662)
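
To illustrate the concern, here is a minimal sketch of why I/O under a shared lock makes lock wait time scale with I/O time. It is not Flume's code: the class `LockIoSketch`, the `QUEUE_LOCK` object, and `simulatedWrite()` (a `Thread.sleep` standing in for a disk write) are all hypothetical, and the thread and iteration counts are arbitrary.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Flume code: compares doing slow "I/O" while
// holding a shared lock against doing it after the lock is released.
public class LockIoSketch {
    private static final Object QUEUE_LOCK = new Object();
    static final int THREADS = 4;           // assumed number of sink threads
    static final int TAKES_PER_THREAD = 5;  // assumed takes per thread
    static final long IO_MILLIS = 20;       // assumed cost of one write

    // Stand-in for a disk write; a real write would not be a fixed sleep.
    static void simulatedWrite() throws InterruptedException {
        Thread.sleep(IO_MILLIS);
    }

    static void take(boolean ioInsideLock) throws InterruptedException {
        if (ioInsideLock) {
            synchronized (QUEUE_LOCK) {
                // Every other thread waits here for the full write.
                simulatedWrite();
            }
        } else {
            synchronized (QUEUE_LOCK) {
                // Only cheap in-memory bookkeeping would happen here.
            }
            simulatedWrite(); // writes from different threads can overlap
        }
    }

    // Runs all worker threads to completion and returns elapsed milliseconds.
    static long run(boolean ioInsideLock) throws InterruptedException {
        List<Thread> workers = new ArrayList<>();
        long start = System.nanoTime();
        for (int i = 0; i < THREADS; i++) {
            Thread t = new Thread(() -> {
                try {
                    for (int j = 0; j < TAKES_PER_THREAD; j++) {
                        take(ioInsideLock);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("I/O inside lock:  " + run(true) + " ms");
        System.out.println("I/O outside lock: " + run(false) + " ms");
    }
}
```

With the write inside the lock, the threads run one at a time, so total time grows with THREADS * TAKES_PER_THREAD * IO_MILLIS. With the write outside, the sleeps overlap. Real writes to a single file may still serialize further down (the dump above shows FileChannelImpl.write taking its own lock), so the gap in a real channel could be smaller than in this sketch.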

On Tue, Aug 13, 2013 at 4:39 PM, Hari Shreedharan <[EMAIL PROTECTED]
> wrote:

> Since the channel is designed to make sure that events are not duplicated
> to multiple sinks, and to protect against corruption due to concurrency
> issues, we do need the locking in the channel's flume event queue. It
> is unlikely that locking is what is causing performance issues because the
> channel is heavily I/O bound. If you take a series of thread dumps, you
> will probably see that those threads are moving forward and the ones
> reading/writing from/to disk are the ones which are slower. These locks are
> unlikely to hit performance much.
> Thanks,
> Hari
> On Tuesday, August 13, 2013 at 4:13 PM, Pankaj Gupta wrote:
> Hi,
> Spent some more time debugging issues with FileChannel. The problem seems
> to be lock contention when reading from FlumeEventQueue:
> I see a lot of threads like this:
> "SinkRunner-PollingRunner-LoadBalancingSinkProcessor" prio=10
> tid=0x00007f857b378800 nid=0x404d waiting for monitor entry
> [0x00007f821afee000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at
> org.apache.flume.channel.file.FlumeEventQueue.removeHead(FlumeEventQueue.java:117)
>         - waiting to lock <0x0000000518ee4c90> (a
> org.apache.flume.channel.file.FlumeEventQueue)
>         at
> org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:492)
>         at
> org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
>         at
> org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:95)
>         at
> org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:330)
>         at
> org.apache.flume.sink.LoadBalancingSinkProcessor.process(LoadBalancingSinkProcessor.java:154)
>         at
> org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:662)
> I have two file channels and 8 Avro Sinks per file channel. I added more
> sinks because they weren't draining fast enough. It seems like they send
> the batch then wait for an ack before sending again, thus sends are not
> pipelined and having more sinks seemed like a good way of getting some
> parallelism.
> Here's the full stack trace:
> 2013-08-13 15:30:32
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.13-b02 mixed mode):

Pankaj Gupta | Software Engineer

*BrightRoll, Inc. *| Smart Video Advertising | www.brightroll.com
United States | Canada | United Kingdom | Germany