Re: Lock contention in FileChannel
It seems like some I/O is done inside the lock, which means the time the
lock is held is proportional to the time the I/O takes, and that is what
makes it a problem. I apologize in advance if I am wrong, but the call stack
and behavior I'm seeing seem to suggest that. Specifically, it seems that we
do a write while inside take:
"SinkRunner-PollingRunner-LoadBalancingSinkProcessor" prio=10
tid=0x00007f857338c800 nid=0x404a runnable [0x00007f821b2f1000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.NativeThread.current(Native Method)
        at sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:27)
        at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:194)
        - locked <0x00000005190ec998> (a java.lang.Object)
        at
org.apache.flume.channel.file.LogFile$Writer.write(LogFile.java:247)
        at
org.apache.flume.channel.file.LogFile$Writer.take(LogFile.java:212)
        - locked <0x0000000519111590> (a
org.apache.flume.channel.file.LogFileV3$Writer)
        at org.apache.flume.channel.file.Log.take(Log.java:550)
        at
org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:499)
        at
org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
        at
org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:95)
        at
org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:330)
        at
org.apache.flume.sink.LoadBalancingSinkProcessor.process(LoadBalancingSinkProcessor.java:154)
        at
org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:662)
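
A minimal sketch of the pattern this stack trace suggests (simplified, not
Flume's actual LogFile code): the write runs while the writer's monitor is
held, so every other thread that needs the same monitor waits out the disk
I/O:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;

    // Simplified stand-in for LogFile$Writer: the synchronized keyword
    // means the monitor is held for the entire write() call, so the lock
    // hold time is proportional to the I/O time.
    class LogWriter {
        private final FileChannel out;

        LogWriter(FileChannel out) {
            this.out = out;
        }

        synchronized void take(ByteBuffer record) throws IOException {
            out.write(record); // disk I/O performed inside the lock
        }
    }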

On Tue, Aug 13, 2013 at 4:39 PM, Hari Shreedharan <[EMAIL PROTECTED]> wrote:

> Since the channel is designed to make sure that events are not duplicated
> to multiple sinks, and to protect against corruption due to concurrency
> issues, we do need the locking in the channel's flume event queue. It
> is unlikely that locking is what is causing the performance issues, because
> the channel is heavily I/O bound. If you take a series of thread dumps, you
> will probably see that those threads are moving forward and that the ones
> reading/writing from/to disk are the ones which are slower. These locks are
> unlikely to hurt performance much.
>
> Thanks,
> Hari
>
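
A series of thread dumps, as Hari suggests, can also be collected
in-process via java.lang.management; a minimal sketch (running jstack
against the agent's PID a few times works just as well):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    // Take a few dumps spaced apart; if BLOCKED threads show different
    // frames between samples, they are making progress and the locks are
    // not the bottleneck.
    public class ThreadDumpSampler {
        public static void main(String[] args) throws InterruptedException {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            for (int i = 0; i < 5; i++) {
                System.out.println("=== dump " + i + " ===");
                for (ThreadInfo t : mx.dumpAllThreads(true, true)) {
                    System.out.print(t); // thread state, lock owner, frames
                }
                Thread.sleep(2000);
            }
        }
    }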
> On Tuesday, August 13, 2013 at 4:13 PM, Pankaj Gupta wrote:
>
> Hi,
>
> Spent some more time debugging issues with FileChannel. The problem seems
> to be lock contention when reading from the FlumeEventQueue:
>
> I see a lot of threads like this:
> "SinkRunner-PollingRunner-LoadBalancingSinkProcessor" prio=10
> tid=0x00007f857b378800 nid=0x404d waiting for monitor entry
> [0x00007f821afee000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at
> org.apache.flume.channel.file.FlumeEventQueue.removeHead(FlumeEventQueue.java:117)
>         - waiting to lock <0x0000000518ee4c90> (a
> org.apache.flume.channel.file.FlumeEventQueue)
>         at
> org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:492)
>         at
> org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
>         at
> org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:95)
>         at
> org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:330)
>         at
> org.apache.flume.sink.LoadBalancingSinkProcessor.process(LoadBalancingSinkProcessor.java:154)
>         at
> org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:662)
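
A simplified sketch (hypothetical, not the real FlumeEventQueue) of the
serialization this dump shows: every sink on a channel funnels through one
synchronized method on a shared queue object, so at most one take per
channel makes progress at a time:

    import java.util.ArrayDeque;

    // Stand-in for the per-channel event queue: the single monitor on
    // this object is what all the sink threads above are blocked on.
    class EventQueue {
        private final ArrayDeque<Long> pointers = new ArrayDeque<Long>();

        synchronized Long removeHead() {     // one monitor per channel
            return pointers.pollFirst();     // fast, but serialized across sinks
        }
    }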
>
>
> I have two file channels and 8 Avro sinks per file channel. I added more
> sinks because they weren't draining fast enough. It seems like each sink
> sends a batch and then waits for an ack before sending again, so sends are
> not pipelined, and adding more sinks seemed like a good way of getting some
> parallelism.
>
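
A sketch of the send-then-wait behavior described above, using hypothetical
types rather than Flume's actual sink API: each sink thread sends one batch
and blocks until the ack arrives, so a single sink is never pipelined and
extra sinks are the only source of overlap:

    import java.util.List;
    import java.util.concurrent.BlockingQueue;

    // Hypothetical, simplified sink loop; Rpc is a stand-in for the
    // real Avro RPC client.
    class BatchSender implements Runnable {
        interface Rpc { void sendAndWaitForAck(List<byte[]> batch); }

        private final BlockingQueue<List<byte[]>> source;
        private final Rpc rpc;

        BatchSender(BlockingQueue<List<byte[]>> source, Rpc rpc) {
            this.source = source;
            this.rpc = rpc;
        }

        @Override
        public void run() {
            try {
                while (true) {
                    List<byte[]> batch = source.take(); // all sinks contend on the same channel
                    rpc.sendAndWaitForAck(batch);       // full round trip per batch; no pipelining
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();     // exit on interrupt
            }
        }
    }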
> Here's the full stack trace:
> 2013-08-13 15:30:32
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.13-b02 mixed mode):
P | (415) 677-9222 ext. 205  F | (415) 677-0895 | [EMAIL PROTECTED]

Pankaj Gupta | Software Engineer

BrightRoll, Inc. | Smart Video Advertising | www.brightroll.com
United States | Canada | United Kingdom | Germany
We're hiring! <http://newton.newtonsoftware.com/career/CareerHome.action?clientId=8a42a12b3580e2060135837631485aa7>