Re: Netty/Avro IPC problem: channel closed
It appears to be pre-1.5.2 from this part of the stack trace:

       at java.util.concurrent.Semaphore.acquire(Semaphore.java:313)
       at org.apache.avro.ipc.NettyTransceiver$CallFuture.get(NettyTransceiver.java:203)

CallFuture was moved out of NettyTransceiver as part of AVRO-539 and is now
a stand-alone class.  Also the Semaphore inside CallFuture was replaced with
a CountDownLatch, so in 1.5.2 and later we should never see CallFuture
waiting on a Semaphore.

From your initial description it appears that some temporary network
disruption might have caused the connection between the client and server to
close, and then the client never recovered from this situation.  This
doesn't surprise me because I don't think the pre-1.5.2 NettyTransceiver had
any way to recover from a connection failure.  While working on AVRO-539 I
modified the transceiver code such that it would attempt to re-establish the
connection if the connection was lost, so that's why I think this may help
you.  Just a guess though.  But like I said, since the code has changed so
much in 1.5.2 and later, it will be much easier to figure out what's wrong
(and fix it if necessary) if you can reproduce it using 1.5.2 or later.
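As a rough sketch of the reconnect-on-failure idea (not the actual NettyTransceiver code), a sender can check whether its connection is still open before each call and re-establish it if not. The names ReconnectingSender and Connection below are hypothetical, and the connector stands in for whatever creates the real connection.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical sketch; the real transceiver logic in Avro differs.
class ReconnectingSender {
    // Minimal stand-in for a network connection (an assumption for this example).
    interface Connection {
        boolean isOpen();
        String send(String request) throws IOException;
    }

    private final Callable<Connection> connector;
    private Connection current;

    ReconnectingSender(Callable<Connection> connector) {
        this.connector = connector;
    }

    // Sends a request, first re-establishing the connection if it was lost.
    synchronized String send(String request) throws Exception {
        if (current == null || !current.isOpen()) {
            current = connector.call();
        }
        try {
            return current.send(request);
        } catch (IOException e) {
            // Drop the broken connection so the next call reconnects.
            current = null;
            throw e;
        }
    }
}
```

The failed call still surfaces its exception to the caller; only later calls benefit from the reconnect, so callers that need the request delivered would retry it themselves.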

-James
On Wed, Sep 14, 2011 at 1:39 PM, Yang <[EMAIL PROTECTED]> wrote:

> thanks James:
>
> I *think* I'm using 1.5.2, but could check to be sure.
> how do you determine that it is a pre-1.5.2 version?
>
> Yang
>
> On Wed, Sep 14, 2011 at 10:25 AM, James Baldassari
> <[EMAIL PROTECTED]> wrote:
> > Hi Yang,
> >
> > From the stack trace you posted it appears that you are using a version of
> > Avro prior to 1.5.2.  Which version are you using?  There have been a number
> > of significant changes recently to the RPC framework and the Netty
> > implementation in particular.  Could you please try to reproduce the problem
> > using Avro 1.5.2 or newer?  The problem may be resolved with an upgrade.  If
> > the problem still exists in the newer versions, it will be a lot easier to
> > diagnose/fix it if we can see stack traces from a post-1.5.2 version.
> >
> > Thanks,
> > James
> >
> >
> > On Wed, Sep 14, 2011 at 1:08 PM, Yang <[EMAIL PROTECTED]> wrote:
> >>
> >> I keep seeing these "channel closed" exceptions, with low
> >> probability, i.e. about once every 10 hours under heavy load.
> >>
> >> I'm not sure whether it's the server or the client that closed the
> >> channel, so I included the exception stack from both sides.
> >> Does anybody have an idea how to debug this?
> >>
> >> Also, suppose there is a valid reason for closing the channel: what is
> >> my strategy for coping with this? I originally had many senders; because
> >> of the channel closed exception, many of them died. After that, only 2
> >> application threads remain, but they all seem blocked trying to grab a
> >> connection from Netty's pool, so even if I create new sender threads, it
> >> seems they would still block. So how can I tell Netty to
> >> "reset/replenish" its connections?
> >>
> >>
> >> Thanks a lot
> >> Yang
> >>
> >>
> >> client side:
> >>
> >>
> >>
> >>  WARN 16:51:02,079 Unexpected exception from downstream.
> >> java.nio.channels.ClosedChannelException
> >>        at org.jboss.netty.channel.socket.nio.NioWorker.cleanUpWriteBuffer(NioWorker.java:636)
> >>        at org.jboss.netty.channel.socket.nio.NioWorker.writeFromUserCode(NioWorker.java:369)
> >>        at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:117)
> >>        at org.jboss.netty.channel.Channels.write(Channels.java:632)
> >>        at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:70)
> >>        at org.jboss.netty.channel.Channels.write(Channels.java:611)
> >>        at org.jboss.netty.channel.Channels.write(Channels.java:578)
> >>        at org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:259)
> >>        at org.apache.avro.ipc.NettyTransceiver.transceive(NettyTransceiver.java:131)