-
IPC ClosedChannelException error
Simon Gilliot 2012-03-26, 14:01
Hello,
We have a small cluster of 4 servers: 1 NameNode/JobTracker/HBase Master, 2 DataNode/TaskTracker nodes, and 1 failover NameNode/JobTracker. We often see HBase crash with this error:
DEBUG org.apache.hadoop.ipc.HBaseServer: got #385
2012-03-26 15:15:08,690 DEBUG org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 48895: has #385 from 172.16.0.1:49493
2012-03-26 15:15:08,690 DEBUG org.apache.hadoop.ipc.HBaseServer: Served: regionServerReport queueTime= 0 procesingTime= 0
2012-03-26 15:15:08,691 DEBUG org.apache.hadoop.ipc.HBaseServer: IPC Server Responder: responding to #385 from 172.16.0.1:49493
2012-03-26 15:15:08,691 DEBUG org.apache.hadoop.ipc.HBaseServer: IPC Server Responder: responding to #385 from 172.16.0.1:49493 Wrote 8 bytes.
2012-03-26 15:15:08,691 DEBUG org.apache.hadoop.ipc.HBaseClient: IPC Client (47) connection to nm1.pus2011.com/172.16.0.1:48895 from an unknown user got value #385
2012-03-26 15:15:08,691 DEBUG org.apache.hadoop.ipc.HbaseRPC: Call: regionServerReport 2
2012-03-26 15:15:08,941 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to nm.pus2011.com/172.16.0.3:9000 from hbase: closed
2012-03-26 15:15:08,941 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to nm.pus2011.com/172.16.0.3:9000 from hbase: stopped, remaining connections 0
2012-03-26 15:15:11,043 DEBUG org.apache.hadoop.ipc.HBaseServer: Served: next queueTime= 41371 procesingTime= 43542
2012-03-26 15:15:11,043 DEBUG org.apache.hadoop.ipc.HBaseServer: IPC Server Responder: responding to #7 from 172.16.0.5:38633
2012-03-26 15:15:11,043 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call next(-8649074184087149864, 1) from 172.16.0.5:38633: output error
2012-03-26 15:15:11,043 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 58237 caught: java.nio.channels.ClosedChannelException
    at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:249)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:440)
    at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1341)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:792)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1083)

Can you help us?
Best regards
Simon Gilliot
-
Re: IPC ClosedChannelException error
Stack 2012-03-26, 15:03
On Mon, Mar 26, 2012 at 7:01 AM, Simon Gilliot <[EMAIL PROTECTED]> wrote:
> Hello,
>
> we have a small architecture of 4 servers with 1
> Namenode/Jobtracker/HbaseMaster, 2 Datanode/Tasktracker, 1 server
> Failover Namenode/Jobtracker.
> We often Hbase crashes with this error:
>
> 2012-03-26 15:15:11,043 WARN org.apache.hadoop.ipc.HBaseServer:
> IPC Server handler 7 on 58237 caught:
> java.nio.channels.ClosedChannelException
That doesn't look like a crash. It looks like the client was gone (it had shut down the socket) when we went to respond. Do you see a client timeout earlier on the client side? If the server is crashing, maybe later logs show why.
St.Ack
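
One way to follow up on this suggestion is to look for slow calls in the region server DEBUG log just before the exception. The sketch below is only an illustration, not part of HBase: `slow_calls` and its threshold are hypothetical, the values are assumed to be milliseconds, and the pattern is based solely on the `Served:` lines quoted in this thread (including the log's own `procesingTime` spelling).

```python
import re

# Matches DEBUG lines of the form seen in this thread, for example:
#   2012-03-26 15:15:11,043 DEBUG ...HBaseServer: Served: next queueTime= 41371 procesingTime= 43542
SERVED = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"Served: (?P<call>\S+) queueTime= (?P<queue>\d+) procesingTime= (?P<proc>\d+)"
)


def slow_calls(lines, threshold_ms=10000):
    """Return (timestamp, call, queue, processing) for calls at or above the threshold.

    A call is reported when either its queue time or its processing time
    reaches threshold_ms (assumed unit: milliseconds).
    """
    hits = []
    for line in lines:
        m = SERVED.search(line)
        if m is None:
            continue  # not a "Served:" line
        queue_ms = int(m.group("queue"))
        proc_ms = int(m.group("proc"))
        if max(queue_ms, proc_ms) >= threshold_ms:
            hits.append((m.group("ts"), m.group("call"), queue_ms, proc_ms))
    return hits


# Two "Served:" lines copied from the log in the first message.
sample = [
    "2012-03-26 15:15:08,690 DEBUG org.apache.hadoop.ipc.HBaseServer: "
    "Served: regionServerReport queueTime= 0 procesingTime= 0",
    "2012-03-26 15:15:11,043 DEBUG org.apache.hadoop.ipc.HBaseServer: "
    "Served: next queueTime= 41371 procesingTime= 43542",
]

for hit in slow_calls(sample):
    print(hit)
```

On the two sample lines, only the `next` call is reported; it is the call whose response hit the closed channel in the original log.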
-
Re: IPC ClosedChannelException error
Simon Gilliot 2012-03-26, 15:30
Thanks for your answer,
That's right, the server doesn't really crash, but the client (a map/reduce task attempt) fails on this.
Other errors appear on the server after that:
12/03/26 15:18:01 INFO mapred.JobClient: Task Id : attempt_201203261505_0002_m_000051_0, Status : FAILED
org.apache.hadoop.hbase.regionserver.LeaseException: org.apache.hadoop.hbase.regionserver.LeaseException: lease '-7139822043219672768' does not exist
    at org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:230)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1879)
    at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:525)

I think every error after that was originally caused by the closed connection. This error appears before the IPC timeout expires (we have increased that timeout tenfold). On the client, I don't see anything unusual until the connection is closed. Any idea what could cause that?
Simon Gilliot
On Monday, 26 March 2012 at 08:03 -0700, Stack wrote:
> On Mon, Mar 26, 2012 at 7:01 AM, Simon Gilliot
> <[EMAIL PROTECTED]> wrote:
> > Hello,
> >
> > we have a small architecture of 4 servers with 1
> > Namenode/Jobtracker/HbaseMaster, 2 Datanode/Tasktracker, 1 server
> > Failover Namenode/Jobtracker.
> > We often Hbase crashes with this error:
> >
> > 2012-03-26 15:15:11,043 WARN org.apache.hadoop.ipc.HBaseServer:
> > IPC Server handler 7 on 58237 caught:
> > java.nio.channels.ClosedChannelException
>
> That doesn't look like a crash. It looks like client was gone -- shut
> down the socket -- when we went to respond. Do you see a client
> timeout previous on client-side? If server is crashing, maybe later
> logs show why.
>
> St.Ack
-
Re: IPC ClosedChannelException error
Stack 2012-03-26, 16:18
On Mon, Mar 26, 2012 at 8:30 AM, Simon Gilliot <[EMAIL PROTECTED]> wrote:
> I think each error after that was originaletly caused by the connexion
> closed. This error appears before the IPC timeout (which has been set
> *10). On the client, I don't see anything weird until the closed
> connexion. Any idea on what can cause that?
>
Could it be a case of '10.8.1.1. Scan Caching in MapReduce Jobs' in this section: http://hbase.apache.org/book.html#perf.reading ?
St.Ack
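
The reasoning behind that book section is that a map task which spends too long on one cached batch of rows may return to the region server after its scanner lease has expired, which would fit the LeaseException above. The arithmetic can be sketched as follows. This is a rough illustration only: `max_safe_caching` is a hypothetical helper, the 60000 ms lease period is an assumed default (check `hbase.regionserver.lease.period` on your cluster), and the per-row times are invented for the example.

```python
def max_safe_caching(per_row_ms, lease_period_ms=60000, safety_factor=0.5):
    """Estimate the largest scan caching value that keeps one batch inside the lease.

    per_row_ms      -- average time the mapper spends on one row (measured by you)
    lease_period_ms -- scanner lease period; 60000 is an assumed default
    safety_factor   -- fraction of the lease period the batch is allowed to use
    """
    if per_row_ms <= 0:
        raise ValueError("per_row_ms must be positive")
    budget_ms = lease_period_ms * safety_factor
    # Always fetch at least one row per next() call.
    return max(1, int(budget_ms // per_row_ms))


# A mapper that needs about 100 ms per row.
print(max_safe_caching(per_row_ms=100))    # 300

# A mapper that needs about 45 s per row cannot go below one row per call.
print(max_safe_caching(per_row_ms=45000))  # 1
```

If even a caching value of 1 does not fit in the lease period, lowering the caching value will not help on its own, and the time spent per row or per `next` call has to be investigated instead.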