HBase >> mail # user >> Region server crashes


Region server crashes
Hi all,
We use hbase 0.9.2. We recently started to experience region server
crashes under heavy load (2-3 different servers crash on each load).
It seems that a missing block in HDFS causes a full GC, and regions are
then closed.

Below are log samples from the region server (GC log, region server log)
and from the data node log.

gc.log:

82619.081: [Full GC 82619.081: [CMS: 7973415K->7973415K(8005248K),
8.7188750 secs] 8304143K->8304130K(8350272K), [CMS Perm :
20159K->20153K(33976K)] icms_dc=100 , 8.7189890 secs] [Times: user=8.71
sys=0.00, real=8.72 secs]
82627.801: [Full GC 82627.801: [CMS: 7973415K->7973414K(8005248K),
12.2467710 secs] 8305494K->8304129K(8350272K), [CMS Perm :
20153K->20153K(33976K)] icms_dc=100 , 12.2468820 secs] [Times: user=12.24
sys=0.00, real=12.25 secs]
82640.048: [Full GC 82640.048: [CMS: 7973414K->7973414K(8005248K),
8.3197090 secs] 8304129K->8304129K(8350272K), [CMS Perm :
20153K->20153K(33976K)] icms_dc=100 , 8.3198190 secs] [Times: user=8.32
sys=0.00, real=8.32 secs]
82648.369: [Full GC 82648.369: [CMS: 7973414K->7973414K(8005248K),
8.2264360 secs] 8304237K->8304130K(8350272K), [CMS Perm :
20153K->20153K(33976K)] icms_dc=100 , 8.2265410 secs] [Times: user=8.22
sys=0.01, real=8.22 secs]
82656.596: [Full GC 82656.596: [CMS: 7973414K->7973414K(8005248K),
8.4928260 secs] 8304130K->8304130K(8350272K), [CMS Perm :
20153K->20153K(33976K)] icms_dc=100 , 8.4929270 secs] [Times: user=8.48
sys=0.01, real=8.49 secs]
82665.089: [Full GC 82665.089: [CMS: 7973414K->7973414K(8005248K),
8.3110610 secs] 8304132K->8304130K(8350272K), [CMS Perm :
20153K->20153K(33976K)] icms_dc=100 , 8.3111690 secs] [Times: user=8.29
sys=0.02, real=8.31 secs]
82673.400: [Full GC 82673.400: [CMS: 7973414K->7973414K(8005248K),
8.5270500 secs] 8304130K->8304130K(8350272K), [CMS Perm :
20153K->20153K(33976K)] icms_dc=100 , 8.5271560 secs] [Times: user=8.52
sys=0.00, real=8.53 secs]
82681.929: [Full GC 82681.929: [CMS (concurrent mode failure):
7973414K->7973414K(8005248K), 12.0992320 secs]
8305391K->8304128K(8350272K), [CMS Perm : 20153K->20153K(33976K)]
icms_dc=100 , 12.0993380 secs] [Times: user=12.08 sys=0.03, real=12.10 secs]
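
To make the pattern in the GC log above easier to see, here is a minimal Python sketch that pulls the CMS (old generation) figures out of each Full GC record. It assumes the log format is exactly as pasted above and that the first "before->after(capacity)" group after "Full GC" is the CMS generation; the names FULL_GC, summarize_full_gcs and SAMPLE are made up for this illustration and are not part of HBase or the JVM.

```python
import re

# One "Full GC" record in the CMS log format shown above, once wrapped
# lines have been joined with single spaces. Assumption: the first
# "A->B(C), N secs]" group after "Full GC" is the CMS (old) generation.
FULL_GC = re.compile(
    r"(?P<ts>\d+\.\d+): \[Full GC .*?"
    r"(?P<before>\d+)K->(?P<after>\d+)K\((?P<cap>\d+)K\), "
    r"(?P<pause>\d+\.\d+) secs\]"
)


def summarize_full_gcs(log_text):
    """Return one dict per Full GC record found in log_text."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    rows = []
    for m in FULL_GC.finditer(flat):
        before = int(m.group("before"))
        after = int(m.group("after"))
        cap = int(m.group("cap"))
        rows.append({
            "timestamp": float(m.group("ts")),    # JVM uptime in seconds
            "reclaimed_kb": before - after,       # old gen freed by this GC
            "occupancy": round(after / cap, 4),   # old gen fill ratio after GC
            "pause_secs": float(m.group("pause")),
        })
    return rows


# The first two records from the gc.log excerpt above, wrapped as posted.
SAMPLE = """
82619.081: [Full GC 82619.081: [CMS: 7973415K->7973415K(8005248K),
8.7188750 secs] 8304143K->8304130K(8350272K), [CMS Perm :
20159K->20153K(33976K)] icms_dc=100 , 8.7189890 secs] [Times: user=8.71
sys=0.00, real=8.72 secs]
82627.801: [Full GC 82627.801: [CMS: 7973415K->7973414K(8005248K),
12.2467710 secs] 8305494K->8304129K(8350272K), [CMS Perm :
20153K->20153K(33976K)] icms_dc=100 , 12.2468820 secs] [Times: user=12.24
sys=0.00, real=12.25 secs]
"""

if __name__ == "__main__":
    for row in summarize_full_gcs(SAMPLE):
        print(row)
```

Run against the excerpt, each record shows the old generation still almost completely full after a pause of several seconds, with at most 1K reclaimed. That reading only describes what the log lines say; it does not by itself show what is holding the memory.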

region server log:

2012-03-25 07:25:28,729 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 8 on 8041 caught: java.nio.channels.ClosedChannelException
        at
sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
        at
org.apache.hadoop.hbase.ipc.HBaseServer.channelIO(HBaseServer.java:1387)
        at
org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1339)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:727)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:792)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1083)

2012-03-25 07:31:36,931 WARN org.apache.hadoop.hdfs.DFSClient:
DFSOutputStream ResponseProcessor exception  for block
blk_9110507610649672616_2064873java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:180)
        at java.io.DataInputStream.readLong(DataInputStream.java:399)
        at
org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:122)
        at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2548)

org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.ipc.RemoteException: java.io.IOException:
blk_-6219291421501721811_2065530 is already commited, storedBlock == null.
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.nextGenerationStampForBlock(FSNamesystem.java:4877)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.nextGenerationStamp(NameNode.java:501)
        at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:961)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:957)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:955)

        at org.apache.hadoop.ipc.Client.call(Client.java:740)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy4.nextGenerationStamp(Unknown Source)
        at
org.apache.hadoop.hdfs.server.datanode.DataNode.syncBlock(DataNode.java:1577)
        at
org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1551)
        at
org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:1617)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:961)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:957)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:955)

        at org.apache.hadoop.ipc.Client.call(Client.java:740)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy9.recoverBlock(Unknown Source)
        at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2706)
        at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1500(DFS
Jean-Daniel Cryans 2012-03-26, 17:43
Lior Schachter 2012-03-27, 09:54