HBase user mailing list - HMaster and HRegionServer going down


Re: HMaster and HRegionServer going down
Azuryy Yu 2013-06-05, 08:57
It's because your data node 192.168.20.30 broke down, which led to the RS going down.
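
A quick way to confirm that is to check whether the DataNode process on 192.168.20.30 is still alive and whether the NameNode still counts it as live. A minimal sketch, assuming the Hadoop 1.x command-line tools this cluster appears to be running:

  # is the DataNode JVM still up on 192.168.20.30?
  jps | grep DataNode

  # does the NameNode still report it among the live datanodes?
  hadoop dfsadmin -report

If the DataNode has stopped, its own log (the EOFException/DataXceiver entries quoted below) is where to look for the reason it stopped accepting writes.
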
On Wed, Jun 5, 2013 at 3:19 PM, Vimal Jain <[EMAIL PROTECTED]> wrote:

> Here is the complete log:
>
> http://bin.cakephp.org/saved/103001 - Hregion
> http://bin.cakephp.org/saved/103000 - Hmaster
> http://bin.cakephp.org/saved/103002 - Datanode
>
>
> On Wed, Jun 5, 2013 at 11:58 AM, Vimal Jain <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> > I have set up HBase in pseudo-distributed mode.
> > It was working fine for 6 days, but suddenly this morning both the
> > HMaster and HRegionServer processes went down.
> > I checked the logs of both Hadoop and HBase.
> > Please help here.
> > Here are the snippets:
> >
> > *Datanode logs:*
> > 2013-06-05 05:12:51,436 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_1597245478875608321_2818 java.io.EOFException: while trying to read 2347 bytes
> > 2013-06-05 05:12:51,442 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_1597245478875608321_2818 received exception java.io.EOFException: while trying to read 2347 bytes
> > 2013-06-05 05:12:51,442 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.20.30:50010, storageID=DS-1816106352-192.168.20.30-50010-1369314076237, infoPort=50075, ipcPort=50020):DataXceiver
> > java.io.EOFException: while trying to read 2347 bytes
> >
> >
> > *HRegion logs:*
> > 2013-06-05 05:12:50,701 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 4694929ms instead of 3000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-06-05 05:12:51,045 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_1597245478875608321_2818 java.net.SocketTimeoutException: 63000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.20.30:44333 remote=/192.168.20.30:50010]
> > 2013-06-05 05:12:51,046 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 11695345ms instead of 10000000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> > 2013-06-05 05:12:51,048 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_1597245478875608321_2818 bad datanode[0] 192.168.20.30:50010
> > 2013-06-05 05:12:51,075 WARN org.apache.hadoop.hdfs.DFSClient: Error while syncing
> > java.io.IOException: All datanodes 192.168.20.30:50010 are bad. Aborting...
> >     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3096)
> > 2013-06-05 05:12:51,110 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not sync. Requesting close of hlog
> > java.io.IOException: Reflection
> > Caused by: java.lang.reflect.InvocationTargetException
> > Caused by: java.io.IOException: DFSOutputStream is closed
> > 2013-06-05 05:12:51,180 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not sync. Requesting close of hlog
> > java.io.IOException: Reflection
> > Caused by: java.lang.reflect.InvocationTargetException
> > Caused by: java.io.IOException: DFSOutputStream is closed
> > 2013-06-05 05:12:51,183 ERROR org.apache.hadoop.hbase.regionserver.wal.HLog: Failed close of HLog writer
> > java.io.IOException: Reflection
> > Caused by: java.lang.reflect.InvocationTargetException
> > Caused by: java.io.IOException: DFSOutputStream is closed
> > 2013-06-05 05:12:51,184 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: Riding over HLog close failure! error count=1
> > 2013-06-05 05:12:52,557 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server hbase.rummycircle.com,60020,1369877672964: regionserver:60020-0x13ef31264d00001 regionserver:60020-0x13ef31264d00001
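
The Sleeper warnings in the RegionServer log above ("We slept 4694929ms instead of 3000ms") point to a very long GC pause, which is what the book section referenced in those lines (#trouble.rs.runtime.zkexpired) covers. A minimal sketch of the settings that section is usually read alongside, assuming a 0.94-era install where these file and property names apply:

  # hbase-env.sh (the GC log path is an assumed example):
  # use CMS and log GC activity so long pauses show up clearly
  export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC \
    -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
    -Xloggc:/var/log/hbase/gc-hbase.log"

  # hbase-site.xml: raising zookeeper.session.timeout (e.g. to 120000 ms)
  # gives the RS more headroom before ZooKeeper expires its session, but
  # only up to the ZooKeeper server's own maxSessionTimeout.

None of that changes the diagnosis above: once the lone DataNode stopped answering, the WAL sync failed and the RegionServer aborted.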