HBase >> mail # dev >> All region server died due to "Parent directory doesn't exist"


lars hofhansl 2013-05-09, 06:39
lars hofhansl 2013-05-09, 07:23
Re: All region server died due to "Parent directory doesn't exist"
Potential JIRAs that went into 0.94.7 and could be responsible:
HBASE-7824
HBASE-8246
HBASE-8276
HBASE-8288
HBASE-8212
HBASE-8081
HBASE-8211
-- Lars

----- Original Message -----
From: lars hofhansl <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>; lars hofhansl <[EMAIL PROTECTED]>
Cc:
Sent: Thursday, May 9, 2013 12:23 AM
Subject: Re: All region server died due to "Parent directory doesn't exist"

All the directories in .logs have the -splitting suffix, so this seems by design.
The problem is that even though all logs are split, each time I start up a region server now, its log dir is renamed to ...-splitting and the region server shuts itself down.

-- Lars

----- Original Message -----
From: lars hofhansl <[EMAIL PROTECTED]>
To: hbase-dev <[EMAIL PROTECTED]>
Cc:
Sent: Wednesday, May 8, 2013 11:39 PM
Subject: All region server died due to "Parent directory doesn't exist"

We just had all RegionServers die in a test cluster, all with the following exception.
(This is CDH4.2.1 with HBase 0.94.7 built against it.)

Strangely, HDFS is up and running (I can ls all directories and create files in it, and HDFS's fsck reports that all is well), yet the RSs died with this.
This almost looks like a race where the directories under .logs were yanked away while they were still in use.

I plan to investigate this further. In any event, has anybody seen this issue (or anything similar to this) before?
When this happened there was no load on the cluster (other than some writes from OTSDB).

Thanks.

-- Lars

2013-05-08 16:02:41,178 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server <host>,60020,1367614452787: IOE in log roller
java.io.IOException: Exception in createWriter
        at org.apache.hadoop.hbase.regionserver.wal.HLogFileSystem.createWriter(HLogFileSystem.java:66)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriterInstance(HLog.java:715)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.rollWriter(HLog.java:648)
        at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:95)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: cannot get log writer
        at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:771)
        at org.apache.hadoop.hbase.regionserver.wal.HLogFileSystem.createWriter(HLogFileSystem.java:60)
        ... 4 more
Caused by: java.io.IOException: java.io.FileNotFoundException: Parent directory doesn't exist: /hbase/.logs/<host>,60020,1367614452787
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.verifyParentDir(FSNamesystem.java:1726)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1848)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:1770)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1747)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:418)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:205)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44068)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1695)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1691)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1689)

        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.init(SequenceFileLogWriter.java:173)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.createWriter(HLog.java:768)
        ... 5 more
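
The failure mode suspected above (a log directory renamed while a writer still expects the old path) can be reproduced in miniature on a local filesystem. The following is only a sketch using java.nio, not HBase or HDFS code; the class name, directory names, and file names are all made up for illustration. It shows that once the directory is renamed with a -splitting suffix, creating the next log file under the old path fails because the parent no longer exists.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical illustration only: a local-filesystem analogy, not HBase/HDFS code.
public class RenamedLogDirSketch {

    // Returns true if creating a new file under the old path fails
    // after the directory has been renamed out from under the writer.
    static boolean createFailsAfterRename() throws IOException {
        Path root = Files.createTempDirectory("logs-sketch");
        // Made-up directory name standing in for a region server's log dir.
        Path logDir = Files.createDirectory(root.resolve("host,60020,1"));
        Files.createFile(logDir.resolve("wal.0")); // the "current" log file

        // Something else renames the directory (here: add a -splitting suffix).
        Files.move(logDir, root.resolve("host,60020,1-splitting"));

        try {
            // The writer rolls its log and tries to create the next file
            // under the path it still believes is valid.
            Files.createFile(logDir.resolve("wal.1"));
            return false;
        } catch (NoSuchFileException e) {
            // Parent directory is gone, analogous to the error in the trace above.
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("create failed after rename: " + createFailsAfterRename());
    }
}
```

This does not say who renamed the directory in the real cluster or why; it only demonstrates that a rename of the parent is enough to make the next file creation fail while the filesystem itself stays healthy.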
Ted Yu 2013-05-09, 08:33
Andrew Purtell 2013-05-09, 08:59
Ted Yu 2013-05-09, 09:04
Andrew Purtell 2013-05-09, 09:06
lars hofhansl 2013-05-09, 15:48
Ted Yu 2013-05-09, 16:07
lars hofhansl 2013-05-09, 16:16
Varun Sharma 2013-05-09, 16:39
Varun Sharma 2013-05-09, 16:41
Ted Yu 2013-05-09, 16:51
lars hofhansl 2013-05-09, 17:03
Stack 2013-05-09, 17:34
lars hofhansl 2013-05-09, 18:13
lars hofhansl 2013-05-09, 18:28
Enis Söztutar 2013-05-10, 01:10
lars hofhansl 2013-05-10, 04:25
Enis Söztutar 2013-05-10, 05:01
lars hofhansl 2013-05-10, 05:47
lars hofhansl 2013-05-09, 16:38
takeshi 2014-02-19, 03:18