Hmaster cannot start running again
Dalia Sobhy 2013-01-02, 18:00
Dear all,
I first started 2 region servers and added 6 million records to them. Then I added about 3 more region servers and everything was fine. I ran a Java program on them and they worked properly. But after I stopped four region servers, the HMaster stopped working. I added another region server, but it still doesn't work, and I don't know why.
From the NameNode log file:
Safe mode is ON. The reported blocks 43 needs additional 8 blocks to reach the threshold 0.9990 of total blocks 51. Safe mode will be turned off automatically.
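The numbers in the NameNode message can be checked by hand. The sketch below parses that message and recomputes the shortfall, assuming that safe mode requires at least threshold × total blocks to be reported. The function names and the rounding rule are illustrative assumptions, not taken from the HDFS source.

```python
import math
import re

# Pattern for the NameNode safe mode message; the wording is copied from the log in this thread.
SAFE_MODE_RE = re.compile(
    r"The reported blocks (\d+) needs additional (\d+) blocks? "
    r"to reach the threshold ([\d.]+) of total blocks (\d+)"
)


def parse_safe_mode(message):
    """Return (reported, additional, threshold, total), or None if the text does not match."""
    match = SAFE_MODE_RE.search(message)
    if match is None:
        return None
    reported, additional, threshold, total = match.groups()
    return int(reported), int(additional), float(threshold), int(total)


def blocks_still_needed(reported, total, threshold):
    """Assumption: at least threshold * total blocks must be reported (rounded up)."""
    required = math.ceil(threshold * total)
    return max(0, required - reported)


message = ("Safe mode is ON. The reported blocks 43 needs additional 8 blocks "
           "to reach the threshold 0.9990 of total blocks 51.")
reported, additional, threshold, total = parse_safe_mode(message)
print(blocks_still_needed(reported, total, threshold))  # prints 8
```

With 51 blocks in total and a threshold of 0.9990, effectively every block has to be reported, so the 8 missing blocks keep the NameNode in safe mode.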
From the HMaster log file:
resubmitting task /hbase/splitlog/hdfs%3A%2F%2Fslave7.medcloud.com%3A8020%2Fhbase%2F.logs%2Fslave1.medcloud.com%2C60020%2C1357145569383-splitting%2Fslave1.medcloud.com%252C60020%252C1357145569383.1357145572049
org.apache.hadoop.hbase.master.SplitLogManager task /hbase/splitlog/RESCAN0000000832 entered state done slave7.medcloud.com,60000,1357148608279
org.apache.hadoop.hbase.util.FSUtils Waiting for dfs to exit safe mode...
The last line is repeated many times.
From the region server log file:
org.apache.zookeeper.ZooKeeper Client environment:java.library.path=/usr/lib/hadoop/lib/native:/usr/lib/hbase/lib/native/Linux-amd64-64
7:38:58.778 PM INFO org.apache.zookeeper.ZooKeeper Client environment:java.io.tmpdir=/tmp
7:38:58.778 PM INFO org.apache.zookeeper.ZooKeeper Client environment:java.compiler=<NA>
7:38:58.778 PM INFO org.apache.zookeeper.ZooKeeper Client environment:os.name=Linux
7:38:58.778 PM INFO org.apache.zookeeper.ZooKeeper Client environment:os.arch=amd64
7:38:58.778 PM INFO org.apache.zookeeper.ZooKeeper Client environment:os.version=3.2.0-29-generic
7:38:58.778 PM INFO org.apache.zookeeper.ZooKeeper Client environment:user.name=hbase
7:38:58.778 PM INFO org.apache.zookeeper.ZooKeeper Client environment:user.home=/var/lib/hbase
7:38:58.778 PM INFO org.apache.zookeeper.ZooKeeper Client environment:user.dir=/run/cloudera-scm-agent/process/747-hbase-REGIONSERVER
7:38:58.792 PM INFO org.apache.zookeeper.ZooKeeper Initiating client connection, connectString=slave4.medcloud.com:2181 sessionTimeout=60000 watcher=regionserver:60020
7:38:58.924 PM INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper The identifier of this process is [EMAIL PROTECTED]
7:38:58.995 PM INFO org.apache.zookeeper.ClientCnxn Opening socket connection to server slave4.medcloud.com/192.168.0.5:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
7:38:59.022 PM INFO org.apache.zookeeper.ClientCnxn Socket connection established to slave4.medcloud.com/192.168.0.5:2181, initiating session
7:38:59.047 PM INFO org.apache.zookeeper.ClientCnxn Session establishment complete on server slave4.medcloud.com/192.168.0.5:2181, sessionid = 0x13bfc435d0c000b, negotiated timeout = 40000
7:39:00.485 PM INFO org.apache.hadoop.hbase.regionserver.ShutdownHook Installed shutdown hook thread: Shutdownhook:regionserver60020
org.apache.zookeeper.server.PrepRequestProcessor Got user-level KeeperException when processing sessionid:0x13bfc435d0c000d type:delete cxid:0xa zxid:0x14a6 txntype:-1 reqpath:n/a Error Path:/hbase/backup-masters/slave7.medcloud.com,60000,1357148608279 Error:KeeperErrorCode = NoNode for /hbase/backup-masters/slave7.medcloud.com,60000,1357148608279
I tried deploying client configurations, then
Re: Hmaster cannot start running again
varun kumar 2013-01-03, 02:43
Hi Dalia,
Safe mode is on.
Turn off safe mode and you will be able to write files to the cluster.
The Hadoop cluster turns off safe mode automatically once it gets its required blocks.
In your scenario, try starting 2 more region servers.
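For checking or leaving safe mode from a script, a minimal sketch follows. It assumes the `hdfs` command-line tool is on the PATH of a user with HDFS superuser rights. The helper names are hypothetical; the `-safemode` actions are the ones `dfsadmin` accepts.

```python
import subprocess

# Actions accepted by `hdfs dfsadmin -safemode`.
SAFEMODE_ACTIONS = ("get", "wait", "leave", "enter")


def safemode_command(action):
    """Build the dfsadmin command line for one safe mode action."""
    if action not in SAFEMODE_ACTIONS:
        raise ValueError("unknown safe mode action: %s" % action)
    return ["hdfs", "dfsadmin", "-safemode", action]


def run_safemode(action):
    """Run the command and return its output (assumes the hdfs CLI is installed)."""
    result = subprocess.run(
        safemode_command(action), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

Calling `run_safemode("get")` reports the current state, and `run_safemode("leave")` forces the NameNode out of safe mode. Forcing it out does not bring back blocks that were never reported, so it is worth finding out why they are missing first.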
Regards,
Varun Kumar.P
Re: Hmaster cannot start running again
ramkrishna vasudevan 2013-01-03, 04:22
Hi Dalia
The NameNode safe mode exception should not be related to the number of region servers. Anyway, turn off safe mode and try restarting your cluster. Maybe the NameNode is internally unable to locate some blocks; that needs more investigation.
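One way to start that investigation is to ask HDFS which blocks it can locate under the HBase root directory. The sketch below builds and runs an `hdfs fsck` command. It assumes the `hdfs` CLI is on the PATH and that `/hbase` is the HBase root, as the paths in the HMaster log suggest. The helper names are hypothetical.

```python
import subprocess


def fsck_command(path="/hbase"):
    """Build an fsck command that lists files, their blocks, and block locations."""
    return ["hdfs", "fsck", path, "-files", "-blocks", "-locations"]


def run_fsck(path="/hbase"):
    """Run fsck and return (exit code, report text).

    check=True is deliberately not used, so the report is still
    returned if the command exits with a non-zero status.
    """
    result = subprocess.run(fsck_command(path), capture_output=True, text=True)
    return result.returncode, result.stdout
```

The report from `run_fsck()` shows, per file, which blocks have live replicas, which helps tell whether the missing blocks lived only on DataNodes that are currently down.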
Regards Ram