"it is unclear to me if the transition in this case is also rapid but
the fencing takes long while the new namenode is already active, or if
in this period i am stuck without an active namenode."
The standby->active transition gets stuck during this period, i.e.,
the standby NN can only become active after it has fenced the old
active NN. Until then the only remaining NN is in standby state, which
cannot serve normal R/W operations and just throws StandbyException, so
I guess an HBase region server may kill itself in some cases.
I think you can remove sshfence from the configuration if you are
using QJM-based HA.
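
For reference, a minimal sketch of what that could look like in
hdfs-site.xml. As far as I know, the Hadoop HA documentation still
requires some value for dfs.ha.fencing.methods even with QJM, so this
assumes sshfence is replaced by shell(/bin/true), which always reports
success. Whether to keep sshfence as a first attempt ahead of it is a
judgment call for your setup, not something I have tested here.

```xml
<!-- hdfs-site.xml fragment (sketch, assumes QJM-based HA) -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <!-- shell(/bin/true) always succeeds, so failover does not block
       on an unreachable host; the JournalNodes already prevent the
       old active NN from writing edits. -->
  <value>shell(/bin/true)</value>
</property>
```

Keep in mind that without real fencing, an old active NN that is still
alive could serve stale reads until it notices it has lost the
JournalNode quorum.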
On Fri, Oct 11, 2013 at 4:51 PM, Koert Kuipers <[EMAIL PROTECTED]> wrote:
> i have been playing with high availability using journalnodes and 2 masters
> both running namenode and hbase master.
> when i kill the namenode and hbase-master processes on the active master,
> the failover is perfect. hbase never stops and a running map-reduce job
> keeps going. this is impressive!
> however when instead of killing the processes i kill the entire active master
> machine, the transition is less smooth and can take a long time, at least
> it seems this way in the logs. this is because ssh fencing fails but keeps
> trying. my fencing is configured as:
> it is unclear to me if the transition in this case is also rapid but the
> fencing takes long while the new namenode is already active, or if in this
> period i am stuck without an active namenode. it is hard to accurately test
> this in my setup.
> is this supposed to take this long? is HDFS writable in this period? and is
> hbase supposed to survive this long transition?
> thanks! koert