On Fri, Jun 22, 2012 at 6:41 AM, Tom Brown <[EMAIL PROTECTED]> wrote:
> Can it notice the node is down sooner? If that node is serving an active
> region (or if it's a datanode for an active region), that would be a
> potentially large amount of downtime. With commodity hardware, and a large
> enough cluster, there will always be a machine or two being rebuilt...
I still see 4 live servers and 0 dead servers out of 5, even though the
processes on the other node have been down for more than 24 hrs.
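
For reference, the 10 minutes Michael mentions below lines up with the stock
HDFS dead-node timeout, which as far as I know is 2 x the NameNode recheck
interval plus 10 x the DataNode heartbeat interval, so 24 hrs is far past it.
A rough sketch of the arithmetic, assuming the defaults of 300000 ms and 3 s
(the function and parameter names are mine, not Hadoop's; the settings are
heartbeat.recheck.interval on 1.x, dfs.namenode.heartbeat.recheck-interval on
2.x, and dfs.heartbeat.interval):

```python
def dead_node_timeout_seconds(recheck_interval_ms=300_000, heartbeat_interval_s=3):
    """Estimate how long the NameNode waits before marking a DataNode dead.

    Assumed formula: 2 * recheck interval + 10 * heartbeat interval.
    The defaults are the stock HDFS values as I understand them; check
    your own hdfs-site.xml before relying on the result.
    """
    return 2 * (recheck_interval_ms / 1000) + 10 * heartbeat_interval_s


if __name__ == "__main__":
    # Stock settings: 2 * 300 s + 10 * 3 s
    print(dead_node_timeout_seconds())        # 630.0 seconds, about 10.5 min
    # Hypothetical tuned value: recheck every 15 s instead of every 5 min
    print(dead_node_timeout_seconds(15_000))  # 60.0 seconds
```

This only covers HDFS DataNodes. Region server failure detection in HBase
goes through the ZooKeeper session timeout (zookeeper.session.timeout), so
that is a separate knob. Lowering either one shortens detection time but
makes false positives during long GC pauses or network blips more likely.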
> On Thursday, June 21, 2012, Michael Segel wrote:
> > Assuming that you have an Apache release (Apache, HW, Cloudera) ...
> > (If MapR, replace the drive and you should be able to repair the cluster
> > from the console. Node doesn't go down.)
> > Node goes down.
> > 10 min later, cluster sees node down. Should then be able to replicate
> > missing blocks.
> > Replace disk with a new disk and rebuild the file system.
> > Bring node up.
> > Rebalance cluster.
> > That should be pretty much it.
> > On Jun 21, 2012, at 10:17 PM, David Charle wrote:
> > > What is the best practice to remove a node and add the same node back
> > > to hbase/hadoop?
> > >
> > > Currently, in our 10-node cluster, 2 nodes went down (bad disk, so the
> > > node is down as it's the root volume + data); we need to replace the disk
> > > and add them back. Any quick suggestions or pointers to docs for the right
> > > procedure?
> > >
> > > --
> > > David