Re: When node is down
Michael Segel 2012-06-25, 03:14
The cluster won't notice it any faster on its own; detection is driven by the timeout.
You can reduce the timeout; it's configurable. The default is about 10 min.
The cluster as a whole shouldn't see downtime, just the node.
Note this is for Apache. MapR is different and someone from MapR should be able to provide details...
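
To make the timeout above concrete, here is a minimal sketch of how stock Apache HDFS derives the point at which the NameNode declares a DataNode dead: twice the heartbeat recheck interval plus ten times the DataNode heartbeat interval. The function name and parameter names are hypothetical, chosen for illustration only. The defaults used (5 minutes and 3 seconds) are the usual shipped values, and the exact property names differ between Hadoop versions, so check hdfs-default.xml for your release before changing anything.

```python
def dead_node_timeout_ms(recheck_interval_ms=5 * 60 * 1000, heartbeat_interval_s=3):
    """Return the time (ms) after the last heartbeat at which a DataNode is marked dead.

    Hypothetical helper: the parameter names are illustrative, not Hadoop property names.
    Formula: 2 * recheck interval + 10 * heartbeat interval.
    """
    return 2 * recheck_interval_ms + 10 * 1000 * heartbeat_interval_s


# With the usual defaults this comes out slightly above the "10 min" quoted in the thread.
default_ms = dead_node_timeout_ms()
print(default_ms / 60000, "minutes")

# Lowering only the recheck interval (here to 45 s, an arbitrary example value)
# shortens detection considerably.
reduced_ms = dead_node_timeout_ms(recheck_interval_ms=45_000)
print(reduced_ms / 1000, "seconds")
```

A shorter timeout means the cluster starts re-replicating blocks sooner, but it also makes a briefly unresponsive node (long GC pause, network blip) more likely to be declared dead, so any reduction is a trade-off.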
On Jun 22, 2012, at 8:41 AM, Tom Brown <[EMAIL PROTECTED]> wrote:
> Can it notice the node is down sooner? If that node is serving an active
> region (or if it's a datanode for an active region), that would be a
> potentially large amount of downtime. With commodity hardware, and a large
> enough cluster, there will always be a machine or two being rebuilt...
> On Thursday, June 21, 2012, Michael Segel wrote:
>> Assuming that you have an Apache release (Apache, HW, Cloudera) ...
>> (If MapR, replace the drive and you should be able to repair the cluster
>> from the console. Node doesn't go down.)
>> Node goes down.
>> 10 min later, cluster sees node down. Should then be able to replicate the
>> missing blocks.
>> Replace the disk with a new disk and rebuild the file system.
>> Bring node up.
>> Rebalance cluster.
>> That should be pretty much it.
>> On Jun 21, 2012, at 10:17 PM, David Charle wrote:
>>> What is the best practice to remove a node and add the same node back for
>>> hbase/hadoop?
>>> Currently in our 10 node cluster, 2 nodes went down (bad disk, so the node is
>>> down as it's the root volume+data); we need to replace the disk and add them
>>> back. Any quick suggestions or pointers to docs for the right procedure?