Usually nodes are decommissioned gradually over some period of time so as not to disrupt running jobs: when a node is decommissioned, the NameNode must re-replicate all of the blocks that become under-replicated. Rather than suddenly removing half the nodes, you might want to take a few offline at a time. Hadoop should be able to handle rescheduling tasks from nodes that are no longer available, even without speculative execution (speculative execution addresses slow-running stragglers, not lost nodes).
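A rough sketch of the usual decommissioning flow, assuming a stock Hadoop 1.x layout (paths and hostnames below are placeholders for your cluster):

```
# In hdfs-site.xml, point the NameNode at an exclude file:
#   <property>
#     <name>dfs.hosts.exclude</name>
#     <value>/etc/hadoop/conf/dfs.exclude</value>
#   </property>
# (and similarly mapred.hosts.exclude in mapred-site.xml for the JobTracker)

# Add a batch of nodes to the exclude file, e.g.:
echo "ip-10-0-0-12.ec2.internal" >> /etc/hadoop/conf/dfs.exclude

# Tell the NameNode to re-read the file and start decommissioning:
hadoop dfsadmin -refreshNodes

# Watch until the nodes report "Decommissioned" before terminating them:
hadoop dfsadmin -report
```

Once a node shows as Decommissioned, its blocks have been re-replicated elsewhere and the EC2 instance can be terminated safely; repeat with the next batch.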
On Wednesday, October 23, 2013 10:26 PM, Nan Zhu <[EMAIL PROTECTED]> wrote:
I’m running a Hadoop cluster on AWS EC2, and I would like to resize the cluster dynamically to reduce cost. Is there any solution to achieve this?
E.g. I would like to cut the cluster size in half. Is it safe to just shut down the instances? (If some tasks are running on them, can I rely on speculative execution to re-run them on the other nodes?)
I cannot use EMR, since I’m running a customized version of Hadoop.
School of Computer Science,