Thank you for the reply.
Actually, I'm not running HDFS on EC2; I use S3 to store the data instead.
What I'm curious about is this: if some nodes are decommissioned, will the
JobTracker treat the tasks that originally ran on them as "too slow" (since
they show no progress for a long time) and launch speculative execution, or
will it directly treat them as "belonging to a running job and ran on a dead
TaskTracker"?
On Thu, Oct 24, 2013 at 2:04 PM, Ravi Prakash <[EMAIL PROTECTED]> wrote:
> Hi Nan!
> Usually nodes are decommissioned slowly over some period of time so as not
> to disrupt the running jobs. When a node is decommissioned, the NameNode
> must re-replicate all under-replicated blocks. Rather than suddenly remove
> half the nodes, you might want to take a few nodes offline at a time.
> Hadoop should be able to handle rescheduling tasks that were running on
> nodes that are no longer available, even without speculative execution
> (speculative execution is for something else).
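
A minimal sketch of the "a few nodes at a time" approach above, in Python. It assumes a Hadoop 1.x-style JobTracker whose exclude file is named by mapred.hosts.exclude and is re-read on "hadoop mradmin -refreshNodes"; a customized Hadoop build may behave differently. The function names, batch size, and pause length are hypothetical and chosen only for illustration.

```python
import subprocess
import time
from pathlib import Path


def batches(hosts, size):
    """Split hosts into consecutive groups of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [hosts[i:i + size] for i in range(0, len(hosts), size)]


def refresh_jobtracker():
    # Assumption: Hadoop 1.x CLI; asks the JobTracker to re-read its host lists.
    subprocess.run(["hadoop", "mradmin", "-refreshNodes"], check=True)


def decommission_gradually(hosts, exclude_file, size=2, pause_s=600,
                           refresh=refresh_jobtracker, sleep=time.sleep):
    """Add hosts to the exclude file a few at a time, refreshing in between."""
    path = Path(exclude_file)
    # Keep whatever is already excluded so earlier entries are not lost.
    excluded = path.read_text().split() if path.exists() else []
    groups = batches(hosts, size)
    for n, group in enumerate(groups):
        for host in group:
            if host not in excluded:
                excluded.append(host)
        path.write_text("\n".join(excluded) + "\n")
        refresh()
        if n < len(groups) - 1:
            # Hypothetical pause: give the cluster time to reschedule tasks
            # before the next group is excluded.
            sleep(pause_s)
    return excluded
```

The refresh and sleep callables are parameters so the loop can be dry-run without a cluster. Under these assumptions, the EC2 instances would be shut down only after their hosts have been excluded and their tasks rescheduled.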
> On Wednesday, October 23, 2013 10:26 PM, Nan Zhu <[EMAIL PROTECTED]> wrote:
> Hi, all
> I’m running a Hadoop cluster on AWS EC2.
> I would like to dynamically resize the cluster to reduce the cost.
> Is there any way to achieve this?
> E.g., I would like to cut the cluster size in half. Is it safe to just
> shut down the instances? (If some tasks are running on them, can I rely
> on speculative execution to re-run them on the other nodes?)
> I cannot use EMR, since I’m running a customized version of Hadoop.
> Nan Zhu
> School of Computer Science,
> McGill University