I have a question about JVM reuse in Hadoop. I understand the purpose of
JVM reuse, but I am wondering how useful it actually is in practice.
For example, for JVM reuse to take effect, more than one mapper task from
the same job must be assigned to a single node. Hadoop prefers to spawn
mappers on the nodes that actually contain the data, so it might rarely
happen that multiple mappers are allocated to a single task tracker. And
even if a single node does get to run multiple mappers, it might as well
run them in parallel in multiple JVMs rather than sequentially in a single
JVM.
I am sure I am missing some link here; please help me find it.